GAZE: Grounded Agentic Zero-shot Evaluation with Viewer-Level Tools and Literature Retrieval on Rare Brain MRI

ArXiv cs.LG

arXiv:2605.00876v1. Abstract: Vision-language models (VLMs) read an image and produce text in a single forward pass, whereas radiologists typically inspect an image several times and consult the literature before writing a report. We introduce GAZE (Grounded Agentic Zero-shot Evaluation), a framework that lets a medical VLM work in this iterative way by calling viewer-level tools (zoom, windowing, contrast, edge detection) and two retrieval tools backed by the U.S. National Lib...
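
To make "viewer-level tools" concrete, here is a minimal sketch of two of the tools the abstract names, windowing and zoom, exposed through a name-based dispatch that an agent loop could call. Everything in it is an assumption for illustration: the function names, the parameters, the `TOOLS` registry, and the list-of-lists image representation are hypothetical and are not taken from GAZE itself.

```python
from typing import Callable, Dict, List

# Hypothetical image type: a 2D grid of intensities (rows of pixels).
Image = List[List[float]]


def window(image: Image, center: float, width: float) -> Image:
    """Intensity windowing: clamp to [center - width/2, center + width/2], rescale to 0..1."""
    if width <= 0:
        raise ValueError("width must be positive")
    lo = center - width / 2.0
    hi = center + width / 2.0
    return [[(min(max(p, lo), hi) - lo) / (hi - lo) for p in row] for row in image]


def zoom(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Zoom as a simple crop: return the sub-region starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


# Hypothetical tool registry: maps the tool name an agent emits to a function.
TOOLS: Dict[str, Callable[..., Image]] = {"window": window, "zoom": zoom}


def apply_tool(image: Image, name: str, **kwargs) -> Image:
    """Dispatch one tool call by name; unknown names raise KeyError."""
    return TOOLS[name](image, **kwargs)


if __name__ == "__main__":
    scan = [[0, 50, 100], [150, 200, 250]]
    print(apply_tool(scan, "window", center=100, width=200))
    print(apply_tool(scan, "zoom", top=0, left=1, height=2, width=2))
```

In a setup like this, each model turn would name a tool and its arguments, receive the transformed image back, and decide whether to look again or write the report. Contrast adjustment and edge detection would slot into the same registry.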
