| 1 | 20-URL limit is a hard API error | test_5 r3, r4, r5 | 400 INVALID_ARGUMENT: "Number of urls to lookup exceeds the limit (21 > 20)". Zero URL content tokens consumed. Reproduced on all clean runs. | The limit is enforced at the API layer before any retrieval starts. The request is rejected outright; URLs are not truncated or silently dropped. |
| 2 | YouTube succeeds despite being documented as unsupported | test_6 r1, r4, r5 | URL_RETRIEVAL_STATUS_SUCCESS on all clean runs. Tool tokens: 1,525 / 1,584 / 1,570 (spread <4%). | The documented limitation doesn’t reflect current behavior on gemini-2.5-flash as of March 2026. |
| 3 | PDF retrieval fails consistently on a valid public PDF | test_2 all 5 runs | URL_RETRIEVAL_STATUS_ERROR on every run. Tool tokens: 119–126, minimal and consistent. PDF is a documented supported type. | PDF retrieval fails reliably for this W3C URL. A follow-up with a different PDF source is needed before drawing a firm conclusion. |
| 4 | Google Docs fail at the retrieval layer, not the API layer | test_7 r1, r4, r5 | URL_RETRIEVAL_STATUS_ERROR with tool tokens 156–219. The request completes normally, with no API-level rejection. | Two distinct failure modes exist: API-layer rejection (hard error, zero tokens, as in test_5) and retrieval-layer failure (request completes, status recorded in metadata). |
| 5 | JSON API endpoint retrieval is non-deterministic | test_8 all runs | URL_RETRIEVAL_STATUS_SUCCESS in r1 and r2 (~2,490 tool tokens); URL_RETRIEVAL_STATUS_ERROR in r4 and r5 (112–116 tool tokens). No change in endpoint or prompt between runs. | The Gemini URL context tool’s handling of application/json responses from this endpoint is unreliable. Treat JSON API endpoints as non-deterministic until confirmed with a stable public endpoint. |
| 6 | Tool tokens dominate cost at scale and are stable across runs | test_1, test_3, test_4 r4 & r5 | At 20 URLs, tool tokens = 111,326 on both r4 and r5 (0% variance). At 5 URLs: 27,506–27,508. At 1 URL: 3,099–3,134. | tool_use_prompt_token_count is reproducible to within about 1% across runs and accounts for ~98.6% of total cost at 20 URLs. Use it for cost estimation. |
| 7 | url_context_metadata order is non-deterministic | test_3, test_4 r4 & r5 | Metadata order is shuffled relative to input order on every run, and the shuffle pattern itself varies between runs. | Match results by retrieved_url string, not array index. |
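
Findings 1 and 7 suggest two client-side precautions: keep each request at or below 20 URLs, and look up retrieval results by URL rather than by position. The sketch below is one way to do both. It makes several assumptions: the metadata has already been converted to plain dicts, the status field is keyed as `url_retrieval_status`, and `retrieved_url` matches the input URL exactly (redirects or normalization could break that match). The helper names `chunk_urls` and `statuses_in_input_order` are hypothetical and not part of any SDK.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Limit observed in test_5; a request with 21 URLs was rejected with 400.
MAX_URLS_PER_REQUEST = 20


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def statuses_in_input_order(
    input_urls: List[str],
    url_metadata: List[Dict[str, str]],
) -> List[Tuple[str, Optional[str]]]:
    """Pair each input URL with its retrieval status, ignoring metadata order.

    Assumes each metadata entry is a dict with `retrieved_url` and
    `url_retrieval_status` keys (an assumed shape for illustration).
    URLs with no matching entry get a status of None.
    """
    by_url = {
        entry["retrieved_url"]: entry["url_retrieval_status"]
        for entry in url_metadata
    }
    return [(url, by_url.get(url)) for url in input_urls]
```

A status of `None` here means only that no metadata entry carried that exact URL string, so it is worth logging those cases separately from `URL_RETRIEVAL_STATUS_ERROR` results.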