
Another issue: Gemini can’t do tool calling and (forced) json output at the same time

If you want to use application/json as the specified output in the request, you can’t use tools

So if you need both, you either hope it gives you correct JSON when using tools (which it often doesn’t), or you make two requests: one for the tool calling, another for the formatting
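A rough sketch of the two-request workaround, in case it helps. `call_with_tools` and `call_as_json` are hypothetical stand-ins for your own client wrappers (they are not Gemini SDK functions): the first is assumed to run with tools enabled and return free text, the second is assumed to run without tools and with application/json as the output type.

```python
import json
from typing import Any, Callable


def answer_as_json(
    prompt: str,
    call_with_tools: Callable[[str], str],  # hypothetical: tools on, text out
    call_as_json: Callable[[str], str],     # hypothetical: tools off, JSON out
) -> Any:
    """Two-pass workaround: gather the answer with tools, then reformat it."""
    # Pass 1: let the model use its tools and answer in free-form text.
    draft = call_with_tools(prompt)
    # Pass 2: no tools, JSON output forced; the model only reformats the draft.
    formatted = call_as_json(
        "Reformat the following answer as JSON without adding facts:\n\n" + draft
    )
    # Raises json.JSONDecodeError if the second pass still returns bad JSON.
    return json.loads(formatted)
```

The cost is an extra round trip and extra tokens, which is the annoying part.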

At least, even if annoying, this issue is pretty straightforward to get around



Back before structured outputs were common among model providers, I used to have an “end result” tool the model could call to get the structured response I was looking for. It worked very reliably.
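Roughly what that looked like, as a sketch. The tool name `final_answer`, its fields, and the shape of the `tool_calls` list are all made up for illustration; real providers use their own field names for tool declarations and tool-call responses.

```python
import json
from typing import Optional

# Hypothetical "end result" tool; the schema is the structure you want back.
FINAL_ANSWER_TOOL = {
    "name": "final_answer",
    "description": "Call this exactly once with the finished result.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "rows": {"type": "array", "items": {"type": "object"}},
        },
        "required": ["title", "rows"],
    },
}


def extract_final_answer(tool_calls: list) -> Optional[dict]:
    """Return the arguments of the first final_answer call, or None."""
    for call in tool_calls:
        if call.get("name") != FINAL_ANSWER_TOOL["name"]:
            continue
        args = call.get("arguments")
        # Some APIs return arguments as a JSON string, others as a dict.
        if isinstance(args, str):
            args = json.loads(args)
        if not isinstance(args, dict):
            raise ValueError("final_answer arguments are not an object")
        required = FINAL_ANSWER_TOOL["parameters"]["required"]
        missing = [key for key in required if key not in args]
        if missing:
            raise ValueError(f"final_answer is missing fields: {missing}")
        return args
    return None  # the model never called the tool; retry or fall back
```

The structured response is simply the arguments of that last tool call, so no JSON output mode is needed.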

It’s a bit of a hack but maybe that reliably works here?


You can definitely build an agent and have it use tools like you mention. That’s the equivalent of making 2 requests to Gemini, one to get the initial answer/content, then another to get it formatted as proper json

The issue here is that Gemini has support for some internal tools (like search and web scraping), and when you ask the model to use those, you can’t also ask it to use application/json as the output (which you normally can when not using tools)

Not a huge issue, just annoying


I think this might also have something to do with their super specific output requirements when you do use search (results have to be displayed in a predefined Google format).


Does any other provider allow that? What use cases are there for JSON + tool calling at the same time?


Please correct my likely misunderstanding here, but on the surface, it seems to me that "call some tools then return JSON" has some pretty common use cases.


Let's say you wanna build an app that gives back structured data after a web search. First a tool call to a search API. Then do some reasoning/summarization/etc. on the data returned by the tool. And finally return JSON.


OpenAI, Ollama, DeepSeek all do that.

And wanting to programmatically work with the result + allow tool calls is super common.


Suppose there's a PDF with lots of tables I want to scrape. I mention the PDF URL in my message, and with Gemini's URL context tool, I now have access to the PDF.

I can ask gemini to give me the pdf's content as a json and it complies most of the time. But at times, there's an introductory line like "Here's your json:". Those introductory lines interfere with programmatically using the output. They're sometimes there, sometimes not.

If I could have structured output at the same time as tool use, I could reliably use what Gemini spits out, as it'd be plain JSON with no annoying intro lines.
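Until then, a best-effort fallback is to skip past whatever prose comes before the JSON. A minimal sketch using only the standard library; `extract_json` is a name I made up, and it can pick the wrong span if the intro line itself happens to contain valid JSON, so it's no substitute for real structured output.

```python
import json
from typing import Any


def extract_json(text: str) -> Any:
    """Return the first JSON object or array embedded in model output."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue  # skip intro prose, code-fence markers, etc.
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores anything that follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this bracket did not start valid JSON; keep scanning
        return value
    raise ValueError("no JSON object or array found in model output")
```

Calling `extract_json('Here\'s your json:\n{"a": [1, 2]}')` gives back the dict whether or not the intro line is there.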


OpenAI



