
Any idea what "output token efficiency" refers to? Gemini Flash is billed by number of input/output tokens, which I assume is fixed for the same output, so I'm struggling to understand how it could result in lower cost. Unless of course they have changed tokenization in the new version?


They provide the answer in fewer words (while still conveying what needs to be said).

Which is a good thing in my book, as the current models are far too verbose (and I suspect one of the reasons is the billing by tokens).


The post implies that the new models are better at thinking, so less time/cost is spent overall.

The first chart implies the gains are minimal for non-thinking models.


Models are less verbose, so they produce fewer output tokens, and answers therefore cost less.
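
A rough sketch of the arithmetic, for anyone wondering how cost drops without any change to pricing or tokenization. The per-million-token prices below are made up for illustration, not Gemini Flash's actual rates:

    # Hypothetical prices (USD per 1M tokens) -- not actual Gemini Flash rates.
    INPUT_PRICE_PER_M = 0.10
    OUTPUT_PRICE_PER_M = 0.40

    def request_cost(input_tokens: int, output_tokens: int) -> float:
        """Cost of one request billed purely by token counts."""
        return (input_tokens * INPUT_PRICE_PER_M +
                output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

    # Same prompt, same tokenizer: only the answer length differs.
    verbose = request_cost(input_tokens=1_000, output_tokens=800)
    concise = request_cost(input_tokens=1_000, output_tokens=400)
    print(f"verbose: ${verbose:.6f}, concise: ${concise:.6f}")
    # The concise answer is cheaper even though the rates are identical.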





