Not dense to ask questions! There are two separate concepts in play: 1) Maintain...

apsurd · 2025-10-16T23:15:29 1760656529

Very helpful. It helps me better understand the specifics behind each call and response, the internal units and whether those units are sent and received "live" from the LLM or come from a traditional db or cache store.

I'm personally just curious how far, clever, insightful, any given product is "on top of" the foundation models. I'm not in it deep enough to make claims one way or the other.

So this shines a little more light, thanks!

ayewo · 2025-10-17T14:46:30 1760712390

This recent comment https://news.ycombinator.com/item?id=45598670 by @simonw really helped drive home the point that LLMs are really being fed an array of strings.

colordrops · 2025-10-17T00:07:36 1760659656

Why wouldn't you turn on prompt caching? There must be a reason why it's a toggle rather than just being on for everything.

TimMoore · 2025-10-17T01:00:24 1760662824

Writing to the cache is more expensive than a request with caching disabled. So it only makes economic sense to do it when you know you're going to use the cached results. See https://docs.claude.com/en/docs/build-with-claude/prompt-cac...

adastra22 · 2025-10-17T05:17:51 1760678271

When you know the context is a one-and-done. Caching costs more than just running the prompt, but less than running the prompt twice.