Hacker News
krackers | 8 months ago | on: Why DeepSeek is cheap at scale but expensive to ru...
Yeah, this part was confusing, because the article only mentions halfway through that the attention step can only be batched across requests with matching context-window sizes.
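
For anyone else who tripped on this, here is a toy sketch of what that constraint implies for scheduling. It is only an illustration, not how any real inference server does it: the function name `bucket_by_context_length` and the `(request_id, context_length)` tuples are made up for this example.

```python
from collections import defaultdict

def bucket_by_context_length(requests):
    """Group (request_id, context_length) pairs so that every batch
    contains only requests with the same context length."""
    buckets = defaultdict(list)
    for request_id, context_length in requests:
        buckets[context_length].append(request_id)
    return dict(buckets)

# Hypothetical pending requests: (id, context length in tokens).
pending = [("a", 512), ("b", 2048), ("c", 512), ("d", 128)]
batches = bucket_by_context_length(pending)

for context_length, request_ids in batches.items():
    print(context_length, request_ids)
```

With mixed lengths, most buckets end up small, so under this assumption the attention step gets far less batching benefit than the parts of the model that do not depend on context length.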