
Looks like Ollama is focusing more and more on non-local offerings. Also, their inference performance is worse than, say, vLLM's.

What's a good Ollama alternative (for keeping 1-5x RTX 3090 busy) if you want to run things like open-webui (via an OpenAI-compatible API) where your users can choose between a few LLMs?
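
For context, a minimal sketch of what "OpenAI-compatible API where users choose between a few LLMs" means from the client side, assuming some backend (llama-server, vLLM, etc.) is already listening on localhost:8080; the base URL and model name are placeholders, not anything from this thread:

    # Requires the openai Python package; the server address and
    # model name below are assumptions for illustration only.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1",
                    api_key="not-needed-locally")

    # List the models the backend advertises, so a front end like
    # open-webui can let each user pick one.
    for m in client.models.list():
        print(m.id)

    # Send a chat request to the model the user picked.
    reply = client.chat.completions.create(
        model="llama-3.1-8b-instruct",  # placeholder name
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply.choices[0].message.content)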



At work I've set up LibreChat + LlamaSwap + llama.cpp

200 weekly users :)
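
In case it helps, here's a rough sketch of the idea behind the llama-swap part of that stack: one OpenAI-compatible endpoint that looks at the "model" field of each request, starts the matching llama.cpp server, and proxies the call to it. This is not the project's actual code, just the concept; the commands, ports, paths, and model names are all assumptions:

    import json
    import subprocess
    import time
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical mapping: model name -> llama-server command and port.
    MODELS = {
        "llama-3.1-8b": (["llama-server", "-m", "/models/llama-3.1-8b.gguf",
                          "--port", "8081"], 8081),
        "qwen2.5-32b": (["llama-server", "-m", "/models/qwen2.5-32b.gguf",
                         "--port", "8082"], 8082),
    }

    current = {"name": None, "proc": None}  # one model loaded at a time

    def ensure_running(name):
        """Stop the loaded model (if different) and start the requested one."""
        if current["name"] == name:
            return MODELS[name][1]
        if current["proc"] is not None:
            current["proc"].terminate()
            current["proc"].wait()
        cmd, port = MODELS[name]
        current["proc"] = subprocess.Popen(cmd)
        current["name"] = name
        time.sleep(5)  # crude: wait for the server to finish loading
        return port

    class Proxy(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            model = json.loads(body).get("model", "")
            if model not in MODELS:
                self.send_error(404, "unknown model " + model)
                return
            port = ensure_running(model)
            req = urllib.request.Request(
                "http://127.0.0.1:%d%s" % (port, self.path),
                data=body,
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:
                payload = resp.read()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), Proxy).serve_forever()

Front ends like LibreChat or open-webui just point at the proxy and see every configured model; the swap happens behind the scenes on the first request for a model that isn't loaded.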


How do you deal with different users wanting to use different LLMs at the same time?


I've heard about llama-swap and vLLM.



