My app does support Windows, though: you can connect to OpenAI, Claude, OpenRouter, Azure, and other third-party providers. It's just that running SOTA LLMs locally can be challenging.
I'm pretty satisfied with my Linux NVIDIA GPU setup. I may not have as much memory on my card, but the speed is almost certainly competitive, if not outright faster. Furthermore, there are plenty of techniques that mitigate the memory limit, like offloading/streaming layers in as needed, quantizing, etc.
A local GPU setup also handles actual training/fine-tuning better.
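Not part of the original comment, but a minimal pure-Python sketch of the quantization idea mentioned above, just to illustrate the memory trade-off. All names here are illustrative; real stacks (GGUF, bitsandbytes, etc.) use per-group scales and packed tensors rather than Python lists:

```python
# Toy symmetric 8-bit quantization: store weights as int8 values plus one
# fp32 scale per tensor, cutting weight memory roughly 4x vs. fp32.

def quantize_int8(weights):
    """Map fp32 weights to int8 codes sharing a single scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale == 0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Approximately reconstruct the original fp32 weights."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.031, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Each restored weight is within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

Layer offloading/streaming is the complementary trick: keep quantized layers in system RAM and copy them to the GPU one at a time during the forward pass, trading PCIe bandwidth for VRAM.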
https://prompt.16x.engineer/
Should work well if you have 64 GB of VRAM to run SOTA models locally.