I haven’t been able to find a model that is both performant and useful on my machines (an RTX 3060 12GB and an M4 Mac mini), but I am open to suggestions! I know I want to use local LLMs more, but I feel their utility is limited on consumer hardware.
My Mac mini (32GB) can run 12B-parameter models at around 13 tokens/sec, and my 3060 can achieve roughly double that. However, both machines have a hard time keeping up with larger models. I’ll have to look into some special-purpose models.
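As a rough sanity check on those numbers: single-stream decode is mostly memory-bandwidth-bound, so tokens/sec is roughly bandwidth divided by the size of the quantized weights. Here's a minimal back-of-envelope sketch, assuming ballpark figures of ~120 GB/s for a base M4 and ~360 GB/s for a 3060, and ~0.56 bytes/parameter for a Q4-style quant (all assumptions, not measurements from my machines):

```python
# Back-of-envelope ceiling for single-stream decode speed: each generated token
# reads (roughly) every weight once, so tok/s ~ memory bandwidth / weight bytes.
# Bandwidth figures and bytes-per-param below are assumptions, not measurements.

BANDWIDTH_GBPS = {
    "M4 Mac mini (base)": 120,   # assumed unified-memory bandwidth
    "RTX 3060 12GB": 360,        # assumed GDDR6 bandwidth
}

BYTES_PER_PARAM = 0.56  # ~4-bit quantization plus overhead (assumption)

def decode_ceiling(params_billions: float, bandwidth_gbps: float) -> float:
    """Upper bound on tokens/sec, ignoring KV-cache reads and compute."""
    weight_gb = params_billions * BYTES_PER_PARAM
    return bandwidth_gbps / weight_gb

for size in (12, 32, 70):
    for machine, bw in BANDWIDTH_GBPS.items():
        print(f"{size:>3}B on {machine:<20}: ~{decode_ceiling(size, bw):5.1f} tok/s ceiling")
```

Under those assumptions a 12B model tops out around 17–18 tok/s on the Mac, which lines up with the ~13 tok/s I actually see, while a 70B model would sit around 3 tok/s even if it fit in memory. The 3060 numbers are only ceilings when the model actually fits in 12 GB of VRAM; anything much bigger spills into system RAM and slows down sharply.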