If you've spent any time running local LLMs, you know the drill. You find a model that's smart enough but too slow, or fast ...