Qwen 2.5 Coder 1.5B
qwen2.5-coder:1.5b
qwen2.5-coder · coding
Tiny coding model for fast autocomplete on 8 GB Macs.
- Runtimes
- ollama · mlx
- Memory
- ≥ 2.1 GB · 8.6 GB recommended
Best as a draft/autocomplete model — not for long-form code generation.
Community-curated · added 2026-05-28
Qwen 2.5 Coder 7B
qwen2.5-coder:7b
qwen2.5-coder · coding
Solid coding default for 16 GB+ Macs; strong instruction following.
- Runtimes
- ollama · mlx
- Memory
- ≥ 8.6 GB · 17.2 GB recommended
Community-curated · added 2026-05-28
Qwen 2.5 Coder 14B
qwen2.5-coder:14b
qwen2.5-coder · coding
Stronger coding — fits 24 GB+ Apple Silicon at default quant.
- Runtimes
- ollama · mlx
- Memory
- ≥ 17.2 GB · 25.8 GB recommended
Supersedes
- qwen2.5-coder:7b — Materially better at multi-file refactors and longer code spans, given the headroom.
Community-curated · added 2026-05-28
Qwen 2.5 Coder 32B
qwen2.5-coder:32b
qwen2.5-coder · coding
Frontier-tier local coding model — fits M-series Max/Ultra.
- Runtimes
- ollama · mlx
- Memory
- ≥ 25.8 GB · 38.7 GB recommended
Supersedes
- qwen2.5-coder:14b — Better at architectural reasoning and unfamiliar codebases; needs the memory.
Community-curated · added 2026-05-28
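The memory figures in the four Qwen 2.5 Coder entries suggest a simple selection rule: take the largest variant whose memory figure fits the machine. The sketch below is one possible encoding of that rule, not how this catalog actually resolves models. The `CoderModel` type, the `pick_coder` helper, and the `strict` switch between the recommended and minimum figures are hypothetical names introduced for illustration; only the tags and GB values are copied from the entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoderModel:
    tag: str
    min_gb: float
    recommended_gb: float


# Figures copied from the catalog entries, ordered smallest to largest.
CODER_MODELS = [
    CoderModel("qwen2.5-coder:1.5b", 2.1, 8.6),
    CoderModel("qwen2.5-coder:7b", 8.6, 17.2),
    CoderModel("qwen2.5-coder:14b", 17.2, 25.8),
    CoderModel("qwen2.5-coder:32b", 25.8, 38.7),
]


def pick_coder(available_gb: float, strict: bool = True) -> Optional[str]:
    """Return the tag of the largest model that fits, or None if none fits.

    strict=True compares against the recommended figure; strict=False
    compares against the minimum figure (a hypothetical policy).
    """
    best = None
    for model in CODER_MODELS:
        threshold = model.recommended_gb if strict else model.min_gb
        if available_gb >= threshold:
            best = model.tag  # later entries are larger, so keep overwriting
    return best
```

With `strict=True`, a machine reporting exactly 8.0 GB falls below the 8.6 GB recommended figure for the 1.5B model and gets `None`, so callers on small machines may prefer `strict=False`.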
chat
Small, fast chat — viable on 8 GB Macs and as a low-latency draft model.
- Runtimes
- ollama · mlx · lmstudio
- Memory
- ≥ 4.3 GB · 8.6 GB recommended
Community-curated · added 2026-05-28
chat · rag
Strong general chat at 7B. Use the Coder variant for code-heavy work.
- Runtimes
- ollama · mlx
- Memory
- ≥ 8.6 GB · 17.2 GB recommended
Community-curated · added 2026-05-28
chat · rag
Older default — still solid for chat/RAG with the broadest runtime support.
- Runtimes
- ollama · mlx · lmstudio · llamacpp
- Memory
- ≥ 8.6 GB · 17.2 GB recommended
Community-curated · added 2026-05-28
chat · rag
Capable general chat — preferred over 7B once memory allows.
- Runtimes
- ollama · mlx
- Memory
- ≥ 17.2 GB · 25.8 GB recommended
Supersedes
- qwen2.5:7b — Better at longer context and nuanced instruction following.
Community-curated · added 2026-05-28
reasoning · chat
Compact reasoning model — visible chain-of-thought useful for code planning + math.
- Runtimes
- ollama · mlx
- Memory
- ≥ 8.6 GB · 17.2 GB recommended
Verbose by default: outputs <think> blocks. Strip or display them depending on where the output is shown.
Community-curated · added 2026-05-28
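Handling the `<think>` blocks mentioned in this entry can be done with a small post-processing step. The sketch below assumes the model emits well-formed `<think>...</think>` pairs in plain text; `split_think` is a hypothetical helper name, not part of any runtime listed here.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it,
# plus any whitespace that follows the closing tag.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
_THINK_BODY = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[str, list[str]]:
    """Return (answer_without_think_blocks, list_of_think_contents)."""
    thoughts = [body.strip() for body in _THINK_BODY.findall(text)]
    answer = _THINK_BLOCK.sub("", text).strip()
    return answer, thoughts
```

A chat UI could show only the first element, while a debugging view could render the second. An unclosed `<think>` tag, as can happen with truncated or streamed output, is left untouched by this sketch and would need separate handling.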
reasoning · chat
Mid-tier reasoning: stronger on multi-step problems than the 7B, at a higher memory cost.
- Runtimes
- ollama · mlx
- Memory
- ≥ 17.2 GB · 25.8 GB recommended
Supersedes
- deepseek-r1-distill-qwen:7b — Materially better at multi-step reasoning; needs the extra memory headroom.
Community-curated · added 2026-05-28
Nomic Embed Text v1.5
nomic-embed-text
nomic · embedding
Default embedding model for local RAG. Tiny footprint, broad runtime support.
- Runtimes
- ollama · mlx
- Memory
- ≥ 1.1 GB · 2.1 GB recommended
Community-curated · added 2026-05-28
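For the local RAG use named in this entry, retrieval usually comes down to embedding a query and the documents, then ranking by cosine similarity. The sketch below keeps the embedding call abstract: `embed` is a hypothetical callable you would back with your runtime of choice, and `rank` and `cosine` are illustrative names. The `search_query: ` and `search_document: ` prefixes follow the task-prefix convention described on the Nomic model card; whether your runtime adds them automatically is an assumption to verify.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: str, docs: list[str], embed: Callable[[str], Vector]) -> list[str]:
    """Return docs sorted by similarity to the query, best match first."""
    q_vec = embed("search_query: " + query)
    scored = [(cosine(q_vec, embed("search_document: " + d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]
```

Passing the embedder in as a parameter keeps the ranking logic testable without a running model server.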
MixedBread Embed Large
mxbai-embed-large
mxbai · embedding
Alternative embedding — competitive quality, slightly larger vectors than Nomic.
- Runtimes
- ollama
- Memory
- ≥ 1.1 GB · 2.1 GB recommended
Community-curated · added 2026-05-28
Llama 3.2 Vision 11B
llama3.2-vision:11b
llama3.2 · vision · chat
Image-aware chat — for screenshot Q&A, OCR-like extraction, and visual reasoning.
- Runtimes
- ollama
- Memory
- ≥ 12.9 GB · 17.2 GB recommended
Ollama-only at default quant today. MLX support exists for separate weights.
Community-curated · added 2026-05-28
Whisper Large v3
whisper-large-v3
whisper · stt
Gold-standard local transcription — strong multilingual + low WER.
- Runtimes
- llamacpp · mlx
- Memory
- ≥ 4.3 GB · 8.6 GB recommended
Typically loaded via whisper.cpp or mlx-whisper, not via Ollama. Used by Hermes for STT.
Community-curated · added 2026-05-28
Distil-Whisper Large v3
distil-whisper-large-v3
whisper · stt
~6× faster than Whisper Large v3 with a small accuracy trade-off — good for live transcription.
- Runtimes
- llamacpp · mlx
- Memory
- ≥ 2.1 GB · 4.3 GB recommended
Supersedes
- whisper-large-v3 — Materially faster speech-to-text loop; preferred for real-time use cases.
Optimized for English. Pair with Whisper Large v3 for multilingual audio.
Community-curated · added 2026-05-28