Meta: Llama 3.2 11B Vision Instruct
meta-llama 🔮 Multimodal
About Meta: Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is an 11-billion-parameter multimodal model designed for tasks that combine visual and textual data. It excels at tasks such as image captioning and...
More Multimodal Models
MoonshotAI: Kimi K2.7 Code
moonshotai
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to reliably complete end-to-end programming tasks over long contexts.
Nex AGI: Nex-N2-Pro (free)
nex-agi
Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total.
NVIDIA: Nemotron 3.5 Content Safety (free)
nvidia
NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model, fine-tuned from Google Gemma-3-4B.
Qwen: Qwen3.7 Plus
qwen
Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series.