🧠 Supported Models
Smaller Can Be Better!
Bigger isn't always better! Smaller models (0.5B-7B) can be surprisingly powerful and offer significant advantages:
- Faster inference times
- Lower hosting costs
- Quicker deployment
- Easier testing and iteration
How to read this
Please note that language support, performance, and other specifications may vary based on your specific use case, data, and fine-tuning process. This information is intended as general guidance; your results might differ significantly!
Model Overview
Qwen General Models
Qwen models excel at general text generation, understanding, and dialogue. They are particularly strong in Asian languages while still performing well in many Western languages.
Language Support:
- Primary: Chinese, English
- Strong: Japanese, Korean
- Good: German, French, Spanish, Portuguese, Italian, Vietnamese, Thai
- Basic: Arabic, Russian, and other less-represented European languages
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Qwen2.5-0.5B | 0.5B | Small | Testing, prototypes | Minimal 🟢 |
Qwen2.5-1.5B | 1.5B | Small | Production-ready apps | Low 🟢 |
Qwen2.5-3B | 3B | Small | Complex applications | Moderate 🟡 |
Qwen2.5-7B | 7B | Medium | High-performance needs | Significant 🟡 |
Qwen2.5-14B | 14B | Large ⚠️ | Specific high-accuracy needs | High 🔴 |
Qwen2.5-32B | 32B | Very Large ⚠️ | Only when validated as necessary | Very High 🔴 |
Qwen2.5-72B | 72B | Very Large ⚠️ | Specialized enterprise needs | Extreme 🔴 |
Qwen Code Models
Specialized for software development, these models excel at code generation, completion, and understanding. They support a wide range of programming languages and frameworks.
Language Support:
- Primary: Python, JavaScript, Java, C++, TypeScript
- Strong: Go, Rust, PHP, C#, Ruby
- Good: Swift, Kotlin, SQL, Shell scripting
- Basic: Scala, R, MATLAB, Assembly
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Qwen2.5-Coder-0.5B | 0.5B | Small | Code completion, simple generation | Minimal 🟢 |
Qwen2.5-Coder-1.5B | 1.5B | Small | Most coding tasks | Low 🟢 |
Qwen2.5-Coder-3B | 3B | Small | Complex code generation | Moderate 🟡 |
Qwen2.5-Coder-7B | 7B | Medium | Large coding projects | Significant 🟡 |
Qwen2.5-Coder-14B | 14B | Large ⚠️ | Advanced code generation | High 🔴 |
Qwen2.5-Coder-32B | 32B | Very Large ⚠️ | Enterprise code solutions | Very High 🔴 |
Qwen Math Models
Optimized for mathematical operations, these models excel at solving equations, proofs, and mathematical reasoning tasks.
Language Support:
- Primary: Mathematical notation, LaTeX
- Strong: English mathematical descriptions
- Good: Chinese mathematical descriptions
- Basic: Other language mathematical descriptions
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Qwen2.5-Math-1.5B | 1.5B | Small | Most math operations | Low 🟢 |
Qwen2.5-Math-7B | 7B | Medium | Complex calculations | Significant 🟡 |
Qwen2.5-Math-72B | 72B | Very Large ⚠️ | Research-grade math | Extreme 🔴 |
LLaMA 3 Models
Meta's LLaMA models are known for strong performance on text tasks, especially in English, and are typically optimized for European languages.
Language Support:
- Primary: English
- Strong: Spanish, German, French
- Good: Italian, Portuguese, Dutch, and other major European languages
- Basic: Asian languages and Arabic
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Llama-3.2-1B | 1B | Small | Quick experiments | Minimal 🟢 |
Llama-3.2-3B | 3B | Small | Small applications | Low 🟢 |
Llama-3.1-8B | 8B | Medium | Production apps | Significant 🟡 |
Llama-3.1-70B | 70B | Very Large ⚠️ | Enterprise needs | Extreme 🔴 |
Code LLaMA Models
A specialized version of LLaMA focused on code generation, with strong capabilities across many programming languages.
Language Support:
- Primary: Python, JavaScript, Java, C++
- Strong: PHP, C#, Ruby, Go
- Good: Rust, Swift, TypeScript, Kotlin
- Basic: Most other programming languages
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
CodeLlama-7b | 7B | Medium | General coding | Significant 🟡 |
CodeLlama-13b | 13B | Large ⚠️ | Complex code projects | High 🔴 |
CodeLlama-34b | 34B | Very Large ⚠️ | Large-scale development | Very High 🔴 |
CodeLlama-70b | 70B | Very Large ⚠️ | Enterprise systems | Extreme 🔴 |
Phi Models
Microsoft's Phi models are designed for efficiency with a primary focus on English. They are especially strong in code generation (notably in Python and JavaScript), while their multilingual capabilities are more limited.
Language Support:
- Primary: English
- Strong (for code): Python, JavaScript
- Limited: Other languages (only basic multilingual support, with non-English tasks generally underperforming)
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Phi-3.5-mini-instruct | Mini (3.8B) | Small | Quick deployment | Minimal 🟢 |
Phi-3-mini-4k-instruct | Mini (3.8B) | Small | Testing | Minimal 🟢 |
Phi-3-mini-128k-instruct | Mini (3.8B) | Small | General tasks | Low 🟢 |
Phi-3-small-8k-instruct | Small (7B) | Small | Small applications | Low 🟢 |
Phi-3-medium-4k-instruct | Medium (14B) | Medium | Medium workloads | Moderate 🟡 |
Phi-3-medium-128k-instruct | Medium (14B) | Medium | Complex tasks | Moderate 🟡 |
DeepSeek R1 Models
DeepSeek models are specifically optimized for reasoning tasks and complex problem-solving. These models are distilled from larger models while maintaining impressive performance, especially in mathematics and coding tasks.
Language Support:
- Primary: English, Chinese
- Strong: Math notation, Programming languages
- Good: European languages
- Basic: Other languages
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
DeepSeek-R1-Distill-Qwen-1.5B | 1.5B | Small | Basic reasoning, Quick testing | Minimal 🟢 |
DeepSeek-R1-Distill-Qwen-7B | 7B | Medium | Math problems, Code generation | Moderate 🟡 |
DeepSeek-R1-Distill-Llama-8B | 8B | Medium | General reasoning tasks | Moderate 🟡 |
DeepSeek-R1-Distill-Qwen-14B | 14B | Large ⚠️ | Complex problem solving | High 🔴 |
DeepSeek-R1-Distill-Qwen-32B | 32B | Very Large ⚠️ | Advanced reasoning, Research | Very High 🔴 |
DeepSeek-R1-Distill-Llama-70B | 70B | Very Large ⚠️ | Enterprise applications | Extreme 🔴 |
Vision Models
Vision models combine text and image understanding capabilities for different specialized purposes.
ModelOne (manufactAI Labs)
Specialized model optimized for extracting structured information from documents and visual data.
Language Support:
- Primary: 70+ languages with balanced representation
- Core languages (14% each): English, Spanish, French, German, Italian, Russian
- Additional Support: 64 other languages
Special Capabilities:
- Structured data extraction from documents
- Complex table and chart interpretation
- Advanced multilingual OCR
- Format-flexible outputs (CSV, JSON, YAML, XML)
- Multi-page document processing
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
ModelOne-Vision | 4.3B | Small | Document extraction, Structured data | Moderate 🟡 |
ModelOne Dataset Coverage
Trained on diverse document types:
- 49% Multipage Documents
- 29% Real-world Images
- 14% Single-page Documents
- 8% Visual Representations (tables, charts)
Phi-3.5-Vision
A lightweight state-of-the-art multimodal model focused on general visual understanding and reasoning.
Language Support:
- Primary: English
- Strong: Common European languages
- Good: Asian languages
- Basic: Other languages
Model Name | Parameters | Size Category | Recommended Use Cases | Resource Impact |
---|---|---|---|---|
Phi-3.5-vision-instruct | 4.2B | Small | General visual tasks, Multi-frame analysis | Moderate 🟡 |
Best Practices:
- Ensure images are clear and well-lit
- Choose the appropriate model based on your specific use case:
  - Phi-3.5-Vision for general visual understanding and reasoning
  - ModelOne-Vision for structured document processing and data extraction, with support for European languages
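The choice between the two vision models can be written down as a small routing rule. This is only a sketch: the task labels (`document_extraction`, `table_extraction`, `ocr`) and the function name are hypothetical and not part of any official API; only the two model names come from the tables above.

```python
# Hypothetical task labels that count as structured document work.
DOCUMENT_TASKS = {"document_extraction", "table_extraction", "ocr"}


def pick_vision_model(task: str) -> str:
    """Return the model name suggested by the guidance above for a given task label."""
    if task in DOCUMENT_TASKS:
        # Structured document processing and data extraction
        return "ModelOne-Vision"
    # Everything else is treated as general visual understanding and reasoning
    return "Phi-3.5-vision-instruct"


if __name__ == "__main__":
    for task in ("ocr", "image_captioning"):
        print(task, "->", pick_vision_model(task))
```

Treat the rule as a starting point and confirm the choice by testing both models on a sample of your own images.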
Making the Right Choice
Resource Impact Guide
- 🟢 Minimal/Low: Perfect for startups and individual developers
- 🟡 Moderate/Significant: Requires careful resource planning
- 🔴 High/Very High/Extreme: Significant infrastructure needed
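To get a rough feel for what these levels mean in hardware terms, the memory needed just to hold a model's weights can be approximated as parameter count times bytes per parameter. The sketch below assumes common precisions (4 bytes for fp32, 2 for fp16, 1 for int8, 0.5 for int4) and ignores activations, KV cache, and framework overhead, so real requirements will be higher. The function name is illustrative.

```python
# Approximate bytes used per parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_memory_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Approximate memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the Qwen table above.
    for name, size in [("Qwen2.5-0.5B", 0.5), ("Qwen2.5-7B", 7.0), ("Qwen2.5-72B", 72.0)]:
        print(f"{name}: ~{estimate_weight_memory_gb(size):.1f} GB of weights in fp16")
```

Under these assumptions a 7B model needs about 14 GB for fp16 weights, while a 72B model needs about 144 GB, which is why the larger rows are marked as requiring significant infrastructure.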
When to Scale Up
Only consider larger models when you have:
- Tested smaller models thoroughly
- Identified specific performance gaps
- Measured and justified the resource trade-offs
- Budget for increased hosting costs
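One way to apply this checklist is to evaluate candidates from smallest to largest and stop at the first one that meets your quality target within budget. The sketch below assumes you have already measured a score for each model on your own evaluation set; the `Candidate` fields, the score scale, and all numbers in the example are hypothetical placeholders, not benchmark results.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    name: str
    params_billions: float
    score: float         # measured on your own evaluation set (hypothetical 0..1 scale)
    monthly_cost: float  # your own hosting estimate, in any currency


def smallest_sufficient(
    candidates: Iterable[Candidate], target_score: float, budget: float
) -> Optional[Candidate]:
    """Return the smallest model that meets the target score within budget, or None."""
    for c in sorted(candidates, key=lambda c: c.params_billions):
        if c.score >= target_score and c.monthly_cost <= budget:
            return c
    return None


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    measured = [
        Candidate("small-model", 1.5, 0.78, 50.0),
        Candidate("medium-model", 7.0, 0.86, 200.0),
        Candidate("large-model", 32.0, 0.90, 900.0),
    ]
    choice = smallest_sufficient(measured, target_score=0.85, budget=500.0)
    print(choice.name if choice else "No candidate meets the target within budget")
```

Sorting by size first means a larger model is only selected when every smaller one has been checked and found lacking, which mirrors the order of the checklist.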
Next Steps
Quick Start with Small Models
Get started with efficient, powerful small models
Performance Benchmarking
Learn how to measure model performance
Need Help?
For most applications, start with models in the Small category. They offer excellent performance while keeping costs and complexity manageable.