AI Models Information

Llama 3.3 Model

Meta's Llama 3.3 is a 70B-parameter, instruction-tuned model positioned as a more efficient alternative to Llama 3.1 405B, offering comparable performance at a much lower serving cost. It follows Llama 3.1 and Llama 3.2 as the latest release in the Llama 3 family.

Model Architecture

  • Auto-regressive text generation
  • Grouped Query Attention (GQA) for efficient inference
  • 128K vocabulary tokenizer
  • Context length up to 128K tokens
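The Grouped Query Attention mentioned above reduces the size of the key/value cache by letting several query heads share one key/value head. A minimal NumPy sketch (shapes and group count chosen for illustration, not taken from the model):

```python
import numpy as np

def gqa(q, k, v, n_groups):
    """Grouped Query Attention sketch.

    q: (n_heads, seq, d) query projections
    k, v: (n_groups, seq, d) shared key/value projections,
          with n_heads divisible by n_groups
    """
    n_heads, seq, d = q.shape
    rep = n_heads // n_groups
    # Each KV group serves `rep` query heads, so the KV cache is
    # n_groups/n_heads the size of standard multi-head attention.
    k = np.repeat(k, rep, axis=0)
    v = np.repeat(v, rep, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # (n_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # 2 KV groups shared 4 ways
v = rng.standard_normal((2, 4, 16))
out = gqa(q, k, v, n_groups=2)        # shape (8, 4, 16)
```

With 8 query heads and 2 KV groups, the cache stores a quarter of the key/value tensors that standard multi-head attention would, which is where the inference efficiency comes from.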

Training Methodology

  • Supervised Fine-Tuning (SFT)
  • Reinforcement Learning from Human Feedback (RLHF)
  • Trained on 15+ trillion tokens
  • Knowledge cutoff: December 2023

Supported Languages

  • English
  • German
  • French
  • Italian
  • Portuguese
  • Hindi
  • Spanish
  • Thai

Key Features:

  • High performance on benchmark datasets
  • Efficient inference with GQA
  • Openly available under the Llama 3.3 Community License
  • Extensive multilingual support
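Instruct variants of Llama 3.3 expect prompts in the standard chat-message format. A minimal sketch (the prompts are illustrative; the model ID follows Meta's Hugging Face naming, and running it requires license acceptance and roughly 140 GB of weights, so the generation call is shown only as a comment):

```python
def build_messages(system_prompt, user_prompt):
    """Build the chat-format message list expected by instruct models."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a concise multilingual assistant.",
    "Summarize grouped query attention in one sentence.",
)

# With the transformers library (assumes access to the gated repository):
# from transformers import pipeline
# generator = pipeline("text-generation",
#                      model="meta-llama/Llama-3.3-70B-Instruct")
# print(generator(messages, max_new_tokens=128))
```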

Qwen 2.5 Model

We use Qwen2.5-72B-Instruct, a powerful language model with the following capabilities:

Key Features

  • Advanced coding and mathematics capabilities
  • Excellent instruction following
  • Long text generation (over 8K tokens)
  • Structured data understanding (tables, JSON)
  • Support for 29+ languages

Model Specifications

  • 72.7B parameters
  • 80 layers
  • Context length: up to 128K tokens
  • Generation length: up to 8K tokens
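The parameter count above translates directly into a memory budget. A back-of-the-envelope sketch (assuming bfloat16 weights at 2 bytes per parameter, and ignoring the KV cache and activations):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed to hold the model weights alone, in GB."""
    return n_params * bytes_per_param / 1e9

# 72.7B parameters stored in bfloat16 (2 bytes each)
print(round(weight_memory_gb(72.7e9, 2), 1))  # ~145.4 GB just for weights
```

At 4-bit quantization the same arithmetic gives roughly 36 GB, which is why quantized checkpoints are common for single-node deployment.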

Best Practices:

  • Be clear and specific in your instructions
  • Break complex tasks down into steps
  • Specify the desired format for structured outputs
  • Use system prompts to set the context when needed
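The best practices above can be combined into a single request builder. A hypothetical sketch (the helper name and prompts are illustrative; the payload follows the OpenAI-compatible chat schema that servers such as vLLM expose for Qwen models):

```python
def build_request(system_prompt, steps, output_format=None):
    """Combine a system prompt, numbered task steps, and an optional
    output-format instruction into one chat-completion payload."""
    user_prompt = "Complete the following steps:\n" + "\n".join(
        f"{i}. {step}" for i, step in enumerate(steps, start=1)
    )
    if output_format:
        user_prompt += f"\nRespond only with {output_format}."
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

request = build_request(
    "You are a careful data analyst.",
    ["Parse the attached table", "Compute the column totals"],
    output_format="a JSON object",
)
```

Sending this dict to a chat-completions endpoint exercises all four practices at once: an explicit system prompt, a specific instruction, a step-by-step decomposition, and a declared output format.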
