Adapters
Adapters are lightweight, parameter-efficient fine-tuning components that let you customize language models without updating all of the model's weights. Factory automates the creation, training, and management of adapters, making it easy to fine-tune models for specific tasks.
What is an Adapter?
An adapter in Factory is a trained component that:
- Modifies only a small subset of model parameters (typically less than 1%)
- Preserves the base model's general capabilities
- Adds domain-specific knowledge or task abilities
- Requires significantly less compute than full fine-tuning
Factory implements adapters using the LoRA (Low-Rank Adaptation) technique, which adds small trainable matrices to specific layers of the model.
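To make the "less than 1%" claim concrete, here is a back-of-the-envelope calculation for a single weight matrix. The layer size and rank below are illustrative, not Factory defaults:

```python
# Parameter cost of a LoRA adapter on one weight matrix W (d_out x d_in).
# LoRA learns a low-rank update B @ A, where A is (r x d_in) and B is (d_out x r),
# while W itself stays frozen.
d_in, d_out = 4096, 4096   # illustrative projection size
rank = 8                   # illustrative low-rank dimension r

full_params = d_in * d_out            # cost of updating W directly
lora_params = rank * (d_in + d_out)   # cost of A and B together

print(full_params)                                       # 16777216
print(lora_params)                                       # 65536
print(f"{lora_params / full_params:.2%} of the layer")   # 0.39% of the layer
```

Even at rank 8 on a large projection, the trainable update is well under 1% of the layer's parameters, which is why adapter training fits on much smaller hardware than full fine-tuning.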
Creating and Training an Adapter
To create and train an adapter in Factory:
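A minimal sketch of that workflow is shown below. The `factory` module, `Adapter` class, and constructor arguments are assumptions based on the `TrainArgs`/`AdapterArgs` descriptions later in this page, not the verified Factory API:

```python
# Hypothetical sketch -- names are assumptions, not the verified Factory API.
import factory

adapter = factory.Adapter(
    name="support-chat",             # adapter name (re-training under this name creates a new revision)
    model="my-org/base-model",       # base model to adapt
    dataset="my-org/support-chat",   # training dataset
    train_args=factory.TrainArgs(num_train_epochs=3),
    adapter_args=factory.AdapterArgs(rank="auto"),
)
adapter.run()  # downloads, selects layers, estimates rank, trains, checkpoints
```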
When you call `.run()`, Factory:
- Downloads the model and dataset locally
- Performs layer selection to determine which layers to fine-tune
- Estimates the optimal rank for LoRA adapters
- Trains the adapter with the specified hyperparameters
- Periodically evaluates and uploads checkpoints
- Tracks metrics and performance throughout training
Advanced Adapter Configuration
Training Arguments (TrainArgs)
These arguments control the overall training process:
Parameter | Description | Default |
---|---|---|
train_batch_size | Batch size for training | 8 |
eval_batch_size | Batch size for evaluation | 8 |
gradient_accumulation_steps | Number of steps to accumulate gradients before updating model | 1 |
num_train_epochs | Number of epochs to train | 3 |
max_train_steps | Maximum number of training steps (-1 for unlimited) | -1 |
learning_rate | Learning rate for training | 5e-5 |
eval_every_n_minutes | How often to evaluate and checkpoint (minutes) | 10 |
max_eval_samples | Maximum number of samples to use for evaluation | 1000 |
dtype | Data type for training ("fp16", "bf16", etc.) | "fp16" |
quantization_bits | Number of bits for quantization (4 for 4-bit training) | 4 |
attention_implementation | Attention implementation to use ("fa2" for Flash Attention 2) | "fa2" |
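As a sketch, the defaults above can be mirrored in a plain dataclass. Field names and defaults are taken directly from the table; the actual `TrainArgs` class in Factory may differ in details:

```python
from dataclasses import dataclass

@dataclass
class TrainArgs:
    """Mirror of the training defaults in the table above (illustrative)."""
    train_batch_size: int = 8
    eval_batch_size: int = 8
    gradient_accumulation_steps: int = 1
    num_train_epochs: int = 3
    max_train_steps: int = -1          # -1 means no step cap
    learning_rate: float = 5e-5
    eval_every_n_minutes: int = 10
    max_eval_samples: int = 1000
    dtype: str = "fp16"
    quantization_bits: int = 4
    attention_implementation: str = "fa2"

args = TrainArgs(learning_rate=1e-4)   # override just the fields you need
# Effective batch size = per-step batch * accumulation steps
effective_batch = args.train_batch_size * args.gradient_accumulation_steps
print(effective_batch)  # 8
```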
Adapter Arguments (AdapterArgs)
These arguments control the adapter architecture:
Parameter | Description | Default |
---|---|---|
rank | LoRA rank (dimensionality of low-rank matrices) or "auto" | "auto" |
alpha | LoRA scaling factor or "auto" | "auto" |
dropout | Dropout rate for LoRA modules | 0.1 |
layer_selection_percentage | Fraction of layers to select for training (0-1) | 0.5 |
target_modules | List of specific module types to target (None for auto) | None |
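In standard LoRA, `alpha` scales the low-rank update as `alpha / rank`, which keeps the update magnitude comparable as the rank changes. A minimal numeric sketch with toy 2×2 matrices (illustrative, not Factory internals):

```python
# Toy LoRA forward pass: W_eff = W + (alpha / rank) * (B @ A).
# Pure-Python 2x2 example to keep it dependency-free.
rank, alpha = 2, 16
scaling = alpha / rank                      # 8.0

W = [[1.0, 0.0], [0.0, 1.0]]                # frozen base weight
A = [[0.1, 0.0], [0.0, 0.1]]                # trainable (rank x d_in)
B = [[0.1, 0.0], [0.0, 0.1]]                # trainable (d_out x rank)

# B @ A, then add the scaled update to the frozen weight
BA = [[sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(2)]
      for i in range(2)]
W_eff = [[W[i][j] + scaling * BA[i][j] for j in range(2)] for i in range(2)]
print(W_eff)  # diagonal entries are approximately 1.08
```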
Initialization Arguments (InitArgs)
These arguments control the initialization phase:
Parameter | Description | Default |
---|---|---|
n_test_samples | Number of samples used for layer selection and rank estimation | 1000 |
Automatic Parameter Optimization
Factory includes intelligent systems to optimize adapter training:
Automatic Layer Selection
Setting `layer_selection_percentage` (e.g., to 0.5) enables Factory to:
- Analyze model layers to determine their importance for your specific task
- Select the most sensitive layers that have the greatest impact on performance
- Only train adapters for these selected layers, improving efficiency
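The selection step can be pictured as ranking layers by a task-specific sensitivity score and keeping the top fraction. This is an illustrative sketch; the scores are made up, and Factory's actual scoring method is not documented here:

```python
# Keep the top `percentage` of layers by (hypothetical) sensitivity score.
def select_layers(scores: dict[str, float], percentage: float) -> list[str]:
    k = max(1, round(len(scores) * percentage))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

scores = {"layer.0": 0.12, "layer.1": 0.45, "layer.2": 0.31, "layer.3": 0.07}
print(select_layers(scores, 0.5))  # ['layer.1', 'layer.2']
```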
Automatic Rank Determination
When you set `rank="auto"`, Factory:
- Analyzes activation patterns in selected layers
- Computes the optimal rank based on effective dimensionality
- Sets different ranks for different layers based on their complexity
- Balances performance vs. parameter count
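One common way to estimate an "effective" rank is to count how many singular values are needed to capture most of the activation energy. The sketch below works from a given list of singular values; it is illustrative, and Factory's exact estimator is not documented here:

```python
# Smallest rank whose singular values capture `threshold` of total energy.
def effective_rank(singular_values: list[float], threshold: float = 0.9) -> int:
    energies = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(energies)
    running = 0.0
    for i, e in enumerate(energies, start=1):
        running += e
        if running >= threshold * total:
            return i
    return len(energies)

# Energy is dominated by the first two directions, so the effective rank is low:
print(effective_rank([10.0, 5.0, 1.0, 0.5, 0.1]))  # 2
```

Layers whose activations concentrate in a few directions get a small rank, while more complex layers get a larger one, which is how per-layer ranks can differ under `rank="auto"`.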
Adapter Versioning
Like models and datasets, adapters are versioned in Factory:
- AdapterMeta: The main reference for an adapter under a specific name
- AdapterRevision: A specific version of an adapter with trained weights
When you update an adapter by training again with the same name, Factory creates a new revision while maintaining the connection to previous versions.
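The relationship between the two can be sketched as a small data model. This is illustrative only; the field names are assumptions, not Factory's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class AdapterRevision:
    """One trained version of an adapter (illustrative sketch)."""
    version: int
    checkpoint_uri: str

@dataclass
class AdapterMeta:
    """Named adapter that accumulates revisions (illustrative sketch)."""
    name: str
    revisions: list[AdapterRevision] = field(default_factory=list)

    def add_revision(self, checkpoint_uri: str) -> AdapterRevision:
        # Re-training under the same name appends a new revision.
        rev = AdapterRevision(len(self.revisions) + 1, checkpoint_uri)
        self.revisions.append(rev)
        return rev

meta = AdapterMeta("support-chat")
meta.add_revision("hub://adapters/support-chat/1")
rev = meta.add_revision("hub://adapters/support-chat/2")  # second training run
print(rev.version)  # 2
```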
Monitoring Training Progress
During training, Factory automatically:
- Tracks training loss, learning rate, and evaluation metrics
- Monitors GPU utilization and performance statistics
- Creates checkpoints at regular intervals
- Uploads metrics and checkpoints to the Factory Hub
You can monitor all these metrics in real-time in the Factory Hub.
Example Workflow
Here's a complete example of creating, training, and using an adapter:
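The sketch below shows the intended shape of such a workflow end to end, assuming a `factory` SDK with the argument classes described above. All module, class, and method names here are assumptions, not the verified API:

```python
# Hypothetical end-to-end sketch -- not the verified Factory API.
import factory

adapter = factory.Adapter(
    name="legal-summarizer",
    model="my-org/base-model",
    dataset="my-org/legal-summaries",
    train_args=factory.TrainArgs(num_train_epochs=1, train_batch_size=8),
    adapter_args=factory.AdapterArgs(rank="auto", layer_selection_percentage=0.5),
)
adapter.run()  # trains, checkpoints, and uploads metrics to the Factory Hub

# After training, load the latest revision and generate with it:
tuned = factory.load_adapter("legal-summarizer")
print(tuned.generate("Summarize the following contract: ..."))
```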
Best Practices for Adapter Training
- Start small: Begin with a small number of epochs (1-3) to assess performance
- Use automatic optimization: Let Factory select layers and ranks with "auto" settings
- Enable quantization: 4-bit quantization (the default) dramatically reduces memory usage
- Monitor evaluation metrics: Check the Factory Hub to see when performance plateaus
- Experiment with layer percentages: Try different values of `layer_selection_percentage` (0.3-0.7)
- Adjust batch size: If you encounter out-of-memory errors, reduce batch size and increase gradient accumulation
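The batch-size tip works because the effective batch size is `train_batch_size * gradient_accumulation_steps`: halving one while doubling the other preserves the optimizer's view of the batch while reducing peak activation memory per step. A quick check:

```python
# Keep the effective batch size constant while reducing per-step memory.
before = {"train_batch_size": 8, "gradient_accumulation_steps": 1}
after = {"train_batch_size": 4, "gradient_accumulation_steps": 2}

def effective(cfg: dict) -> int:
    return cfg["train_batch_size"] * cfg["gradient_accumulation_steps"]

print(effective(before), effective(after))  # 8 8
```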