Adapters
Adapters are lightweight, parameter-efficient fine-tuning components that let you customize language models without updating all of the model's weights. Factory automates the creation, training, and management of adapters, making it easy to fine-tune models for specific tasks.
What is an Adapter?
An adapter in Factory is a trained component that:
- Modifies only a small subset of model parameters (typically less than 1%)
- Preserves the base model's general capabilities
- Adds domain-specific knowledge or task abilities
- Requires significantly less compute than full fine-tuning
Factory implements adapters using the LoRA (Low-Rank Adaptation) technique, which adds small trainable matrices to specific layers of the model.
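To make the "less than 1%" claim concrete, here is a back-of-the-envelope calculation for a single weight matrix. The layer size and rank below are illustrative, not Factory defaults:

```python
# Parameter cost of a LoRA adapter on one weight matrix W (d_out x d_in).
# LoRA learns a low-rank update B @ A, where A is (r x d_in) and B is (d_out x r),
# while W itself stays frozen.
d_in, d_out = 4096, 4096   # illustrative projection size
rank = 8                   # illustrative low-rank dimension r

full_params = d_in * d_out            # cost of updating W directly
lora_params = rank * (d_in + d_out)   # cost of A and B together

print(full_params)                                       # 16777216
print(lora_params)                                       # 65536
print(f"{lora_params / full_params:.2%} of the layer")   # 0.39% of the layer
```

Even at rank 8 on a large projection, the trainable update is well under 1% of the layer's parameters, which is why adapter training fits on much smaller hardware than full fine-tuning.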
Creating and Training an Adapter
To create and train an adapter in Factory:
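A minimal sketch of that workflow is shown below. The `factory` module, `Adapter` class, and constructor arguments are assumptions based on the `TrainArgs`/`AdapterArgs` descriptions later in this page, not the verified Factory API:

```python
# Hypothetical sketch -- names are assumptions, not the verified Factory API.
import factory

adapter = factory.Adapter(
    name="support-chat",             # adapter name (re-training under this name creates a new revision)
    model="my-org/base-model",       # base model to adapt
    dataset="my-org/support-chat",   # training dataset
    train_args=factory.TrainArgs(num_train_epochs=3),
    adapter_args=factory.AdapterArgs(rank="auto"),
)
adapter.run()  # downloads, selects layers, estimates rank, trains, checkpoints
```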
When you call `.run()`, Factory:
- Downloads the model and dataset locally
- Performs layer selection to determine which layers to fine-tune
- Estimates the optimal rank for LoRA adapters
- Trains the adapter with the specified hyperparameters
- Periodically evaluates and uploads checkpoints
- Tracks metrics and performance throughout training
Advanced Adapter Configuration
Training Arguments (TrainArgs)
These arguments control the overall training process:
Parameter | Description | Default |
---|---|---|
train_batch_size | Batch size for training | 8 |
eval_batch_size | Batch size for evaluation | 8 |
gradient_accumulation_steps | Number of steps to accumulate gradients before updating model | 1 |
num_train_epochs | Number of epochs to train | 3 |
max_train_steps | Maximum number of training steps (-1 for unlimited) | -1 |
learning_rate | Learning rate for training | 5e-5 |
eval_every_n_minutes | How often to evaluate and checkpoint (minutes) | 10 |
max_eval_samples | Maximum number of samples to use for evaluation | 1000 |
dtype | Data type for training ("fp16", "bf16", etc.) | "fp16" |
quantization_bits | Number of bits for quantization (4 for 4-bit training) | 4 |
attention_implementation | Attention implementation to use ("fa2" for Flash Attention 2) | "fa2" |
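As a sketch, the defaults above can be mirrored in a plain dataclass. Field names and defaults are taken directly from the table; the actual `TrainArgs` class in Factory may differ in details:

```python
from dataclasses import dataclass

@dataclass
class TrainArgs:
    """Mirror of the training defaults in the table above (illustrative)."""
    train_batch_size: int = 8
    eval_batch_size: int = 8
    gradient_accumulation_steps: int = 1
    num_train_epochs: int = 3
    max_train_steps: int = -1          # -1 means no step cap
    learning_rate: float = 5e-5
    eval_every_n_minutes: int = 10
    max_eval_samples: int = 1000
    dtype: str = "fp16"
    quantization_bits: int = 4
    attention_implementation: str = "fa2"

args = TrainArgs(learning_rate=1e-4)   # override just the fields you need
# Effective batch size = per-step batch * accumulation steps
effective_batch = args.train_batch_size * args.gradient_accumulation_steps
print(effective_batch)  # 8
```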
Adapter Arguments (AdapterArgs)
These arguments control the adapter architecture:
Parameter | Description | Default |
---|---|---|
rank | LoRA rank (dimensionality of low-rank matrices) or "auto" | "auto" |
alpha | LoRA scaling factor or "auto" | "auto" |
dropout | Dropout rate for LoRA modules | 0.1 |
layer_selection_percentage | Fraction of layers to select for training (0-1) | 0.5 |
target_modules | List of specific module types to target (None for auto) | None |
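In standard LoRA, `alpha` scales the low-rank update as `alpha / rank`, which keeps the update magnitude comparable as the rank changes. A minimal numeric sketch with toy 2×2 matrices (illustrative, not Factory internals):

```python
# Toy LoRA forward pass: W_eff = W + (alpha / rank) * (B @ A).
# Pure-Python 2x2 example to keep it dependency-free.
rank, alpha = 2, 16
scaling = alpha / rank                      # 8.0

W = [[1.0, 0.0], [0.0, 1.0]]                # frozen base weight
A = [[0.1, 0.0], [0.0, 0.1]]                # trainable (rank x d_in)
B = [[0.1, 0.0], [0.0, 0.1]]                # trainable (d_out x rank)

# B @ A, then add the scaled update to the frozen weight
BA = [[sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(2)]
      for i in range(2)]
W_eff = [[W[i][j] + scaling * BA[i][j] for j in range(2)] for i in range(2)]
print(W_eff)  # diagonal entries are approximately 1.08
```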
Initialization Arguments (InitArgs)
These arguments control the initialization phase:
Parameter | Description | Default |
---|---|---|
n_test_samples | Number of samples used for layer selection and rank estimation | 1000 |
Automatic Parameter Optimization
Factory includes intelligent systems to optimize adapter training:
Automatic Layer Selection
Setting `layer_selection_percentage` (e.g., to 0.5) enables Factory to:
- Analyze model layers to determine their importance for your specific task
- Select the most sensitive layers that have the greatest impact on performance
- Only train adapters for these selected layers, improving efficiency
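The selection step can be pictured as ranking layers by a task-specific sensitivity score and keeping the top fraction. This is an illustrative sketch; the scores are made up, and Factory's actual scoring method is not documented here:

```python
# Keep the top `percentage` of layers by (hypothetical) sensitivity score.
def select_layers(scores: dict[str, float], percentage: float) -> list[str]:
    k = max(1, round(len(scores) * percentage))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

scores = {"layer.0": 0.12, "layer.1": 0.45, "layer.2": 0.31, "layer.3": 0.07}
print(select_layers(scores, 0.5))  # ['layer.1', 'layer.2']
```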
Automatic Rank Determination
When you set `rank="auto"`, Factory:
- Analyzes activation patterns in selected layers
- Computes the optimal rank based on effective dimensionality
- Sets different ranks for different layers based on their complexity
- Balances performance vs. parameter count
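One common way to estimate an "effective" rank is to count how many singular values are needed to capture most of the activation energy. The sketch below works from a given list of singular values; it is illustrative, and Factory's exact estimator is not documented here:

```python
# Smallest rank whose singular values capture `threshold` of total energy.
def effective_rank(singular_values: list[float], threshold: float = 0.9) -> int:
    energies = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(energies)
    running = 0.0
    for i, e in enumerate(energies, start=1):
        running += e
        if running >= threshold * total:
            return i
    return len(energies)

# Energy is dominated by the first two directions, so the effective rank is low:
print(effective_rank([10.0, 5.0, 1.0, 0.5, 0.1]))  # 2
```

Layers whose activations concentrate in a few directions get a small rank, while more complex layers get a larger one, which is how per-layer ranks can differ under `rank="auto"`.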
Adapter Versioning
Like models and datasets, adapters are versioned in Factory:
- AdapterMeta: The main reference for an adapter under a specific name
- AdapterRevision: A specific version of an adapter with trained weights
When you update an adapter by training again with the same name, Factory creates a new revision while maintaining the connection to previous versions.
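The relationship between the two can be sketched as a small data model. This is illustrative only; the field names are assumptions, not Factory's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class AdapterRevision:
    """One trained version of an adapter (illustrative sketch)."""
    version: int
    checkpoint_uri: str

@dataclass
class AdapterMeta:
    """Named adapter that accumulates revisions (illustrative sketch)."""
    name: str
    revisions: list[AdapterRevision] = field(default_factory=list)

    def add_revision(self, checkpoint_uri: str) -> AdapterRevision:
        # Re-training under the same name appends a new revision.
        rev = AdapterRevision(len(self.revisions) + 1, checkpoint_uri)
        self.revisions.append(rev)
        return rev

meta = AdapterMeta("support-chat")
meta.add_revision("hub://adapters/support-chat/1")
rev = meta.add_revision("hub://adapters/support-chat/2")  # second training run
print(rev.version)  # 2
```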
Monitoring Training Progress
During training, Factory automatically:
- Tracks training loss, learning rate, and evaluation metrics
- Monitors GPU utilization and performance statistics
- Creates checkpoints at regular intervals
- Uploads metrics and checkpoints to the Factory Hub
You can monitor all these metrics in real-time in the Factory Hub.
Example Workflow
Here's a complete example of creating, training, and using an adapter:
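The sketch below shows the intended shape of such a workflow end to end, assuming a `factory` SDK with the argument classes described above. All module, class, and method names here are assumptions, not the verified API:

```python
# Hypothetical end-to-end sketch -- not the verified Factory API.
import factory

adapter = factory.Adapter(
    name="legal-summarizer",
    model="my-org/base-model",
    dataset="my-org/legal-summaries",
    train_args=factory.TrainArgs(num_train_epochs=1, train_batch_size=8),
    adapter_args=factory.AdapterArgs(rank="auto", layer_selection_percentage=0.5),
)
adapter.run()  # trains, checkpoints, and uploads metrics to the Factory Hub

# After training, load the latest revision and generate with it:
tuned = factory.load_adapter("legal-summarizer")
print(tuned.generate("Summarize the following contract: ..."))
```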
Best Practices for Adapter Training
- Start small: Begin with a small number of epochs (1-3) to assess performance
- Use automatic optimization: Let Factory select layers and ranks with "auto" settings
- Enable quantization: 4-bit quantization (the default) dramatically reduces memory usage
- Monitor evaluation metrics: Check the Factory Hub to see when performance plateaus
- Experiment with layer percentages: Try different values of `layer_selection_percentage` (0.3-0.7)
- Adjust batch size: If you encounter out-of-memory errors, reduce batch size and increase gradient accumulation
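The batch-size tip works because the effective batch size is `train_batch_size * gradient_accumulation_steps`: halving one while doubling the other preserves the optimizer's view of the batch while reducing peak activation memory per step. A quick check:

```python
# Keep the effective batch size constant while reducing per-step memory.
before = {"train_batch_size": 8, "gradient_accumulation_steps": 1}
after = {"train_batch_size": 4, "gradient_accumulation_steps": 2}

def effective(cfg: dict) -> int:
    return cfg["train_batch_size"] * cfg["gradient_accumulation_steps"]

print(effective(before), effective(after))  # 8 8
```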