Model Migration Process
Current Migration Issues
The table below shows current open issues labeled with AI Model Migration
. This provides a live view of ongoing model migration work across GitLab.
display: table
fields: title, author, assignee, milestone, labels, updated
limit: 10
query: label = "AI Model Migration" AND opened = true
Note: This table is dynamically generated using GitLab Query Language (GLQL) when viewing the rendered documentation. It shows up to 10 open issues with the AI Model Migration label, sorted by most recently updated.
Quick Links
- GitLab AI Features - Default GitLab AI Vendor Models: View all features and their current model mappings
- AI Model Version Migration Initiative Epic: Central tracking epic for all model migration work
- AI Gateway Repository: Where model configurations are managed
- Prompt Library: For evaluating models and prompts
Introduction
LLM models are constantly evolving, and GitLab needs to regularly update our AI features to support newer models. This guide provides a structured approach for migrating AI features to new models while maintaining stability and reliability.
Model Migration Timelines
Model migrations typically follow these general timelines:
Simple Model Updates (Same Provider): 1-2 weeks
- Example: Upgrading from Claude Sonnet 3.5 to 3.7
- Involves model validation, testing, and staged rollout
- Primary focus on maintaining stability and performance
Complex Migrations: 1-2 months (full milestone or longer)
- Example: Adding support for a new provider like AWS Bedrock
- Example: Major version upgrades with breaking changes (e.g., Claude 2 to 3)
- Requires significant API integration work
- May need infrastructure changes
Timeline Factors
Several factors can impact migration timelines:
- Current system stability and recent incidents
- Resource availability and competing priorities
- Complexity of behavioral changes in new model
- Scale of testing required
- Feature flag rollout strategy
Best Practices
- Always err on the side of caution with initial timeline estimates
- Use feature flags for gradual rollouts to minimize risk
- Plan for buffer time to handle unexpected issues
- Prioritize system stability over speed of deployment
While some migrations can technically be completed quickly, we typically plan for longer timelines to ensure proper testing and staged rollouts. This approach helps maintain system stability and reliability.
Team Responsibilities
Model migrations involve several teams working together. This section clarifies which teams are responsible for different aspects of the migration process.
RACI Matrix for Model Migrations
Task | AI Framework | Feature Teams | Product | Infrastructure |
---|---|---|---|---|
Model configuration file creation | R/A | C | I | I |
Infrastructure compatibility | R/A | I | I | C |
Feature-specific prompt adjustments | C | R/A | I | I |
Evaluations & testing | C | R/A | I | I |
Feature flag implementation | C | R/A | I | I |
Rollout planning | C | R/A | C | I |
Documentation updates | C | R/A | C | I |
Monitoring & incident response | C | R/A | I | C |
R = Responsible, A = Accountable, C = Consulted, I = Informed
Migration Process
Model Mapping Resource: You can see which features use which models and versions via the GitLab AI Features - Default GitLab AI Vendor Models page.
Standard Migration Process
Initialization
- AI Framework team creates an Issue in the AI Model Version Migration Initiative Epic
- Issue should use the naming convention:
AI Model Migration - Provider/Model/Version
- Apply the
AI Model Migration
label - AI Framework team adds model configuration to AI Gateway
- AI Framework team verifies infrastructure compatibility
Feature Team Implementation
- Feature teams create implementation plans
- Feature teams adjust prompts if needed
- Feature teams implement feature flags for controlled rollout
Testing & Validation
- Feature teams run evaluations against the new model
- AI Framework team provides evaluation support
Deployment
- Feature teams manage feature flag rollout
- Feature teams monitor performance and make adjustments
Completion
- Feature teams remove feature flags when migration is complete
- Feature teams update documentation
Model Deprecation Process
Identification & Planning
- AI Framework team monitors provider announcements
- AI Framework team creates an epic:
Replace discontinued [model] with [replacement]
- Epic should have the
AI Model Migration
label - Set due date at least 2-4 weeks before provider’s cutoff date
- AI Framework team identifies replacement models
Evaluation
- AI Framework team evaluates replacement models
- Feature teams test affected features with candidates
- Teams determine the best replacement model
Implementation
- AI Framework team creates model configuration files
- Feature teams update features to use the replacement model
- Teams implement feature flags for controlled rollout
Testing
- Feature teams run comprehensive evaluations
- Teams document performance metrics
Deployment
- Feature teams manage phased rollout via feature flags
- Teams monitor performance closely
- Rollout expands gradually based on performance
Completion
- Remove feature flags when migration is complete
- Update documentation
- Clean up deprecated model references
Prerequisites for Model Migration
Before starting a model migration:
Create an issue under the AI Model Version Migration Initiative epic:
- Label with
group::ai framework
andAI Model Migration
- Document behavioral changes or improvements
- Include any breaking changes or compatibility issues
- Reference provider documentation
- Label with
Verify model support in AI Gateway:
- Check model definitions:
- For LiteLLM models:
ai_gateway/models/v2/container.py
- For Anthropic models:
ai_gateway/models/anthropic.py
- For new providers: Create new model definition file
- For LiteLLM models:
- Verify configurations (enums, stop tokens, timeouts, etc.)
- Test the model locally:
- Set up the AI gateway development environment
- Configure API keys in
.env
file - Test using Swagger UI at
http://localhost:5052/docs
- Create an issue for new model support if needed
- Review provider API documentation for breaking changes
- Check model definitions:
Ensure access to testing environments and monitoring tools
Complete model evaluation using the Prompt Library
Additional Prerequisites for Model Deprecations
For model deprecations:
Create an epic when a deprecation is announced:
- Label with
group::ai framework
andAI Model Migration
- Document the deprecation timeline
- Include provider migration recommendations
- Reference the deprecation announcement
- List all affected features
- Label with
Evaluate replacement models:
- Document evaluation criteria
- Run comparative evaluations
- Consider regional availability
- Assess infrastructure changes required
Create migration timeline:
- Set completion target at least 2-4 weeks before cutoff
- Include time for each feature update
- Plan for gradual rollout
- Allow time for infrastructure changes
Documentation of model changes and deprecations is crucial for tracking impact and future troubleshooting. Always create an issue before beginning any migration process.
Implementation Guidelines
Feature Team Migration Template
Feature teams should use the AI Model Rollout template to implement model migrations. See an example from our Claude 3.7 Sonnet Code Generation Rollout Plan.
Anthropic Model Migration Tasks
AI Framework Team:
- Add new model to AI gateway configurations
- Verify compatibility with current API specification
- Verify the model works with existing API patterns
- Create model configuration file
- Document model-specific parameters or behaviors
- Verify infrastructure compatibility
- Update model definitions following prompt definition guidelines
Feature Team:
- Add new model to available models list
- Change default model in AI-Gateway client behind feature flag
- Update model references in feature-specific code
- Implement feature flags for controlled rollout
- Test prompts with new model
- Monitor performance during rollout
- Update documentation
While we’re moving toward AI gateway holding the prompts, feature flag implementation still requires a GitLab release.
Vertex Models Migration Tasks
AI Framework Team:
- Activate model in Google Cloud Platform
- Update AI gateway to support new Vertex model
- Document model-specific parameters
Feature Team:
- Update model references in feature-specific code
- Implement feature flags for controlled rollout
- Test prompts with new model
- Monitor performance during rollout
- Update documentation
Feature Flag Implementation
Implementation Steps
For implementing feature flags, refer to our Feature Flags Development Guidelines.
Feature flag implementations will affect self-hosted cloud-connected customers. These customers won’t receive the model upgrade until the feature flag is removed from the AI gateway codebase, as they won’t have access to the new GitLab release.
Model Selection Implementation
Implement model selection logic in:
- AI gateway client (
ee/lib/gitlab/llm/chain/requests/ai_gateway.rb
) - Model definitions in AI gateway
- Any custom implementations in specific features
Rollout Strategy
- Enable feature flag for small percentage of users/groups
- Monitor performance using:
- Gradually increase rollout percentage
- If issues arise, disable feature flag to rollback
- Once stable, remove feature flag
Common Migration Scenarios
Simple Model Version Update (Same Provider)
Example: Upgrading from Claude 3.5 to Claude 3.7
AI Framework Team:
- Create migration issue
- Add model configuration file
- Verify API compatibility
- Ensure infrastructure support
Feature Teams:
- Create implementation issues
- Test prompts with new model
- Implement feature flags
- Monitor performance
- Remove feature flags when stable
New Provider Integration
Example: Adding AWS Bedrock models
AI Framework Team:
- Create integration plan
- Implement provider API in AI gateway
- Create model configuration files
- Update authentication mechanisms
- Document provider-specific parameters
- Evaluate model performance
Feature Teams:
- Evaluate feature quality and performance with the new model
- Adapt prompts for new provider’s models
- Implement feature flags
- Deploy and monitor
- Update documentation
Model Deprecation Response
Example: Replacing discontinued Vertex AI Code Gecko v2
AI Framework Team:
- Create epic to track deprecation
- Evaluate replacement models
- Create model configuration
- Document routing logic
- Verify infrastructure compatibility
Feature Teams:
- Implement routing logic
- Create feature flags for transition
- Run evaluations
- Implement staged rollout
- Monitor performance during transition
Troubleshooting Guide
Prompt Compatibility Issues
If you encounter prompt compatibility issues:
Analyze Errors:
- Enable “expanded AI logging” to capture model responses
- Check for “LLM didn’t follow instructions” errors
- Review model outputs for unexpected patterns
Resolve Issues:
- Create new prompt version (following semantic versioning)
- Test prompt variations in evaluation environment
- Use feature flags to control prompt deployment
- Monitor performance during rollout
Example: Claude 3.5 to 3.7 Migration
For Claude 3.7 migrations:
- Create new version 2.0.0 prompt definition
- Implement feature flag for prompt version control
- Use AI Framework team’s model configuration file
- Run evaluations to verify performance
- Roll out gradually and monitor
AI Framework Team Migration Issue Template
The AI Framework team should create a main migration issue following this template:
# [Model Name] Model Upgrade
## Overview
[Brief description of the new model and its improvements]
## Features to Update
[List of features affected by this migration, organized by category]
### Generally Available Features
- [Feature 1]
- [Feature 2]
### Beta Features
- [Beta Feature 1]
### Experimental Features
- [Experimental Feature 1]
## Required Changes
- Add model configuration file for model flexibility
- New prompt definition created to use the new model
- Feature flag created for controlled rollout
## Technical Details
- [Any technical specifics about this migration]
- [Impact on GitLab.com and self-managed instances]
## Implementation Steps
- [ ] Update model configurations in each feature
- [ ] Verify performance improvements
- [ ] Deploy updates
- [ ] Update documentation
## Timeline
Priority: [Priority level]
## References
- [Model Announcement]
- [Model Documentation]
- [GitLab Documentation]
- [Other relevant links]
## Proposed Solution
[Description of the high-level implementation approach]
## Implementation Details
Please follow the issues below with the associated rollout plans:
| Feature | DRI | ETA | Issue Link |
|---------|-----|-----|------------|
| [Feature 1] | [@username] | [Date] | [Issue link] |
| [Feature 2] | [@username] | [Date] | [Issue link] |
See an example in our Claude 3.7 Model Upgrade issue.
References
Docs
Edit this page to fix an error or add an improvement in a merge request.
Create an issue to suggest an improvement to this page.
Product
Create an issue if there's something you don't like about this feature.
Propose functionality by submitting a feature request.
Feature availability and product trials
View pricing to see all GitLab tiers and features, or to upgrade.
Try GitLab for free with access to all features for 30 days.
Get help
If you didn't find what you were looking for, search the docs.
If you want help with something specific and could use community support, post on the GitLab forum.
For problems setting up or using this feature (depending on your GitLab subscription).
Request support