This page describes the model maintenance policy for the Foundation Model APIs pay-per-token, Foundation Model APIs provisioned throughput, Batch Inference with ai_query, and Foundation Model Fine-tuning offerings.
To continue supporting state-of-the-art models, Databricks manages each model through a lifecycle that progresses from update to deprecation to eventual retirement.
- Update: Databricks applies incremental updates to a model to deliver optimizations. See Model updates.
- Deprecated: A deprecated model is no longer recommended for new workloads, but remains available in workspaces with existing usage of the model. Workspaces that aren't using the model at the time of deprecation no longer have access to it.
- Retired: A retired model is no longer accessible, and support for the model is fully discontinued. Any workload using the model stops working.
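The access rules in the list above can be sketched as a small Python check. This is only an illustration of the policy as described here, not a Databricks API: `ModelState`, `is_model_accessible`, and the `ACTIVE` label for a model that is neither deprecated nor retired are hypothetical names.

```python
from enum import Enum


class ModelState(Enum):
    # ACTIVE is a hypothetical label for a model that is neither deprecated nor retired.
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


def is_model_accessible(state: ModelState, workspace_used_model_at_deprecation: bool) -> bool:
    """Return whether a workspace can still call a model in the given state."""
    if state is ModelState.RETIRED:
        # Retired models are no longer accessible to anyone.
        return False
    if state is ModelState.DEPRECATED:
        # Deprecated models stay available only where they were already in use.
        return workspace_used_model_at_deprecation
    return True
```

For example, `is_model_accessible(ModelState.DEPRECATED, False)` evaluates to `False`, which mirrors the rule that workspaces without existing usage lose access at deprecation.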
Model deprecation policy
When Databricks deprecates a model, that model is no longer recommended and is planned for retirement. Databricks announces retirement dates for deprecated models with the notification timelines summarized in the following sections. Retirement dates might be announced at the time of deprecation or at a later date. After the retirement date, the model is no longer accessible, and any workload using it stops working.
For deprecated and retired models and their announced retirement dates, see Deprecated and retired models. For partner models, see Partner model retirement policy.
Important
The deprecation policies that apply to the Foundation Model APIs pay-per-token and Foundation Model Fine-tuning offerings only impact supported chat and completion models.
Foundation Model APIs
The following table summarizes the deprecation policy for the Foundation Model APIs pay-per-token, Foundation Model APIs provisioned throughput, and Batch Inference with ai_query offerings.
| Deprecation notification | Transition to retirement | On the retirement date |
|---|---|---|
| Databricks takes the following steps to notify customers about a model deprecation: | After deprecating a model, Databricks announces a retirement date three months or more in the future. During this transition period: | The model is no longer available for use and is removed from the product. Any existing workloads using the model stop working. Applicable documentation is updated to indicate that the model is no longer available, and to recommend a replacement model. |
Partner model retirement policy
Partner models are models that third-party partners (specifically OpenAI, Anthropic, and Google) provide through Foundation Model APIs. For these partner models, Databricks generally follows the same deprecation timelines and policies described above.
However, partners might announce retirements with less lead time than the three-month transition period that Databricks publishes. In these cases, Databricks attempts to bridge the gap by temporarily redirecting requests for the retiring model to a similar version, so customers receive the full transition time.
For example, if a partner model retirement is announced with one month's lead time instead of three, Databricks redirects the model for an additional two months to prevent immediate breakage and allow time for migration. Queries fail at the end of the full three-month period.
Note
This redirection can only occur if the replacement model has the same price and is backwards compatible. The replacement model is usually an incremental model version, like 3.0 versus 3.1.
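The arithmetic in the redirect example can be sketched in Python as follows. This assumes the three-month transition is counted in calendar months from the announcement date, which the policy text does not state explicitly. The function names and the dates in the usage note are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def redirect_window(
    announced: date, partner_retirement: date, transition_months: int = 3
) -> Optional[Tuple[date, date]]:
    """Return (redirect_start, redirect_end), or None if no redirect is needed.

    Assumption: the transition period runs for `transition_months` calendar
    months from the announcement date.
    """
    transition_end = add_months(announced, transition_months)
    if partner_retirement >= transition_end:
        # The partner already gives the full transition time.
        return None
    return partner_retirement, transition_end
```

With a hypothetical announcement on January 15, 2026 and a partner retirement one month later, `redirect_window(date(2026, 1, 15), date(2026, 2, 15))` returns February 15 through April 15, 2026, which is the additional two months from the example. The sketch does not model the price and compatibility conditions from the note above.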
Foundation Model Fine-tuning
The following table summarizes the deprecation policy for Foundation Model Fine-tuning.
| Deprecation notification | Transition to retirement | On the retirement date |
|---|---|---|
| Databricks takes the following steps to notify customers about a model deprecation: | After deprecating a model, Databricks announces a retirement date three months or more in the future. During this transition period, customers should migrate their workloads to a recommended replacement model, or delete the affected endpoint. | The model is no longer available for use and is removed from the product. Applicable documentation is updated to recommend using a replacement model. |
Model updates
Databricks might ship incremental model updates to deliver optimizations. When Databricks updates a model, the endpoint URL remains the same, but the model name in the response object changes to include the date of the update. For example, if Databricks ships an update to meta-llama/Meta-Llama-3.3-70B on 3/4/2024, the model name in the response object updates to meta-llama/Meta-Llama-3.3-70B-030424. Databricks maintains a version history of the updates. Reach out to your Databricks account team for more details.
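Client code that compares model names across updates may need to account for the date suffix. The sketch below reproduces the naming pattern from the example, assuming the suffix is a six-digit month, day, and two-digit year (inferred from the single example, reading 3/4/2024 as March 4). The helper names are hypothetical and not part of any Databricks SDK.

```python
import re
from datetime import date


def dated_model_name(base_name: str, update_date: date) -> str:
    """Append an MMDDYY suffix (assumed format) to a base model name."""
    return f"{base_name}-{update_date.strftime('%m%d%y')}"


def strip_update_suffix(model_name: str) -> str:
    """Drop a trailing six-digit date suffix, if one is present."""
    match = re.fullmatch(r"(.+)-(\d{6})", model_name)
    return match.group(1) if match else model_name
```

Under these assumptions, `dated_model_name("meta-llama/Meta-Llama-3.3-70B", date(2024, 3, 4))` yields `meta-llama/Meta-Llama-3.3-70B-030424`, and `strip_update_suffix` maps that string back to the base name so that logic keyed on the base name keeps working after an update.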
Deprecated and retired models
The following sections list models that are deprecated (no longer recommended for new workloads) or retired (end-of-life and no longer available). Retirement dates for deprecated models are announced at least three months in advance.
Foundation Model APIs retirements
The following tables show retired and soon-to-be-retired models, their retirement dates, and recommended replacement models for Foundation Model APIs pay-per-token and provisioned throughput serving workloads. Databricks recommends that you migrate your applications to the replacement models before the indicated retirement date.
Note
OpenAI and Google Gemini models are only available through ADI Services, provided by Databricks.
| Partner model | Retirement date | Recommended replacement model |
|---|---|---|
| OpenAI GPT-5.1 Codex Max | Pay-per-token: July 16, 2026 | OpenAI GPT-5.5 |
| OpenAI GPT-5.1 Codex Mini | Pay-per-token: July 16, 2026 | OpenAI GPT-5.4 Codex Mini |
| OpenAI GPT-5.2 Codex | Pay-per-token: July 16, 2026 | OpenAI GPT-5.5 |
| Anthropic Claude 3.7 Sonnet | Pay-per-token: April 12, 2026 | Use the latest Claude Sonnet model |
| Gemini 3 Pro | Provisioned throughput: March 26, 2026 | Gemini 3.1 Pro. To allow more time for migration, between March 26, 2026 and June 7, 2026, API calls to Gemini 3 Pro will be temporarily redirected to Gemini 3.1 Pro. The pricing for both models is identical. |
| Open model | Retirement date | Recommended replacement model |
|---|---|---|
| Meta Llama 3.1 405B | Pay-per-token: February 15, 2026<br>Provisioned throughput: May 15, 2026 | OpenAI GPT OSS 120B |
| DBRX / DBRX Instruct | Pay-per-token: April 30, 2025<br>Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Mixtral 8x7B / Mixtral-8x7B Instruct | Pay-per-token: April 30, 2025<br>Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Meta Llama 3 (70B) | Pay-per-token: July 23, 2024 (Meta-Llama-3-70B-Instruct); December 11, 2024 (Meta-Llama-3.1-70B-Instruct)<br>Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Meta Llama 3 8B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Meta Llama 2 70B / Meta-Llama-2-70B-Chat | Pay-per-token: October 30, 2024<br>Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Meta Llama 2 13B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Meta Llama 2 7B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| Mistral 7B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| MPT 30B / MPT 30B Instruct | Pay-per-token: August 30, 2024<br>Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
| MPT 7B / MPT 7B Instruct | Pay-per-token: August 30, 2024<br>Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick<br>Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size. |
If you require long-term support for a specific model version, Databricks recommends using Foundation Model APIs provisioned throughput for your serving workloads.
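To track how much migration time remains, the retirement dates in the tables above could be kept in a small lookup. The following Python sketch copies a few rows from those tables as sample data. The `RETIREMENTS` dictionary and `days_until_retirement` function are hypothetical helpers, and the published tables remain the source of truth.

```python
from datetime import date
from typing import Optional

# A subset of retirement dates copied from the tables above,
# keyed by (model, offering). Extend it with the models you use.
RETIREMENTS = {
    ("Meta Llama 3.1 405B", "pay-per-token"): date(2026, 2, 15),
    ("Meta Llama 3.1 405B", "provisioned throughput"): date(2026, 5, 15),
    ("Mistral 7B", "provisioned throughput"): date(2026, 2, 27),
    ("OpenAI GPT-5.2 Codex", "pay-per-token"): date(2026, 7, 16),
}


def days_until_retirement(model: str, offering: str, today: date) -> Optional[int]:
    """Return days left before retirement (negative if already retired).

    Returns None when the (model, offering) pair is not in the lookup.
    """
    retirement = RETIREMENTS.get((model, offering))
    if retirement is None:
        return None
    return (retirement - today).days
```

A scheduled job could call `days_until_retirement` for each model in use and raise an alert when the result drops below a chosen threshold.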
Foundation Model Fine-tuning retirements
The following table shows retired model families, their retirement dates, and recommended replacement model families to use for Foundation Model Fine-tuning workloads. Databricks recommends that you migrate your applications to use replacement models before the indicated retirement date.
| Model family | Retirement date | Recommended replacement model family |
|---|---|---|
| DBRX | April 30, 2025 | Llama-3.1-70B |
| Mixtral | April 30, 2025 | Llama-3.1-70B |
| Mistral | April 30, 2025 | Llama-3.1-8B |
| Meta-Llama-3.1-405B | January 30, 2025 | Llama-3.1-70B |
| Meta-Llama-3 | January 7, 2025 | Meta-Llama-3.1 |
| Meta-Llama-2 | January 7, 2025 | Meta-Llama-3.1 |
| Code Llama | January 7, 2025 | Meta-Llama-3.1 |
Find workloads that use retired models
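Use the following query to find workloads that use a deprecated or retired model and to identify their owners. Replace <retired-model-name> with the model name in lowercase, because the query lowercases entity_name before matching.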
```sql
SELECT
  eu.requester,
  se.endpoint_name,
  se.entity_name,
  COUNT(*) AS request_count,
  SUM(eu.input_token_count) AS total_input_tokens,
  SUM(eu.output_token_count) AS total_output_tokens,
  MIN(eu.request_time) AS first_request,
  MAX(eu.request_time) AS last_request
FROM system.serving.endpoint_usage eu
JOIN system.serving.served_entities se
  ON eu.served_entity_id = se.served_entity_id
WHERE LOWER(se.entity_name) LIKE '%<retired-model-name>%'
GROUP BY eu.requester, se.endpoint_name, se.entity_name
ORDER BY request_count DESC
```
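When checking several models, it can help to generate the query above from a template. The Python sketch below fills in the placeholder after lowercasing the name and rejecting characters outside a conservative allowlist. `build_usage_query` and the allowlist are assumptions for illustration, and the sketch only builds the SQL string without running it.

```python
import re

QUERY_TEMPLATE = """
SELECT
  eu.requester,
  se.endpoint_name,
  se.entity_name,
  COUNT(*) AS request_count,
  SUM(eu.input_token_count) AS total_input_tokens,
  SUM(eu.output_token_count) AS total_output_tokens,
  MIN(eu.request_time) AS first_request,
  MAX(eu.request_time) AS last_request
FROM system.serving.endpoint_usage eu
JOIN system.serving.served_entities se
  ON eu.served_entity_id = se.served_entity_id
WHERE LOWER(se.entity_name) LIKE '%{pattern}%'
GROUP BY eu.requester, se.endpoint_name, se.entity_name
ORDER BY request_count DESC
"""

# Allowlist (an assumption): lowercase letters, digits, space, dot,
# underscore, slash, and hyphen. Note that "_" is also a single-character
# wildcard in LIKE, so it can match slightly more than the literal name.
_SAFE_NAME = re.compile(r"[a-z0-9 ._/-]+")


def build_usage_query(model_name: str) -> str:
    """Return the usage query with the model name substituted in lowercase."""
    pattern = model_name.strip().lower()
    if not _SAFE_NAME.fullmatch(pattern):
        raise ValueError(f"Unsupported characters in model name: {model_name!r}")
    return QUERY_TEMPLATE.format(pattern=pattern)
```

For example, `build_usage_query("DBRX")` produces a query whose filter is `LIKE '%dbrx%'`. Lowercasing matters because the query compares against `LOWER(se.entity_name)`.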