---
model_id: Toto-Open-Base-1.0
tags:
- time-series-forecasting
- foundation models
- pretrained models
- time series foundation models
- time series
- time-series
- timeseries
- transformers
- forecasting
- safetensors
- observability
paper:
- https://arxiv.org/abs/2505.14766
datasets:
- Salesforce/GiftEvalPretrain
- autogluon/chronos_datasets
leaderboards:
- GIFT-Eval
- BOOM
license: apache-2.0
pipeline_tag: time-series-forecasting
---
# Toto-Open-Base-1.0
Toto (Time Series Optimized Transformer for [Observability](https://www.datadoghq.com/knowledge-center/observability/)) is a **state-of-the-art** time-series foundation model designed for multivariate time series forecasting, with an emphasis on observability metrics. Toto efficiently handles the high-dimensional, sparse, and non-stationary data commonly encountered in observability scenarios.
<div style="width: 100%; margin: auto; padding: 1rem;">
<img src="figures/rankings.png" alt="model ranking" style="width: 100%; height: auto;" />
<em style="display: block; margin-top: 0.5rem; text-align: center;">
The average rank of Toto compared to the runner-up models on both the <a href="https://huggingface.co/spaces/Salesforce/GIFT-Eval">GIFT-Eval</a> and <a href="https://huggingface.co/datasets/Datadog/BOOM">BOOM</a> benchmarks (as of May 19, 2025).
</em>
</div>
---
## ✨ Key Features
- **Zero-Shot Forecasting**: Perform forecasting without fine-tuning on your specific time series.
- **High-Dimensional Multivariate Support**: Efficiently process multiple variables using Proportional Factorized Space-Time Attention.
- **Decoder-Only Transformer Architecture**: Support for variable prediction horizons and context lengths (see the sketch after this list).
- **Probabilistic Predictions**: Generate both point forecasts and uncertainty estimates using a Student-T mixture model.
- **Extensive Pretraining on Large-Scale Data**: Trained on over 2 trillion time series data points, the largest pretraining dataset for any open-weights time series foundation model to date.
- **Tailored for Observability Metrics with State-of-the-Art Performance** on [GIFT-Eval](https://huggingface.co/spaces/Salesforce/GIFT-Eval) and [BOOM](https://huggingface.co/datasets/Datadog/BOOM).
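Because decoding is autoregressive, the prediction horizon is a runtime argument rather than an architectural constant. A minimal sketch (the horizon values are arbitrary illustrations; `forecaster` and `inputs` are constructed in the Quick Start below):

```python
# Hedged sketch: the same model serves different horizons per call.
# `forecaster` and `inputs` are built in the Quick Start example below.
short_term = forecaster.forecast(inputs, prediction_length=96, num_samples=256, samples_per_batch=256)
long_term = forecaster.forecast(inputs, prediction_length=672, num_samples=256, samples_per_batch=256)
```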
<div style="width: 100%; margin: auto; padding: 1rem;">
<img src="figures/architecture.png" alt="model architecture" style="width: 100%; height: auto;" />
<em style="display: block; margin-top: 0.5rem; text-align: center;">
Overview of Toto-Open-Base-1.0 architecture.
</em>
</div>
---
## 📚 Training Data Summary
- **Observability Metrics:** ~1 trillion points from Datadog internal systems (no customer data)
- **Public Datasets:**
- [GIFT-Eval Pretrain](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain)
- [Chronos datasets](https://huggingface.co/datasets/autogluon/chronos_datasets)
- **Synthetic Data:** ~1/3 of training data
---
## ⚡ Quick Start: Model Inference
Inference code is available on [GitHub](https://github.com/DataDog/toto).
### Installation
```bash
pip install toto-ts
```
For optimal speed and reduced memory usage, we recommend also installing [xFormers](https://github.com/facebookresearch/xformers) and [flash-attention](https://github.com/Dao-AILab/flash-attention).
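A typical install of these optional accelerators looks like the following (the exact wheels depend on your CUDA and PyTorch versions; consult each project's install docs if these commands fail):

```bash
# Optional: attention accelerators for faster, more memory-efficient inference
pip install xformers
pip install flash-attn --no-build-isolation
```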
### 🚀 Inference Example
Here's how to quickly generate forecasts using Toto:
⚠️ In our study, we take the **median** across 256 samples to produce a point forecast. An earlier version of this tutorial used the **mean**; it has since been updated to match.
```python
import torch
from toto.data.util.dataset import MaskedTimeseries
from toto.inference.forecaster import TotoForecaster
from toto.model.toto import Toto
DEVICE = 'cuda'
# Load pre-trained Toto model
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)
# Optional: compile model for enhanced speed
toto.compile()
forecaster = TotoForecaster(toto.model)
# Example input series (7 variables, 4096 timesteps)
input_series = torch.randn(7, 4096).to(DEVICE)

# Timestamp metadata: zeros as placeholders here, with a 15-minute
# (60 * 15 seconds) sampling interval for each of the 7 variables
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE)

inputs = MaskedTimeseries(
    series=input_series,
    # True everywhere: every timestep is an observed value (no padding)
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    # All-zero IDs: treat all 7 variables as one multivariate series
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)
# Generate probabilistic forecasts for the next 336 timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,        # samples drawn from the predictive distribution
    samples_per_batch=256,  # lower this to reduce peak memory usage
)
# Access results (the median across samples is the recommended point forecast)
median_prediction = forecast.median
prediction_samples = forecast.samples
lower_quantile = forecast.quantile(0.1)
upper_quantile = forecast.quantile(0.9)
```
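As a quick sanity check, you can plot the point forecast alongside its uncertainty band. A minimal matplotlib sketch, assuming the outputs above have shape `(variates, prediction_length)` (verify against your installed version):

```python
import matplotlib.pyplot as plt

# Visualize the first variable: history, median forecast, and 80% interval
history = input_series[0].detach().cpu().numpy()
median = median_prediction[0].detach().cpu().numpy()
lo = lower_quantile[0].detach().cpu().numpy()
hi = upper_quantile[0].detach().cpu().numpy()

t_hist = range(len(history))
t_fcst = range(len(history), len(history) + len(median))

plt.plot(t_hist, history, label="history")
plt.plot(t_fcst, median, label="median forecast")
plt.fill_between(t_fcst, lo, hi, alpha=0.3, label="80% interval")
plt.legend()
plt.show()
```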
For detailed inference instructions, refer to the [inference tutorial notebook](https://github.com/DataDog/toto/blob/main/toto/notebooks/inference_tutorial.ipynb).
---
### 💾 Available Checkpoints
| Checkpoint | Parameters | Config | Size | Notes |
|------------|------------|--------|------|-------|
| [Toto-Open-Base-1.0](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/model.safetensors) | 151M | [Config](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/config.json) | 605 MB | Initial release with SOTA performance |
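To fetch the raw checkpoint files without instantiating the model (e.g., for offline environments), you can use the standard `huggingface_hub` download API; a minimal sketch:

```python
from huggingface_hub import hf_hub_download

# Download the weights and config to the local HF cache and get their paths
weights_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="model.safetensors")
config_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="config.json")
print(weights_path, config_path)
```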
## 🔗 Additional Resources
- **[Research Paper](https://arxiv.org/abs/2505.14766)**
- **[GitHub Repository](https://github.com/DataDog/toto.git)**
- **[Blog Post](https://www.datadoghq.com/blog/ai/toto-boom-unleashed/)**
- **[BOOM Dataset](https://huggingface.co/datasets/Datadog/BOOM)**
---
## 📖 Citation
If you use Toto in your research or applications, please cite our work using the following BibTeX entry:
```bibtex
@misc{cohen2025timedifferentobservabilityperspective,
title={This Time is Different: An Observability Perspective on Time Series Foundation Models},
author={Ben Cohen and Emaad Khwaja and Youssef Doubli and Salahidine Lemaachi and Chris Lettieri and Charles Masson and Hugo Miccinilli and Elise Ramé and Qiqi Ren and Afshin Rostamizadeh and Jean Ogier du Terrail and Anna-Monica Toon and Kan Wang and Stephan Xie and Zongzhe Xu and Viktoriya Zhukova and David Asker and Ameet Talwalkar and Othmane Abou-Amal},
year={2025},
eprint={2505.14766},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.14766},
}
``` |