Pretraining dataset for time series models with no GIFT-Eval leakage.
GiftEvalPretrain is a pretraining corpus aligned with the GIFT-Eval benchmark for time-series forecasting. It supplies 4.5 million series drawn from 88 source datasets that together cover seven domains and 13 sampling frequencies.
The resource supports development of foundation models that can be evaluated on GIFT-Eval without train-test contamination and is hosted for direct use in large-scale time-series pretraining workflows.
Use the 4.5 million series to train transformer-based forecasters on diverse frequencies and domains before fine-tuning on smaller target tasks.
Pretrain on the leakage-free corpus, then evaluate on GIFT-Eval to measure how well models generalize across the 13 frequencies and seven domains without train-test contamination.
Train models on the 17 multivariate collections to handle cross-variable dependencies in real-world settings such as energy or traffic.
from datasets import load_dataset
ds = load_dataset("Salesforce/GiftEvalPretrain")
A large aggregated collection of 71 univariate and 17 multivariate time series datasets designed for pretraining forecasting models while avoiding leakage with GIFT-Eval benchmarks.
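Once a series is loaded, pretraining a forecaster usually means cutting it into (context, target) pairs. The following is a minimal sketch of that step, not part of the dataset's own tooling. The helper name make_windows, the "target" field name, and the toy record are assumptions made for illustration; check the actual column names of the loaded dataset before adapting it.

```python
import numpy as np


def make_windows(series, context_length, prediction_length, stride=1):
    """Slice one univariate series into (context, target) training pairs.

    Hypothetical helper for illustration; not provided by the dataset.
    """
    values = np.asarray(series, dtype=np.float32)
    if values.ndim != 1:
        raise ValueError("expected a 1-D (univariate) series")

    window = context_length + prediction_length
    if len(values) < window:
        # Series too short to yield a single pair: return empty arrays.
        return (
            np.empty((0, context_length), dtype=np.float32),
            np.empty((0, prediction_length), dtype=np.float32),
        )

    # All contiguous windows of length `window`, then keep every `stride`-th.
    views = np.lib.stride_tricks.sliding_window_view(values, window)[::stride]
    contexts = views[:, :context_length].copy()
    targets = views[:, context_length:].copy()
    return contexts, targets


# Toy record standing in for one dataset row; the "target" key is an assumption.
record = {"target": list(range(10))}
contexts, targets = make_windows(
    record["target"], context_length=4, prediction_length=2, stride=2
)
print(contexts.shape, targets.shape)
```

A larger stride reduces overlap between neighboring windows, which can keep very long, high-frequency series from dominating a training batch.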