# TS2Vec

TS2Vec is a universal framework for learning representations of time series through contrastive learning. Introduced in the paper "TS2Vec: Towards Universal Representation of Time Series" (AAAI-22), it learns both timestamp-level and instance-level representations that serve a variety of downstream tasks, including classification, forecasting, and anomaly detection.

The framework employs hierarchical contrastive learning, applying instance-wise and temporal contrasts at multiple scales. Using a dilated convolutional encoder and a contextual-consistency objective, TS2Vec produces robust representations that remain semantically consistent across augmented views of the same series. The learned representations generalize across domains and tasks without task-specific architectural changes.

## TS2Vec Model Initialization

The TS2Vec class is the main interface for creating and training time series representation models. It accepts configuration for input/output dimensions, network depth, device placement, and training hyperparameters.

```python
from ts2vec import TS2Vec
import numpy as np

# Initialize a TS2Vec model for univariate time series
model = TS2Vec(
    input_dims=1,           # Number of input features (1 for univariate)
    output_dims=320,        # Representation dimension
    hidden_dims=64,         # Hidden layer dimension
    depth=10,               # Number of residual blocks
    device='cuda',          # Device for training ('cuda', 'cpu', or a GPU index)
    lr=0.001,               # Learning rate
    batch_size=16,          # Training batch size
    max_train_length=3000,  # Sequences longer than this are cropped for training
    temporal_unit=0         # Minimum hierarchy level for temporal contrast
)

# Initialize for multivariate time series (e.g., 7 features)
multivariate_model = TS2Vec(
    input_dims=7,
    output_dims=320,
    hidden_dims=64,
    depth=10,
    device=0  # Use GPU 0
)
```

## Training with fit()

The fit method trains the model on time series data using hierarchical contrastive learning.
It accepts training data as a 3D numpy array and supports iteration- or epoch-based training, with optional callbacks for checkpointing.

```python
from ts2vec import TS2Vec
import numpy as np

# Generate sample training data: 100 instances, 200 timestamps, 3 features
train_data = np.random.randn(100, 200, 3).astype(np.float32)

model = TS2Vec(
    input_dims=3,
    output_dims=320,
    device='cuda'
)

# Train for a fixed number of iterations
loss_log = model.fit(
    train_data,
    n_iters=200,  # Train for 200 iterations
    verbose=True  # Print the loss after each epoch
)
# Prints e.g.: Epoch #0: loss=1.234

# Alternative: train for a fixed number of epochs
loss_log = model.fit(
    train_data,
    n_epochs=10,
    verbose=True
)

# Training with a callback for checkpointing
def checkpoint_callback(model, loss):
    if model.n_epochs % 5 == 0:
        model.save(f'checkpoint_epoch_{model.n_epochs}.pkl')

model_with_callback = TS2Vec(
    input_dims=3,
    output_dims=320,
    device='cuda',
    after_epoch_callback=checkpoint_callback
)
loss_log = model_with_callback.fit(train_data, n_epochs=20, verbose=True)
```

## Encoding with encode()

The encode method computes representations for time series data. It supports multiple encoding modes, including timestamp-level, instance-level, multiscale, and sliding-window inference for efficient processing of long sequences.
```python
from ts2vec import TS2Vec
import numpy as np

# Sample test data: 50 instances, 150 timestamps, 3 features
test_data = np.random.randn(50, 150, 3).astype(np.float32)

# Assume the model is already trained
model = TS2Vec(input_dims=3, output_dims=320, device='cuda')
# model.fit(train_data, n_iters=200)

# Timestamp-level representations (default)
repr_timestamp = model.encode(test_data)
# Shape: (50, 150, 320) - one representation per timestamp

# Instance-level representations (pooled over the entire series)
repr_instance = model.encode(test_data, encoding_window='full_series')
# Shape: (50, 320) - one representation per instance

# Multiscale representations
repr_multiscale = model.encode(test_data, encoding_window='multiscale')
# Shape: (50, 150, 320 * num_scales) - concatenated multiscale features

# Fixed-window pooling
repr_windowed = model.encode(test_data, encoding_window=10)
# Shape: (50, 150, 320) - pooled over windows of size 10

# Sliding inference for long sequences (causal mode for forecasting)
repr_sliding = model.encode(
    test_data,
    causal=True,         # Only use past information
    sliding_length=1,    # Slide by 1 timestamp
    sliding_padding=50,  # Use 50 timestamps of context
    batch_size=256
)
# Shape: (50, 150, 320) - each timestamp uses only the [t-50, t] context

# Apply masking during encoding
repr_masked = model.encode(test_data, mask='mask_last')
# Masks the last timestamp (useful for anomaly detection)
```

## Saving and Loading Models

The save and load methods persist trained models to disk and restore them for later use. Models are saved as PyTorch state dictionaries.
```python
from ts2vec import TS2Vec
import numpy as np

# Train and save a model
train_data = np.random.randn(100, 200, 1).astype(np.float32)
model = TS2Vec(input_dims=1, output_dims=320, device='cuda')
model.fit(train_data, n_iters=200, verbose=True)

# Save the trained model
model.save('trained_model.pkl')

# Load the model for inference (constructor arguments must match the saved model)
loaded_model = TS2Vec(input_dims=1, output_dims=320, device='cuda')
loaded_model.load('trained_model.pkl')

# Use the loaded model for encoding
test_data = np.random.randn(10, 200, 1).astype(np.float32)
representations = loaded_model.encode(test_data, encoding_window='full_series')
print(f"Representations shape: {representations.shape}")  # (10, 320)
```

## Loading UCR Classification Datasets

The load_UCR function loads univariate time series classification datasets from the UCR archive. It handles data loading, label transformation, and optional normalization automatically.

```python
import datautils

# Load the ECG200 dataset
train_data, train_labels, test_data, test_labels = datautils.load_UCR('ECG200')
print(f"Train data shape: {train_data.shape}")      # (100, 96, 1)
print(f"Train labels shape: {train_labels.shape}")  # (100,)
print(f"Test data shape: {test_data.shape}")        # (100, 96, 1)
print(f"Unique labels: {set(train_labels)}")        # {0, 1}

# Load another dataset - GunPoint
train_data, train_labels, test_data, test_labels = datautils.load_UCR('GunPoint')
print(f"GunPoint train shape: {train_data.shape}")  # (50, 150, 1)

# Load FordA (a larger dataset)
train_data, train_labels, test_data, test_labels = datautils.load_UCR('FordA')
print(f"FordA train shape: {train_data.shape}")  # (3601, 500, 1)
```

## Loading UEA Multivariate Datasets

The load_UEA function loads multivariate time series classification datasets from the UEA archive. It parses ARFF files, applies standard scaling, and maps labels to integer indices.
```python
import datautils

# Load the BasicMotions multivariate dataset
train_data, train_labels, test_data, test_labels = datautils.load_UEA('BasicMotions')
print(f"Train data shape: {train_data.shape}")         # (40, 100, 6) - 6 channels
print(f"Train labels shape: {train_labels.shape}")     # (40,)
print(f"Number of classes: {len(set(train_labels))}")  # 4

# Load ArticularyWordRecognition
train_data, train_labels, test_data, test_labels = datautils.load_UEA('ArticularyWordRecognition')
print(f"ArticularyWordRecognition shape: {train_data.shape}")  # (275, 144, 9)
```

## Loading Forecasting Datasets

The load_forecast_csv and load_forecast_npy functions load time series forecasting datasets with automatic train/validation/test splitting, time-feature extraction, and normalization.

```python
import datautils

# Load the ETTh1 dataset for multivariate forecasting
data, train_slice, valid_slice, test_slice, scaler, pred_lens, n_covariate_cols = \
    datautils.load_forecast_csv('ETTh1')

print(f"Data shape: {data.shape}")         # (1, 17420, 14) - 7 features + 7 time covariates
print(f"Train slice: {train_slice}")       # slice(None, 8640)
print(f"Valid slice: {valid_slice}")       # slice(8640, 11520)
print(f"Test slice: {test_slice}")         # slice(11520, 14400)
print(f"Prediction lengths: {pred_lens}")  # [24, 48, 168, 336, 720]
print(f"Covariate columns: {n_covariate_cols}")  # 7

# Extract training data
train_data = data[:, train_slice]
print(f"Training data shape: {train_data.shape}")  # (1, 8640, 14)

# Load for univariate forecasting
data_univar, _, _, _, _, _, _ = datautils.load_forecast_csv('ETTh1', univar=True)
print(f"Univariate data shape: {data_univar.shape}")  # (1, 17420, 8) - 1 target + 7 covariates

# Load the electricity dataset (each variable becomes a separate instance)
data_elec, train_slice, valid_slice, test_slice, scaler, pred_lens, n_cov = \
    datautils.load_forecast_csv('electricity')
print(f"Electricity shape: {data_elec.shape}")  # (321, 26304, 8) - 321 households
```

## Loading Anomaly Detection Datasets
The load_anomaly function loads anomaly detection datasets containing multiple labeled, timestamped series. The gen_ano_train_data function prepares the loaded series for model training.

```python
import datautils

# Load the Yahoo anomaly detection dataset
all_train_data, all_train_labels, all_train_timestamps, \
all_test_data, all_test_labels, all_test_timestamps, delay = \
    datautils.load_anomaly('yahoo')

print(f"Number of series: {len(all_train_data)}")
print(f"Delay threshold: {delay}")

# Examine individual series
for key in list(all_train_data.keys())[:3]:
    print(f"Series {key}: train={len(all_train_data[key])}, test={len(all_test_data[key])}")

# Generate training data for the model
train_data = datautils.gen_ano_train_data(all_train_data)
print(f"Training data shape: {train_data.shape}")  # (num_series, max_length, 1)

# Load the KPI dataset
all_train_data, all_train_labels, all_train_timestamps, \
all_test_data, all_test_labels, all_test_timestamps, delay = \
    datautils.load_anomaly('kpi')
```

## Classification Evaluation

The eval_classification function evaluates time series representations on classification tasks using SVM, linear (logistic regression), or k-NN classifiers. It computes accuracy and AUPRC metrics.
```python
import datautils
import tasks
from ts2vec import TS2Vec

# Load a dataset and train a model
train_data, train_labels, test_data, test_labels = datautils.load_UCR('ECG200')
model = TS2Vec(input_dims=1, output_dims=320, device='cuda')
model.fit(train_data, n_iters=200, verbose=True)

# Evaluate with an SVM classifier
predictions, eval_results = tasks.eval_classification(
    model, train_data, train_labels, test_data, test_labels,
    eval_protocol='svm'
)
print(f"Accuracy: {eval_results['acc']:.4f}")
print(f"AUPRC: {eval_results['auprc']:.4f}")
# Example output: Accuracy: 0.8700, AUPRC: 0.9234

# Evaluate with a linear classifier
_, eval_results_linear = tasks.eval_classification(
    model, train_data, train_labels, test_data, test_labels,
    eval_protocol='linear'
)
print(f"Linear accuracy: {eval_results_linear['acc']:.4f}")

# Evaluate with a k-NN classifier
_, eval_results_knn = tasks.eval_classification(
    model, train_data, train_labels, test_data, test_labels,
    eval_protocol='knn'
)
print(f"KNN accuracy: {eval_results_knn['acc']:.4f}")
```

## Forecasting Evaluation

The eval_forecasting function evaluates time series representations on forecasting tasks. It uses sliding inference with causal encoding and fits ridge regression models for multiple prediction horizons.
```python
import datautils
import tasks
from ts2vec import TS2Vec

# Load the ETTh1 forecasting dataset
data, train_slice, valid_slice, test_slice, scaler, pred_lens, n_covariate_cols = \
    datautils.load_forecast_csv('ETTh1')

# Train a model on the training split
train_data = data[:, train_slice]
model = TS2Vec(
    input_dims=train_data.shape[-1],
    output_dims=320,
    device='cuda',
    max_train_length=3000
)
model.fit(train_data, n_iters=200, verbose=True)

# Evaluate forecasting performance
out_log, eval_results = tasks.eval_forecasting(
    model, data, train_slice, valid_slice, test_slice,
    scaler, pred_lens, n_covariate_cols
)

# Print results for each prediction length
print("Forecasting Results (normalized):")
for pred_len in pred_lens:
    mse = eval_results['ours'][pred_len]['norm']['MSE']
    mae = eval_results['ours'][pred_len]['norm']['MAE']
    print(f"  Horizon {pred_len}: MSE={mse:.4f}, MAE={mae:.4f}")
# Example output:
# Forecasting Results (normalized):
#   Horizon 24: MSE=0.0523, MAE=0.1734
#   Horizon 48: MSE=0.0687, MAE=0.1956
#   Horizon 168: MSE=0.1234, MAE=0.2567
#   Horizon 336: MSE=0.1567, MAE=0.2934
#   Horizon 720: MSE=0.2134, MAE=0.3456

print(f"\nInference time: {eval_results['ts2vec_infer_time']:.2f}s")
```

## Anomaly Detection Evaluation

The eval_anomaly_detection and eval_anomaly_detection_coldstart functions evaluate time series representations for detecting anomalies. They compare representations computed with and without masking to score anomalous points.
```python
import datautils
import tasks
from ts2vec import TS2Vec

# Load an anomaly detection dataset
all_train_data, all_train_labels, all_train_timestamps, \
all_test_data, all_test_labels, all_test_timestamps, delay = \
    datautils.load_anomaly('yahoo')

# Prepare training data
train_data = datautils.gen_ano_train_data(all_train_data)

# Train a model
model = TS2Vec(
    input_dims=1,
    output_dims=320,
    device='cuda'
)
model.fit(train_data, n_iters=200, verbose=True)

# Standard anomaly detection evaluation
predictions, eval_results = tasks.eval_anomaly_detection(
    model,
    all_train_data, all_train_labels, all_train_timestamps,
    all_test_data, all_test_labels, all_test_timestamps,
    delay
)

print("Anomaly Detection Results:")
print(f"  F1 Score: {eval_results['f1']:.4f}")
print(f"  Precision: {eval_results['precision']:.4f}")
print(f"  Recall: {eval_results['recall']:.4f}")
print(f"  Inference time: {eval_results['infer_time']:.2f}s")
# Example output:
#   F1 Score: 0.7234
#   Precision: 0.6890
#   Recall: 0.7612
#   Inference time: 12.34s

# Cold-start anomaly detection (no target-specific training data)
train_data_coldstart, _, _, _ = datautils.load_UCR('FordA')
model_coldstart = TS2Vec(input_dims=1, output_dims=320, device='cuda')
model_coldstart.fit(train_data_coldstart, n_iters=200)

predictions_cs, eval_results_cs = tasks.eval_anomaly_detection_coldstart(
    model_coldstart,
    all_train_data, all_train_labels, all_train_timestamps,
    all_test_data, all_test_labels, all_test_timestamps,
    delay
)
print(f"\nCold-start F1: {eval_results_cs['f1']:.4f}")
```

## Command-Line Training Interface

The train.py script provides a command-line interface for training and evaluating TS2Vec models on various datasets. It supports classification, forecasting, and anomaly detection tasks with customizable hyperparameters.
```bash
# Train and evaluate on a UCR classification dataset
python train.py ECG200 my_experiment \
    --loader UCR \
    --batch-size 8 \
    --repr-dims 320 \
    --gpu 0 \
    --epochs 100 \
    --eval

# Train on a UEA multivariate classification dataset
python train.py BasicMotions experiment_uea \
    --loader UEA \
    --batch-size 16 \
    --repr-dims 320 \
    --gpu 0 \
    --eval

# Train for time series forecasting
python train.py ETTh1 forecast_exp \
    --loader forecast_csv \
    --batch-size 8 \
    --repr-dims 320 \
    --max-train-length 3000 \
    --gpu 0 \
    --eval

# Train for univariate forecasting
python train.py ETTh1 univar_forecast \
    --loader forecast_csv_univar \
    --batch-size 8 \
    --gpu 0 \
    --eval

# Train for anomaly detection
python train.py yahoo anomaly_exp \
    --loader anomaly \
    --batch-size 8 \
    --repr-dims 320 \
    --gpu 0 \
    --eval

# Cold-start anomaly detection (transfer learning)
python train.py yahoo coldstart_exp \
    --loader anomaly_coldstart \
    --batch-size 8 \
    --gpu 0 \
    --eval

# Training with custom hyperparameters and checkpointing
python train.py FordA custom_exp \
    --loader UCR \
    --batch-size 16 \
    --lr 0.0005 \
    --repr-dims 512 \
    --max-train-length 1000 \
    --iters 500 \
    --save-every 100 \
    --seed 42 \
    --gpu 0 \
    --eval

# Handle irregular/missing data
python train.py ECG200 irregular_exp \
    --loader UCR \
    --irregular 0.1 \
    --gpu 0 \
    --eval
```

## TSEncoder Neural Network Architecture

The TSEncoder class implements the encoder network, using dilated convolutions to extract hierarchical features from time series. It supports several masking strategies for contrastive learning and handles missing data gracefully.
```python
from models import TSEncoder
import torch

# Initialize the encoder directly (for advanced use cases)
encoder = TSEncoder(
    input_dims=3,         # Number of input features
    output_dims=320,      # Representation dimension
    hidden_dims=64,       # Hidden layer size
    depth=10,             # Number of dilated conv layers
    mask_mode='binomial'  # Default masking strategy
)

# Forward pass with sample data
batch_size, seq_len, features = 32, 100, 3
x = torch.randn(batch_size, seq_len, features)

# During training (applies the binomial mask automatically)
encoder.train()
output_train = encoder(x)
print(f"Training output shape: {output_train.shape}")  # (32, 100, 320)

# During inference (no mask applied)
encoder.eval()
output_eval = encoder(x)
print(f"Eval output shape: {output_eval.shape}")  # (32, 100, 320)

# Custom masking modes
output_continuous = encoder(x, mask='continuous')  # Continuous segment masking
output_mask_last = encoder(x, mask='mask_last')    # Mask only the last timestamp
output_all_true = encoder(x, mask='all_true')      # No masking

# Handle data with NaN values (missing observations)
x_with_nan = x.clone()
x_with_nan[0, 10:20, :] = float('nan')  # Missing segment
output_nan = encoder(x_with_nan)  # NaN positions are masked out automatically
```

## Hierarchical Contrastive Loss

The hierarchical_contrastive_loss function implements the core training objective, combining instance-level and temporal contrastive losses at multiple scales through a max-pooling hierarchy.
```python
from models.losses import hierarchical_contrastive_loss, instance_contrastive_loss, temporal_contrastive_loss
import torch

# Sample encoder outputs for two augmented views
batch_size, seq_len, repr_dim = 16, 64, 320
z1 = torch.randn(batch_size, seq_len, repr_dim, requires_grad=True)
z2 = torch.randn(batch_size, seq_len, repr_dim, requires_grad=True)

# Compute the hierarchical contrastive loss (default alpha=0.5)
loss = hierarchical_contrastive_loss(z1, z2)
print(f"Hierarchical loss: {loss.item():.4f}")

# alpha balances instance vs. temporal contrast:
# alpha=1.0 uses only instance contrast, alpha=0.0 only temporal contrast
loss_instance_heavy = hierarchical_contrastive_loss(z1, z2, alpha=0.8)
loss_temporal_heavy = hierarchical_contrastive_loss(z1, z2, alpha=0.2)

# Set temporal_unit to skip the finest levels for long sequences
loss_long_seq = hierarchical_contrastive_loss(z1, z2, temporal_unit=2)

# Individual loss components for analysis
inst_loss = instance_contrastive_loss(z1, z2)
temp_loss = temporal_contrastive_loss(z1, z2)
print(f"Instance loss: {inst_loss.item():.4f}")
print(f"Temporal loss: {temp_loss.item():.4f}")

# Backpropagation
loss.backward()
print(f"Gradient computed: z1.grad.shape = {z1.grad.shape}")
```

## Summary

TS2Vec provides a comprehensive framework for learning universal time series representations through hierarchical contrastive learning. The primary use cases are: (1) time series classification, where instance-level representations are extracted and fed to conventional classifiers such as SVMs or linear models; (2) time series forecasting, using causal sliding-window encoding combined with ridge regression for multi-horizon prediction; and (3) anomaly detection, which compares masked and unmasked representations to identify abnormal points. The framework handles both univariate and multivariate time series, supports missing data through NaN handling, and scales to long sequences via sliding inference.
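The masked-vs-unmasked comparison behind use case (3) can be sketched in a few lines. The two arrays below are random stand-ins for encoder outputs (in the real workflow they would come from `model.encode(x)` and `model.encode(x, mask='mask_last')`), and the 4-sigma threshold is purely illustrative, not the library's scoring protocol:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 500, 320

# Stand-ins for timestamp-level representations of one series:
# repr_full   ~ model.encode(x)                    (each point visible to the encoder)
# repr_masked ~ model.encode(x, mask='mask_last')  (point hidden from the encoder)
repr_full = rng.normal(size=(T, d))
repr_masked = repr_full + rng.normal(scale=0.01, size=(T, d))

# Inject a synthetic anomaly: the two views disagree strongly at t=250
repr_masked[250] += 5.0

# Anomaly score per timestamp: mean absolute difference between the views
score = np.abs(repr_full - repr_masked).mean(axis=1)

# A simple illustrative threshold (mean + 4 std) flags the anomalous point
threshold = score.mean() + 4 * score.std()
flagged = np.where(score > threshold)[0]
print(flagged)  # contains index 250
```

The intuition: for normal points, hiding a timestamp barely changes its representation because the context predicts it well; for anomalous points, the masked and unmasked views diverge sharply.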
Integration with existing workflows is straightforward through the modular API design. Users can leverage pre-built data loaders for popular benchmarks (UCR, UEA, ETT, Yahoo, KPI) or prepare custom datasets as numpy arrays with shape (n_instances, n_timestamps, n_features). The trained encoder produces fixed-dimensional representations that can be directly consumed by downstream models, enabling transfer learning across different time series tasks. For production deployments, models can be saved and loaded using standard PyTorch serialization, and the command-line interface facilitates experimentation with different hyperparameters and evaluation protocols.
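As a closing sketch of the downstream-classifier pattern described above: here a random array (with a synthetic class shift so the toy classifier has something to learn) stands in for the `(n_instances, 320)` output of `model.encode(data, encoding_window='full_series')` from a trained model:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n_instances, repr_dim = 200, 320

# Stand-in for instance-level TS2Vec representations:
# reprs = model.encode(data, encoding_window='full_series')  # (n_instances, 320)
reprs = rng.normal(size=(n_instances, repr_dim)).astype(np.float32)
labels = rng.integers(0, 2, size=n_instances)
reprs[labels == 1] += 0.5  # synthetic shift so the two classes are separable

# Fixed-dimensional representations plug directly into any sklearn classifier
X_train, X_test, y_train, y_test = train_test_split(
    reprs, labels, test_size=0.25, random_state=0
)
clf = SVC(kernel='linear').fit(X_train, y_train)
acc = clf.score(X_test, y_test)
print(f"Downstream accuracy: {acc:.3f}")
```

Swapping in real encoded data requires no changes to the classifier side, which is what makes the representations reusable across tasks.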