
ROCKET: Fast and Accurate Time Series Classification

A state-of-the-art algorithm for time series classification with Python

Sep 27 · 5 min read

“The task of time series classification can be thought of as involving learning or detecting signals or patterns within time series associated with relevant classes.” (Dempster et al., 2020, authors of the ROCKET paper)

Most time series classification methods with state-of-the-art (SOTA) accuracy have high computational complexity and scale poorly. This means they are slow to train on smaller datasets and effectively unusable on large datasets.

ROCKET (RandOM Convolutional KErnel Transform) can achieve the same level of accuracy as competing SOTA algorithms, including convolutional neural networks, in a fraction of the time. The algorithms were evaluated on the benchmark datasets in the UCR Archive.

ROCKET first transforms the time series dataset using random convolutional kernels, such as those used in a CNN, and then trains a linear classifier with these features.

How much faster is ROCKET? Training and testing ROCKET sequentially on 85 benchmark datasets took 1 hour 40 minutes. For the same task, the next fastest SOTA algorithm (cBOSS) took 19 hours 33 minutes. For more comparisons on speed, see the paper.

In the remainder of this article, I will:

Discuss alternative time series classifiers
Explain how ROCKET works
Provide a Python code example

What are the alternatives?

Other methods for time series classification usually rely on specific representations of series, such as shape, frequency, or variance. The convolutional kernels of ROCKET replace this engineered feature extraction with a single mechanism that can capture many of the same features.

Survey of time series classification

Time series transformation is a foundational idea of time series classification. Many time-series-specific algorithms are compositions of transformed time series and conventional classification algorithms, such as those in scikit-learn.

For an introductory survey of time series classification algorithms, see my earlier article.

A Brief Survey of Time Series Classification Algorithms (towardsdatascience.com): dedicated algorithms specifically designed for classifying time series.

Competing SOTA methods

The following methods strive to improve upon the speed and accuracy of the algorithms described in the Survey above.

Proximity Forest is an ensemble of decision trees that are split on an elastic distance measure.
TS-CHIEF extends Proximity Forest by using dictionary-based and interval-based splitting criteria.
InceptionTime is an ensemble of 5 deep CNNs based on the Inception architecture.
Mr-SEQL applies a linear classifier to features extracted by symbolic representations of time series (SAX, SFA).
cBOSS, or contractable BOSS, is a dictionary-based classifier based on the SFA transform.
catch22 is a set of 22 pre-selected time series transformations that can be passed to a classifier.

How does ROCKET work?

ROCKET first transforms a time series using convolutional kernels and then passes the transformed data to a linear classifier.

Convolutional Kernels

The convolutional kernels, the same as those found in convolutional neural networks, are initialized with random length, weights, bias, dilation, and padding. See the paper for how the random parameters are sampled — they are part of ROCKET and the sampling does not need to be tuned. The stride is always one. ROCKET does not apply non-linear transforms, such as ReLU, on the resulting features.

A guide to convolution arithmetic for deep learning (arxiv.org)

ROCKET uses a very large number of kernels; the default is 10,000. Using so many is feasible because the convolutions are cheap to compute: the kernel weights are not “learned” and there is only a single layer of convolutions.

Unlike typical CNNs, ROCKET uses a variety of kernels. The random lengths, dilations, paddings, weights, and biases allow ROCKET to capture a wide range of information. In particular, the variety of kernel dilation allows ROCKET to capture patterns at different frequencies and scales.

These random kernels, in combination, are able to capture features relevant to time series classification. Alone, a single random convolutional kernel may only weakly capture a useful feature from a time series.
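To make the random initialization concrete, here is a minimal sketch of how a single kernel could be drawn. It assumes the sampling scheme as I understand it from the paper (length chosen from 7, 9, or 11; mean-centered normal weights; uniform bias; dilation on an exponential scale; padding applied half of the time). The function name sample_kernel, the dictionary layout, and the series length of 251 are my own illustrative choices, not part of sktime, and the series is assumed to be longer than the longest kernel.

```python
import math
import random


def sample_kernel(input_length, rng):
    """Draw one random kernel (hypothetical helper, not part of sktime)."""
    # Kernel length is picked uniformly from three short odd lengths.
    length = rng.choice([7, 9, 11])

    # Weights come from a standard normal and are then mean-centered.
    weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean = sum(weights) / length
    weights = [w - mean for w in weights]

    # Bias is uniform on [-1, 1].
    bias = rng.uniform(-1.0, 1.0)

    # Dilation is 2**x, with x capped so the dilated kernel still fits
    # inside the series (assumes input_length > 11).
    max_exponent = math.log2((input_length - 1) / (length - 1))
    dilation = int(2 ** rng.uniform(0.0, max_exponent))

    # Number of time steps the dilated kernel spans.
    span = (length - 1) * dilation + 1

    # Zero padding is switched on or off with equal probability.
    padding = (span - 1) // 2 if rng.random() < 0.5 else 0

    return {
        "length": length,
        "weights": weights,
        "bias": bias,
        "dilation": dilation,
        "padding": padding,
        "span": span,
    }


rng = random.Random(111)
kernels = [sample_kernel(input_length=251, rng=rng) for _ in range(1000)]

# The distinct dilations show the range of scales the kernels cover.
print(sorted({k["dilation"] for k in kernels}))
```

A kernel with dilation 1 looks at 7 to 11 consecutive points, while a heavily dilated kernel spreads the same number of weights across most of the series. That spread is what lets one set of random kernels respond to both fast and slow patterns.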

The Convolutional Kernel Transform

Each kernel is convolved with each time series to produce a feature map. The kernel’s feature map is aggregated to produce two features per kernel: the maximum value and proportion of positive values.

The maximum value feature is similar to global max pooling.

The proportion of positive values indicates how to weight the prevalence of a pattern captured by the kernel. This value is the most critical element of ROCKET that contributes to its high accuracy.

Figure: the proportion of positive values, where z_i is the output of the convolution operation.
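The two features can be sketched in a few lines of plain Python. This is an illustration under my own assumptions rather than the sktime implementation: apply_kernel is a hypothetical helper, the kernel weights are hand-picked instead of random, and the series is assumed to be at least as long as the dilated kernel.

```python
def apply_kernel(series, weights, bias, dilation=1, padding=0):
    """Return (max, proportion of positive values) for one kernel on one series."""
    # Zero-pad both ends of the series.
    padded = [0.0] * padding + list(series) + [0.0] * padding

    # Number of time steps covered by the dilated kernel.
    span = (len(weights) - 1) * dilation + 1

    # Slide the kernel along the series with a stride of one.
    feature_map = []
    for start in range(len(padded) - span + 1):
        z = bias
        for j, w in enumerate(weights):
            z += w * padded[start + j * dilation]
        feature_map.append(z)

    maximum = max(feature_map)
    ppv = sum(1 for z in feature_map if z > 0) / len(feature_map)
    return maximum, ppv


# A small series that rises and then falls.
series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]

# A hand-picked kernel that responds to rising values.
maximum, ppv = apply_kernel(series, weights=[-1.0, 0.0, 1.0], bias=0.0)
print(maximum, ppv)  # 2.0 0.4
```

Here the feature map is [2, 2, 0, -2, -2]: the maximum reports how strongly the rising pattern matches at its best position, while the proportion of positive values reports that it matches at 2 of the 5 positions.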

Linear Classification

For smaller datasets, the authors recommend a ridge regression classifier because its regularization parameter can be cross-validated quickly and it has no other hyperparameters.

Regularization is critical when the number of features exceeds the number of training examples, as is often the case with small datasets. (By default, ROCKET uses 10,000 kernels and generates two features per kernel, resulting in 20,000 features.)

For large datasets, the authors recommend logistic regression with stochastic gradient descent due to scalability.

In “large” datasets, the number of training examples is much larger than the number of extracted features.
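The arithmetic behind this choice is simple enough to write down. The sketch below uses two hypothetical helpers of my own, and it assumes the cut-off of 20,000 training examples that the sktime authors suggest; treat that cut-off as a rule of thumb rather than a hard limit.

```python
def rocket_feature_count(num_kernels=10_000):
    """Two features (maximum and proportion of positive values) per kernel."""
    return 2 * num_kernels


def suggest_classifier(n_train, small_data_limit=20_000):
    """Name the scikit-learn classifier suggested for a given training set size.

    Hypothetical helper: small_data_limit is an assumed rule-of-thumb cut-off.
    """
    if n_train < small_data_limit:
        return "RidgeClassifierCV"
    return "SGDClassifier"


print(rocket_feature_count())         # 20000
print(suggest_classifier(175))        # RidgeClassifierCV
print(suggest_classifier(1_000_000))  # SGDClassifier
```

With 175 training series and 20,000 features, there are far more features than examples, which is the situation where the ridge penalty matters most.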

How to use ROCKET with Python?

The ROCKET transform is implemented in the sktime Python package.

Sktime: a Unified Python Library for Time Series Machine Learning (link.medium.com)

The following code example is adapted from the sktime Demo of ROCKET Transform.

First, load the required packages.

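import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sktime.datasets import load_arrow_head  # univariate dataset
# Import path for the sktime release used here; newer releases expose
# Rocket from sktime.transformations.panel.rocket instead.
from sktime.transformers.series_as_features.rocket import Rocket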

Next, set up the training and test data. In this case, I use the univariate ArrowHead series dataset for convenience. The Rocket transform can also be applied to multivariate data.

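# Note: the archive's "test" split (175 series) is used for training and its
# "train" split (36 series) for testing, which matches the shapes printed below.
X_train, y_train = load_arrow_head(split="test", return_X_y=True)
X_test, y_test = load_arrow_head(split="train", return_X_y=True)
print(X_train.shape, X_test.shape)
>> (175, 1) (36, 1)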

Transform the training data using the Rocket transform. By default, ROCKET uses 10,000 kernels. In general, more kernels result in higher classification accuracy; however, there is a trade-off between increased accuracy and computation time. Even with a large number of kernels, ROCKET is still very fast.

rocket = Rocket(num_kernels=10_000, random_state=111)
rocket.fit(X_train)
X_train_transform = rocket.transform(X_train)
X_train_transform.shape
>> (175, 20000)

Initialize and train a linear classifier from scikit-learn. The authors of sktime recommend using RidgeClassifierCV for smaller datasets (<20k training examples). For larger datasets, use logistic regression trained with stochastic gradient descent, SGDClassifier(loss='log') (spelled loss='log_loss' in scikit-learn 1.1 and later).

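# normalize=True works with the scikit-learn release used here; the argument was
# removed in scikit-learn 1.2, where features need to be scaled separately.
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10), normalize=True)
classifier.fit(X_train_transform, y_train)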

Finally, to score the trained model and generate predictions, transform the test data using Rocket and call the trained model.

X_test_transform = rocket.transform(X_test)
classifier.score(X_test_transform, y_test)
>> 0.9167
y_pred = classifier.predict(X_test_transform)  # predicted class labels

Citation

Dempster, A., Petitjean, F. & Webb, G.I. ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Disc 34, 1454–1495 (2020). https://doi.org/10.1007/s10618-020-00701-z
