Anomaly Detection in a Time Series

September 21, 2021

Anomaly detection can be applied to a time series by creating a baseline model and measuring how far each observation deviates from that baseline. If the deviation is large enough, the observation is deemed abnormal and is flagged.

In general, novelty and outlier detection does not tell us why something is an outlier; it only flags the observation, and the conditions and causes that led to it have to be investigated separately. For example, when we monitor server logs, anomalous observations may result from an equipment or code failure, or from something malicious like a security breach.

What should your plan of attack be for analysing the time series?

Generally, we start by visualizing the data. After that, our goal is to build the baseline model. We first ask whether there is any drift in the series and, if there is, we remove it from the data set.

The next thing to ask is whether there is any periodicity in the dataset. If there is, we model those periodic components and remove them as well, so that only the residuals are left behind; we can then do further modelling on those residuals. This is the overall plan of attack.
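
As a rough illustration of the drift-removal step described above, the sketch below fits a straight line to a synthetic hourly series and subtracts it. The data, the function name remove_linear_drift, and the choice of a linear trend are assumptions made for this example; they are not the original code behind this article.

```python
import numpy as np

def remove_linear_drift(t, y):
    """Fit y ~ slope * t + intercept by least squares and subtract the fit."""
    slope, intercept = np.polyfit(t, y, deg=1)
    trend = slope * t + intercept
    return y - trend, slope

# Synthetic hourly series (assumed data): upward drift plus a daily cycle.
t = np.arange(24 * 28, dtype=float)  # 28 days of hourly samples
y = 0.05 * t + 3.0 * np.sin(2 * np.pi * t / 24) + 10.0

detrended, slope = remove_linear_drift(t, y)
print(f"estimated drift per hour: {slope:.3f}")
```

What remains in detrended is the daily cycle, which is the kind of periodicity the next step looks for. A real series may need a more flexible trend than a straight line.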

Fourier Analysis

Periodic behaviour can be hard to spot by looking at the time series directly. Fourier analysis lets us identify the dominant frequencies that make up the time series. Here, I have used Fourier analysis code to get an idea of the dominant frequencies.

From the resulting graph, we can see that this time series has four dominant frequencies:

Daily
Twice-daily
Three times a day
Four times a day.

In terms of periods, these correspond to 24, 12, 8, and 6 hours, respectively.
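
The kind of analysis described above can be sketched with NumPy's FFT routines. This is a minimal example on synthetic hourly data built from the four periods mentioned in the text; the function name dominant_periods, the amplitudes, and the noise level are assumptions for illustration only.

```python
import numpy as np

def dominant_periods(y, sample_spacing_hours=1.0, top_k=4):
    """Return the top_k periods (in hours) with the largest spectral amplitude."""
    y = np.asarray(y, dtype=float)
    amplitudes = np.abs(np.fft.rfft(y - y.mean()))
    freqs = np.fft.rfftfreq(len(y), d=sample_spacing_hours)  # cycles per hour
    # Skip the zero-frequency bin, which has no finite period.
    order = np.argsort(amplitudes[1:])[::-1][:top_k] + 1
    return sorted((1.0 / freqs[order]).tolist(), reverse=True)

# Synthetic hourly data (assumed): four cycles with periods 24, 12, 8, and 6 hours.
rng = np.random.default_rng(0)
t = np.arange(24 * 60)  # 60 days of hourly samples
y = sum(a * np.sin(2 * np.pi * t / p) for a, p in [(4, 24), (3, 12), (2, 8), (1.5, 6)])
y = y + rng.normal(scale=0.5, size=t.size)

periods = dominant_periods(y)
print([round(p, 1) for p in periods])
```

On this synthetic series the recovered periods should be close to 24, 12, 8, and 6 hours. On real data the peaks are usually less clean, so plotting the amplitude spectrum is still worthwhile.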

Initial baseline model

Generally, we create an initial baseline model to get a feel for the data and to check whether such a model is adequate for the time series. Before fitting it, we create some custom transformers (if needed) to work with our pandas time-series data. Custom transformers are most often used to generate Fourier components, convert datetime objects into a numeric unit of time, and so on.

After applying the custom transformers and fitting the initial baseline model, we compute the residuals. These reveal whether the time series contains many shock events, for example a sudden increase in energy usage, probably due to sudden and short use of products. The next step is to analyse the residuals for any temporal correlations.
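
To make the idea of a custom transformer concrete, here is a minimal sketch. The class FourierComponents is hypothetical; it follows the fit/transform convention used by scikit-learn but is written with plain NumPy and pandas so that it stays self-contained. The synthetic usage series, the chosen periods, and the ordinary least-squares baseline are all assumptions for this example.

```python
import numpy as np
import pandas as pd

class FourierComponents:
    """Hypothetical transformer: timestamps -> drift and sin/cos columns per period."""

    def __init__(self, periods_hours):
        self.periods_hours = list(periods_hours)

    def fit(self, index):
        self.start_ = index[0]  # reference timestamp for the unit-of-time conversion
        return self

    def transform(self, index):
        # Convert datetime objects into hours elapsed since the first timestamp.
        hours = np.asarray((index - self.start_) / pd.Timedelta(hours=1), dtype=float)
        cols = [np.ones_like(hours), hours]  # intercept and linear drift
        for p in self.periods_hours:
            cols.append(np.sin(2 * np.pi * hours / p))
            cols.append(np.cos(2 * np.pi * hours / p))
        return np.column_stack(cols)

# Synthetic hourly usage (assumed data, not the article's dataset).
index = pd.date_range("2021-01-01", periods=24 * 30, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(1)
hours = np.arange(index.size)
usage = pd.Series(
    50 + 0.01 * hours + 5 * np.sin(2 * np.pi * hours / 24)
    + rng.normal(scale=1.0, size=index.size),
    index=index,
)

features = FourierComponents([24, 12, 8, 6]).fit(index)
X = features.transform(index)
coef, *_ = np.linalg.lstsq(X, usage.to_numpy(), rcond=None)  # ordinary least squares
baseline = pd.Series(X @ coef, index=index)
residuals = usage - baseline
print(f"residual standard deviation: {residuals.std():.2f}")
```

The residuals are what is left once drift and the periodic components have been explained; the rest of the analysis works on them.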

Noise-Based Features

The first thing we want to uncover is how past residuals correlate with current values. An autocorrelation plot tells us whether elements of the time series are positively correlated, negatively correlated, or independent of each other. In short, it reveals the characteristic time scale of the process, which guides us when generating noise-based features.
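
The numbers behind an autocorrelation plot can be computed with pandas' Series.autocorr. The sketch below compares two synthetic residual series, one independent and one where each value depends on the previous one; both series and the helper name autocorrelation are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def autocorrelation(residuals, max_lag):
    """Pearson correlation between the series and itself shifted by each lag."""
    return pd.Series(
        {lag: residuals.autocorr(lag=lag) for lag in range(1, max_lag + 1)}
    )

rng = np.random.default_rng(2)
white = pd.Series(rng.normal(size=2000))  # independent residuals (assumed data)

# AR(1)-style residuals: each value keeps 80% of the previous one plus new noise.
noise = rng.normal(size=2000)
values = np.zeros(2000)
for i in range(1, 2000):
    values[i] = 0.8 * values[i - 1] + noise[i]
persistent = pd.Series(values)

acf_white = autocorrelation(white, max_lag=24)
acf_persistent = autocorrelation(persistent, max_lag=24)
print(acf_white.abs().max(), acf_persistent.loc[1])
```

Autocorrelations near zero at every lag suggest independent residuals, while values that decay slowly with lag indicate a longer characteristic time scale.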

z-Score

Since the residuals show little temporal correlation, we assume that they are independently sampled from the same distribution. Given this probabilistic perspective, we can quantify how anomalous each observation is, provided we know the distribution the residuals are sampled from.

If the distribution has a single peak, values far from the peak are less likely to be observed. The z-score measures how far a value is from the mean, in units of the standard deviation.
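
In formula form, the z-score of a value x is z = (x - mean) / standard deviation. A minimal sketch of flagging by z-score follows; the function names, the toy residuals, and the cutoff of 2.5 are assumptions chosen for illustration.

```python
import numpy as np

def z_scores(residuals):
    """z = (x - mean) / standard deviation, computed over the whole series."""
    residuals = np.asarray(residuals, dtype=float)
    return (residuals - residuals.mean()) / residuals.std()

def flag_anomalies(residuals, cutoff):
    """Boolean mask: True where |z| exceeds the cutoff."""
    return np.abs(z_scores(residuals)) > cutoff

# Toy residuals (assumed data): small fluctuations and one large shock at index 7.
residuals = np.array([0.1, -0.3, 0.2, 0.0, -0.1, 0.3, -0.2, 8.0, 0.1, -0.1])
mask = flag_anomalies(residuals, cutoff=2.5)
print(np.flatnonzero(mask))
```

Note that in a very short series a single extreme value inflates the standard deviation and therefore limits how large its own z-score can become, which is why a cutoff below 3 is used in this toy example.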

How should we decide the appropriate z-score cutoff?

If we set the z-score cutoff high, the range of values treated as normal points (inliers) widens and fewer observations are flagged; if we set it low, that range narrows and more observations are flagged. There is no perfect answer: the right cutoff depends on the precision and recall we want in our analysis.
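
The trade-off can be seen on synthetic data where the true anomalies are known. The sketch below injects 20 shocks into Gaussian noise and reports the number of flagged points, precision, and recall for a few cutoffs. The data, the shock sizes, and the candidate cutoffs are all assumptions; with real data, labelled anomalies are rarely available and the cutoff is often tuned by inspection.

```python
import numpy as np

def precision_recall(flagged, truth):
    """Precision and recall of a boolean flag mask against known anomaly labels."""
    true_pos = np.sum(flagged & truth)
    precision = true_pos / flagged.sum() if flagged.sum() else 1.0
    recall = true_pos / truth.sum()
    return float(precision), float(recall)

# Synthetic residuals (assumed): Gaussian noise with 20 injected shocks.
rng = np.random.default_rng(3)
residuals = rng.normal(size=5000)
truth = np.zeros(5000, dtype=bool)
truth[rng.choice(5000, size=20, replace=False)] = True
residuals[truth] += rng.choice([-1.0, 1.0], size=20) * rng.uniform(3.0, 6.0, size=20)

z = (residuals - residuals.mean()) / residuals.std()
results = {}
for cutoff in (2.0, 3.0, 4.0):
    flagged = np.abs(z) > cutoff
    # Store (number flagged, precision, recall) for this cutoff.
    results[cutoff] = (int(flagged.sum()), *precision_recall(flagged, truth))
    print(cutoff, results[cutoff])
```

Raising the cutoff can only shrink the set of flagged points, so recall never increases, while precision tends to improve because fewer ordinary fluctuations are flagged.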

Rolling z-score

The z-score calculation above relies on the entire time series to compute the mean and standard deviation. For anomaly detection on a time series, however, observations are usually streamed, and the entire series will not be available.

Instead, we can calculate the z-score on a window of observations rather than the whole time history. The advantage of the rolling z-score is that we do not need to hold a large amount of data in memory, and it reflects the fact that recent values are a better reference for judging whether a new observation is uncommon. As a result, the rolling z-score is more adaptive to recent changes in the process.
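
A rolling z-score can be sketched with pandas' rolling windows. In this example each point is compared with the mean and standard deviation of the preceding window only; excluding the current point from its own window, the window length of 48, and the synthetic stream are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def rolling_z_score(series, window):
    """z-score of each point against the mean/std of the preceding `window` points."""
    # shift(1) keeps the current observation out of its own reference window.
    history = series.shift(1).rolling(window=window, min_periods=window)
    # pandas computes the sample standard deviation (ddof=1) here.
    return (series - history.mean()) / history.std()

# Synthetic stream (assumed): the process level jumps from 0 to 5 halfway through.
rng = np.random.default_rng(4)
values = rng.normal(size=600)
values[300:] += 5.0
values[450] += 8.0  # a single shock on top of the new level
stream = pd.Series(values)

rolling_z = rolling_z_score(stream, window=48)
global_z = (stream - stream.mean()) / stream.std()
print(rolling_z.iloc[450], global_z.iloc[450])
```

The first 48 values have no score because a full window of history is not yet available. After the level shift, the rolling version re-centres on the new level within one window, whereas the global z-score keeps mixing both regimes into a single mean and standard deviation.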

Conclusion 

We have seen how anomaly detection is applied to a time series. A careful analysis of each aspect of the time series is necessary. Fourier analysis, an initial baseline model, noise-based features, and the z-score are all valuable when planning the analysis of a time series. Moreover, the approach to choosing a z-score cutoff carries over to practical anomaly detection on time series.
