Savonia Article Pro: The Importance of Time-Based Cyclic Features and Lag Features for Time Series Data

Savonia Article Pro is a collection of multidisciplinary Savonia expertise on various topics.

This work is licensed under CC BY-SA 4.0.

Introduction

Time series data is central to many fields, including finance, economics, energy, and weather forecasting. Lag features and time-based cyclic features help extract meaningful insights from such data and improve forecasting accuracy. They do so by exposing the underlying patterns and dependencies in the series to AI/ML models, which often improves model performance.

Understanding Time-Based Cyclic Features

Time-based cyclic features describe patterns that repeat at regular intervals, such as hourly, daily, weekly, monthly, or yearly cycles. For example, retail sales often follow a weekly pattern with higher sales on weekends, temperature data has a clear yearly cycle, and the electricity consumption of a school has a daily cycle.

Cyclic features play a crucial role in time series analysis. They help in understanding periodic patterns and trends. Analysts can predict future values more accurately by identifying these cycles. For example, in stock market analysis, recognizing monthly or quarterly trends can be vital for making investment decisions.
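One way to identify such a cycle is to look for the strongest component of the discrete Fourier transform. The sketch below is a minimal illustration using only the Python standard library. The function name dominant_period and the synthetic hourly signal are assumptions made for this example and are not taken from any project data.

```python
import cmath
import math


def dominant_period(values):
    """Return the period (in samples) of the strongest non-constant DFT component."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the constant (zero-frequency) part

    best_k, best_power = 1, -1.0
    # For a real-valued signal only frequencies 1..n//2 are distinct.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Hypothetical hourly signal: 7 days with a 24-hour cycle.
hourly = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24 * 7)]
period = dominant_period(hourly)
```

This direct computation is quadratic in the series length, so it only suits short series. For longer data, a fast Fourier transform from a numerical library would be the usual choice.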

Feature Engineering for Cyclic Patterns

Feature engineering involves creating new features from the existing data to enhance the model. For cyclic patterns, common techniques include:

Fourier Transforms: Used to convert time-domain signals into frequency-domain to identify periodic components.

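Sine and Cosine Transformations: Map a cyclic value such as the hour of the day or the month onto a circle, so that the end of a cycle sits next to its start (for example, hour 23 next to hour 0).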

Dummy Variables: Representing different time periods such as months or days of the week.
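The sine/cosine and dummy-variable encodings listed above can be sketched as follows. This is a minimal illustration with the Python standard library. The helper names encode_cyclic, one_hot, and distance, and the convention that Monday is day 0, are assumptions made for this example.

```python
import math


def encode_cyclic(value, period):
    """Map a cyclic value (e.g. hour 0-23) to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def one_hot(index, size):
    """Dummy-variable encoding: a list of zeros with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


h23 = encode_cyclic(23, 24)
h0 = encode_cyclic(0, 24)
h1 = encode_cyclic(1, 24)

# On the circle, 23:00 is as close to 00:00 as 01:00 is,
# whereas the raw hour values 23 and 0 are 23 units apart.
gap_before_midnight = distance(h23, h0)
gap_after_midnight = distance(h0, h1)

# Day of week as dummy variables, assuming Monday = 0, so Saturday = 5.
saturday = one_hot(5, 7)
```

Both the sine and the cosine are needed: a single sine value cannot distinguish, for example, 03:00 from 09:00, because they share the same sine.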

Understanding Lag Features

Lag features, also known as lagged variables, are past values of the time series used as predictors in the model. For example, when forecasting sales for the next month, the sales figures from previous months serve as lag features.

Lag features are fundamental in time series analysis. They capture the temporal dependencies and autocorrelations within the data. By incorporating lag features, models can understand how past values influence future outcomes. This is particularly useful in scenarios where the past significantly impacts the future, such as in weather forecasting and economic indicators.

Feature Engineering for Lag Features

Creating lag features involves shifting the time series data by a specific number of time steps. Common techniques include:

Rolling Window Calculations: Involves computing statistics such as mean or sum over a rolling window of past observations.

Autoregressive (AR) Terms: Using past values of the same series as predictors in the model.

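Moving Averages (MA): Including the average of past observations as a feature. This is a smoothing feature and differs from the MA terms of ARMA models, which are built from past forecast errors.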
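The shifting and rolling-window techniques above can be sketched in plain Python as follows. The function names lag and rolling_mean and the sample series are assumptions made for this example. In practice, a data-frame library would typically handle the shifting.

```python
def lag(series, k):
    """Series shifted by k steps; the first k entries have no history (None)."""
    n = len(series)
    return [None] * min(k, n) + list(series[: max(n - k, 0)])


def rolling_mean(series, window):
    """Mean of the `window` values strictly before each position.

    The current value is excluded so the feature uses past data only.
    """
    out = []
    for t in range(len(series)):
        if t < window:
            out.append(None)  # not enough history yet
        else:
            past = series[t - window : t]
            out.append(sum(past) / window)
    return out


# Hypothetical observations, oldest first.
series = [10, 12, 11, 15, 14, 18]

lag1 = lag(series, 1)
lag2 = lag(series, 2)
roll3 = rolling_mean(series, 3)

# Keep only time steps where every feature is available.
# Each row is (lag 1, lag 2, rolling mean of 3, target).
rows = [
    (lag1[t], lag2[t], roll3[t], series[t])
    for t in range(len(series))
    if None not in (lag1[t], lag2[t], roll3[t])
]
```

Excluding the current value from the rolling window is a deliberate choice: a window that includes the value being predicted would leak the target into its own features.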

It is beneficial to combine both cyclic and lag features for optimal performance in time series forecasting. This approach ensures that the model captures both periodic patterns and temporal dependencies. For example, in electricity demand forecasting, models can leverage daily cycles, weekly patterns, and past demand data to make precise predictions.

Practical Application

In practical applications, combining these features involves:

Identifying significant cycles and creating corresponding cyclic features.

Determining relevant lag periods and generating lag features.

Normalizing features to ensure a consistent scale for the model.

Training the model using a combination of these engineered features.
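The steps above can be sketched end to end as follows, stopping just before model training. This is a minimal illustration under stated assumptions: the series is hourly and starts at hour 0, the lags of 1 and 24 hours and the 80/20 split are arbitrary choices, and the names build_rows, fit_scaler, and transform are hypothetical helpers, not part of any library.

```python
import math
import statistics


def build_rows(values, lags=(1, 24), period=24):
    """Return (features, target) pairs combining cyclic and lag features."""
    rows = []
    for t in range(max(lags), len(values)):
        # Assumes index t is an hour count starting at hour 0.
        angle = 2 * math.pi * (t % period) / period
        features = [math.sin(angle), math.cos(angle)]
        features += [values[t - k] for k in lags]
        rows.append((features, values[t]))
    return rows


def fit_scaler(feature_rows):
    """Per-column mean and standard deviation, computed on training rows only."""
    columns = list(zip(*feature_rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return means, stds


def transform(feature_rows, means, stds):
    """Standardize each column with the given statistics."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in feature_rows
    ]


# Hypothetical hourly demand: a daily cycle plus a slow upward drift.
values = [50 + 10 * math.sin(2 * math.pi * t / 24) + 0.05 * t for t in range(24 * 10)]

rows = build_rows(values)

# Chronological split: the test set lies strictly after the training set.
split = int(len(rows) * 0.8)
train, test = rows[:split], rows[split:]

means, stds = fit_scaler([f for f, _ in train])
X_train = transform([f for f, _ in train], means, stds)
X_test = transform([f for f, _ in test], means, stds)
y_train = [y for _, y in train]
y_test = [y for _, y in test]
```

The scaler is fitted on the training rows only and then reused for the test rows, so no information from the future leaks into training. The resulting X_train and y_train can be passed to whichever regression model is being used.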

ÄLLITÄ Project

In the ÄLLITÄ project, we use lag and time-based cyclic features in our prediction algorithms and obtain more reliable and accurate results. The two plots below show the predictions before and after adding these features. Introducing them reduced the mean absolute error (MAE) from 4.66 to 2.16, a drop of roughly 54%.
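For reference, MAE is the average of the absolute differences between actual and predicted values. A minimal sketch follows. The numbers in it are hypothetical and are not data from the project.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values for illustration only.
actual = [20.0, 22.0, 19.0, 25.0]
predicted = [18.0, 23.0, 21.0, 24.0]

mae = mean_absolute_error(actual, predicted)  # (2 + 1 + 2 + 1) / 4 = 1.5
```

MAE is expressed in the same unit as the predicted quantity, so a lower value means predictions are closer to the measurements on average.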

Figure 1 Before using lag & time-based features

Figure 2 After using lag & time-based features

The plots above show a considerable improvement when lag and time-based features are used in our prediction models for time series data.

Conclusion

Including time-based cyclic features and lag features in time series analysis is an effective way to enhance model performance. These features capture underlying patterns and dependencies, which leads to more accurate forecasts. In the ÄLLITÄ project they are used for energy forecasting, where they improved the accuracy of the models. By understanding and applying these techniques, analysts can significantly improve their predictive models and make more informed decisions.

Authors:

Shahbaz Baig, RDI Specialist, DigiCenter, Savonia-ammattikorkeakoulu, shahbaz.baig@savonia.fi

Premton Canamusa, RDI Specialist, DigiCenter, Savonia-ammattikorkeakoulu, premton.canamusa@savonia.fi

Mika Leskinen, RDI Specialist, DigiCenter, Savonia-ammattikorkeakoulu, mika.leskinen@savonia.fi

Aki Happonen, Digital Development Manager, DigiCenter, Savonia-ammattikorkeakoulu, aki.happonen@savonia.fi

Laura Leppänen, RDI Specialist, Savonia-ammattikorkeakoulu Oy, laura.leppanen@savonia.fi