AIT Hestia Task Consumer profiling

Hestia Task 3.3: Consumer behaviour profiling service

Hestia Task 3.3 aims to deploy the infrastructure for the profiling of user consumption behaviour. This profiling will
- enable devising a suitable consumer engagement strategy, and
- serve as the basis for predicting the household consumption behaviour on the next day.

This webpage has two functions:

- It dives into the pilot data (grouped on the apartment level) to assess the quality and readiness of the data for machine learning tasks.
- It demonstrates the kinds of patterns that can be extracted from raw household electricity consumption time series.

The menu bar at the top has two main groups:

The Data menu links to interactive dashboards for exploring apartment data from the Dutch and Italian pilots, as well as weather data from all the pilots. These dashboards give an indication of the quality and completeness of the collected data, both at a glance and per household.

The Clustering menu links to four different cluster analysis variants that differ as follows (detailed explanation below):
- Variant 1: household-days are normalized and there are 6 clusters.
- Variant 2: household-days are normalized and there are 12 clusters.
- Variant 3: households are normalized and there are 6 clusters.
- Variant 4: households are normalized and there are 12 clusters.

The clustering is carried out on a collection of items which we call household-days. The household-days arise from chopping up the single time series of each household into individual days, so that each 24-hour piece belongs to exactly one household on one day.
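
The chopping step can be sketched in a few lines of Python. This is only an illustration, not the code used for the analyses on this site: the function name `split_into_household_days`, the tuple layout `(household_id, timestamp, value)` and the hourly sample data are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_household_days(readings):
    """Group (household_id, timestamp, value) readings into household-days.

    Returns a dict mapping (household_id, date) to the list of values
    recorded on that date, in chronological order.
    """
    days = defaultdict(list)
    # Sorting orders the readings by household first, then by timestamp.
    for household_id, timestamp, value in sorted(readings):
        days[(household_id, timestamp.date())].append(value)
    return dict(days)


# Hypothetical data: two households with 48 hourly readings each.
start = datetime(2023, 1, 1)
readings = [
    (household, start + timedelta(hours=h), float(h % 24))
    for household in ("A", "B")
    for h in range(48)
]

household_days = split_into_household_days(readings)
print(len(household_days))  # number of household-days
```

In real pilot data some days will be incomplete, so a practical version would probably also drop or repair household-days that do not contain the expected number of readings.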

The number of clusters corresponds to the number of different behaviour groups that the algorithm will try to assign daily consumptions to. There exist automatic methods for determining an optimal number of clusters for a particular dataset, but depending on our needs, we may want more or fewer clusters than the “optimal” number. We opted to present the results for two fixed numbers of clusters to illustrate the differences that may arise.
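
To illustrate how a fixed number of clusters enters the analysis, here is a minimal sketch using plain k-means (Lloyd's algorithm). This page does not state which clustering algorithm is used, so k-means is an assumption chosen for its simplicity. The function `kmeans`, the four-value "days" and the cluster counts 2 and 3 are likewise hypothetical stand-ins for real 24-hour household-days and the 6 and 12 clusters of the variants.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means.

    Returns (centroids, labels), where labels come from the last
    assignment step.
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical normalized "days" with 4 values each:
# three morning-heavy profiles and three evening-heavy profiles.
points = [
    [2.0, 1.0, 1.0, 0.0], [2.1, 0.9, 1.0, 0.0], [1.9, 1.1, 1.0, 0.0],
    [0.0, 1.0, 1.0, 2.0], [0.0, 1.0, 0.9, 2.1], [0.0, 1.0, 1.1, 1.9],
]

centroids_2, labels_2 = kmeans(points, k=2)
centroids_3, labels_3 = kmeans(points, k=3)
print(labels_2)
print(labels_3)
```

With two clusters the morning-heavy and evening-heavy days end up in separate groups; asking for three clusters forces the algorithm to subdivide further, which is the kind of difference the 6-cluster and 12-cluster variants are meant to show.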

Normalization is a common pre-processing step carried out before clustering. It aims at making the data more similar in a controlled way. The two normalization types mentioned above differ as follows:
- In the first type, we treat every household-day separately: we first calculate the mean of all of its values, and then divide the values by that mean. This way, every household-day has the same mean, so the clustering focuses solely on differences in the pattern.
- The second normalization type is simpler: we calculate the mean of all values belonging to a household, divide the values by that mean, and only then chop up the time series into household-days. This way, every household has the same mean, but household-days generally have different means, so the clustering may identify different patterns but also different overall levels of consumption.
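
The difference between the two normalization types can be made concrete with a small sketch. The function names and the sample values are hypothetical, and days are shortened to four values for readability. For the second type, the sketch divides already-chopped days by the household mean, which gives the same numbers as normalizing first and chopping afterwards, as long as every value of the household ends up in some household-day. Both functions assume non-zero means.

```python
from statistics import mean


def normalize_per_household_day(household_days):
    """Type 1: divide each household-day by its own mean."""
    normalized = {}
    for key, values in household_days.items():
        day_mean = mean(values)
        normalized[key] = [v / day_mean for v in values]
    return normalized


def normalize_per_household(household_days):
    """Type 2: divide each value by the mean over its whole household."""
    values_by_household = {}
    for (household, _day), values in household_days.items():
        values_by_household.setdefault(household, []).extend(values)
    household_means = {
        household: mean(values)
        for household, values in values_by_household.items()
    }
    return {
        key: [v / household_means[key[0]] for v in values]
        for key, values in household_days.items()
    }


# Hypothetical household "A": same daily pattern, but day 2 uses twice as much.
household_days = {
    ("A", "day1"): [1.0, 1.0, 2.0, 4.0],
    ("A", "day2"): [2.0, 2.0, 4.0, 8.0],
}

type_1 = normalize_per_household_day(household_days)
type_2 = normalize_per_household(household_days)

print(type_1[("A", "day1")] == type_1[("A", "day2")])  # pattern only
print(mean(type_2[("A", "day1")]), mean(type_2[("A", "day2")]))  # levels differ
```

After the first normalization the two days are indistinguishable, because only the shape of the pattern is left. After the second normalization the higher-consumption day keeps a mean twice as large as the other, so a clustering can still separate the two days by level.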

Our Team

Milos Sipetic

Jan Kurzidim

Adam Buruzs

AIT Austrian Institute of Technology GmbH
Sustainable Thermal Energy Systems
Center for Energy
Giefinggasse 2 | 1210 Vienna | Austria