Friday, 27 January 2012

Unsupervised learning for NIALM

I've recently been thinking about various training methods for NIALM systems, specifically those which can be applied to unlabelled aggregate power demand data sampled once per minute (or less frequently). Assuming no prior information about the appliances or their usage patterns, this clearly falls into the category of unsupervised learning.

In unsupervised learning, the goal is often to determine the unknown structure of unlabelled data. However, in our case we don't simply want to construct a model which represents the aggregate power data. In fact, we want to build a model of the data in which appliances are explicitly represented. This way, once the learning process is complete, we can form the disaggregation task as an inference problem.

Previous unsupervised approaches to this problem have used clustering to identify unique behaviour of appliances. These approaches have been shown to work well when applied to multiple features extracted from high granularity data (sampled at kHz). However, in the case of low granularity data, there is no way to extract features such as reactive power, power factor, etc., and we are instead left with a single feature: (real) power.

To give a visual representation of how clustering might perform on real aggregate data sampled at 1 minute intervals, I ran some experiments on the REDD dataset. To do so, I did the following:

  1. Downsampled all data to 1 minute resolution
  2. Subtracted the power of each circuit from the household mains circuit to calculate the unallocated, or 'unknown', power
  3. Calculated the difference between consecutive power readings for each circuit
  4. Excluded any change in power less than 100 W
  5. Counted the power differences into bins for each circuit
  6. Plotted these bins as a stacked bar graph for each household
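
As a rough sketch of steps 1 to 5 above (not the exact code used for these plots), the processing might look something like the following. The function names, the toy readings, the 100 W bin width, averaging within each minute, and applying the 100 W threshold to the magnitude of the change are all assumptions made for illustration. Plotting (step 6) is left out.

```python
from collections import Counter, defaultdict

MIN_CHANGE_W = 100  # threshold from step 4 (assumed to apply to the magnitude)
BIN_WIDTH_W = 100   # assumed bin width; not stated in the post


def downsample_to_minutes(readings):
    """Step 1: average (timestamp_seconds, watts) readings within each minute."""
    buckets = defaultdict(list)
    for timestamp, watts in readings:
        buckets[int(timestamp // 60)].append(watts)
    return {minute: sum(v) / len(v) for minute, v in sorted(buckets.items())}


def unknown_power(mains, circuits):
    """Step 2: mains power minus the sum of all sub-metered circuits."""
    return {
        minute: total - sum(c.get(minute, 0.0) for c in circuits.values())
        for minute, total in mains.items()
    }


def binned_changes(series):
    """Steps 3 to 5: difference consecutive readings, drop small changes, bin."""
    values = [series[m] for m in sorted(series)]  # gaps in the data are ignored
    counts = Counter()
    for previous, current in zip(values, values[1:]):
        change = current - previous
        if abs(change) >= MIN_CHANGE_W:
            # Label each bin by its lower edge, e.g. 1500 covers [1500, 1600).
            counts[int(change // BIN_WIDTH_W) * BIN_WIDTH_W] += 1
    return counts


# Toy readings, invented purely to exercise the functions (not REDD data).
raw_mains = [(0, 200.0), (30, 220.0), (60, 1710.0), (120, 1700.0), (180, 210.0)]
raw_microwave = [(0, 0.0), (60, 1500.0), (120, 1500.0), (180, 0.0)]

mains = downsample_to_minutes(raw_mains)
circuits = {"microwave": downsample_to_minutes(raw_microwave)}
circuits["unknown"] = unknown_power(mains, circuits)

histograms = {name: binned_changes(series) for name, series in circuits.items()}
```

Here the microwave's switch-on and switch-off events land in the +1500 W and -1500 W bins, while the small fluctuations in the 'unknown' circuit fall below the threshold and are dropped.
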
As an example, here's the chart for house 1:

There are two key points to take from this plot:
  1. There are two unique clusters at the higher end of the power axis (labelled washer dryer and oven, I think). These clusters would be easily identified by a clustering algorithm due to their clear separation from the other appliances.
  2. There are two clusters around the 1500 W mark (corresponding to the microwave and kitchen outlets, I think). One cluster completely subsumes the other, making it very difficult or even impossible for a clustering algorithm to separate the two.
This is just one example, and although the appliances and their usage will be different across houses, I believe this trend will continue. There are always likely to be appliances with high power demands that are easily clustered. However, for appliances with lower power demands, the corresponding clusters are increasingly likely to overlap.
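
One way to put a number on how much two clusters overlap is to compare the binned counts of two circuits directly. The sketch below uses the overlap coefficient, which is my choice of measure for illustration rather than anything used to produce the plots, and the bin counts are invented, not taken from REDD.

```python
def overlap_coefficient(hist_a, hist_b):
    """Shared count mass divided by the smaller histogram's total (0 to 1)."""
    bins = set(hist_a) | set(hist_b)
    shared = sum(min(hist_a.get(b, 0), hist_b.get(b, 0)) for b in bins)
    smaller = min(sum(hist_a.values()), sum(hist_b.values()))
    return shared / smaller if smaller else 0.0


# Hypothetical bin counts, keyed by the bin's lower edge in watts.
kitchen_outlets = {1300: 4, 1400: 9, 1500: 12, 1600: 8, 1700: 3}
microwave = {1400: 5, 1500: 10, 1600: 4}
oven = {4000: 6, 4100: 7}

subsumed = overlap_coefficient(kitchen_outlets, microwave)  # 1.0
separated = overlap_coefficient(kitchen_outlets, oven)      # 0.0
```

A value of 1.0 means the smaller cluster lies entirely inside the larger one, so power alone gives a clustering algorithm nothing to tell them apart. A value of 0.0 means the two clusters share no bins at all.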

At first glance this might seem okay, because we're more interested in the appliances that consume the most energy. However, power demand and energy consumption are not always correlated. Power demand is the rate of energy consumption, so an appliance's energy consumption depends on both its power demand and its duration of use. Two examples of appliance types with low power demand but high energy consumption are the refrigerator and lighting. Because these appliances are on for such a long time, their energy consumption might turn out to be similar to, or even greater than, that of the kitchen white goods with the highest power demands.
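
To make the power versus energy distinction concrete, here is a small worked example. The wattages and running times are hypothetical round numbers chosen only to illustrate the arithmetic, not measurements from REDD.

```python
def energy_kwh(power_w, hours_on):
    """Energy in kWh for a constant power draw over a given duration."""
    return power_w * hours_on / 1000.0


# Hypothetical daily usage figures, for illustration only.
oven_kwh = energy_kwh(power_w=3000, hours_on=0.5)   # high power, short use
fridge_kwh = energy_kwh(power_w=150, hours_on=12)   # low power, long running time
```

With these assumed figures the oven uses 1.5 kWh per day and the refrigerator 1.8 kWh, despite drawing a twentieth of the power while running.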

I also generated the graphs for the other 5 houses in the dataset, which I've included below:


  1. Hi Oli,

    Do I understand correctly that here you are looking at classifying each sample individually, and that's why you have power as the only feature?

    I think that considering multiple samples in terms of a time series one could extract more features. Do you think you will look into that with this data set?
    (that of course would have lots of other problems, like overlapping profiles, but that's a different story..)

  2. Hi Enrico,

    Yes, you've hit the nail right on the head. I didn't actually carry out any classification experiments based on only power, but instead did this exercise more to highlight what is theoretically possible using power as a single feature.

    This post was meant more to justify the need to consider temporal features when disaggregating appliance use. Since events generated by the same appliance are clearly dependent, we need to use models (such as Markov chains) which represent this dependence.

  3. I found this useful - thanks for sharing this with me. As a lawyer researching smart meters and privacy, your graphs have been particularly illuminating!