Thursday, 20 June 2013

Comparison of public disaggregation data sets

Over the last year I've maintained a list of the public data sets which have been useful in my disaggregation research. However, I've found it's still quite time-consuming to compare the finer details of the data sets, and I often end up trawling through papers or sifting through the data itself. For this reason, I've attempted to build a table which allows easy comparison between important attributes of each data set. As always, please leave a comment if you notice any errors!

| Data set | Institution | Location | Duration | # houses | # sub-meters per house | Features | Resolution |
|---|---|---|---|---|---|---|---|
| REDD | MIT | Boston, MA, USA | 3-19 days | 6 | 9-24 | V, P aggregate; P sub-metered | 15 kHz aggregate; 3 second sub-metered |
| BLUED | CMU | Pittsburgh, PA, USA | 8 days | 1 | 0 (manual switch event labels available) | I, V | 12 kHz |
| Smart* | UMass | Western Massachusetts, MA, USA | 3 months | 1 | 25 circuits, 29 appliance monitors | P, S circuits; P appliance monitors | 1 second circuits; various appliance monitors |
| Tracebase | Darmstadt | Germany | N/A | N/A | N/A | P | 1-10 second |
| Sample data set | Pecan Street | Austin, TX, USA | 7 days | 10 | 12 | S | 1 minute |
| IHEPCDS | EDF R&D | France | 4 years | 1 | 3 | I, V, P, Q | 1 minute |
| HES | UK DECC | UK | 1 month - 1 year | 250 | 13-51 | P | 2 minute |
| AMPds | Simon Fraser University | Greater Vancouver, BC, Canada | 1 year | 1 | 19 | I, V, pf, F, P, Q, S | 1 minute |

N.B. I don't maintain this table of comparison, so please see this post for an up-to-date list!

Sunday, 16 June 2013

AMPds data set released

Stephen Makonin recently released the first version of the Almanac of Minutely Power Data set (AMPds). The data set contains 1 minute aggregate meter readings as well as sub-metered readings from 19 individual circuits. Each reading includes measurements of voltage, current, frequency, power factor, real power, reactive power and apparent power. Aggregate gas and water consumption was also measured at 1 minute intervals, along with 1 individual point of use for each utility. The data set covers a single household in the greater Vancouver area, BC, Canada, for an entire year from April 2012 to March 2013. It is available to anyone for free, although the authors ask users to request a username and password so that usage can be tracked.
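The electrical quantities listed above are not independent of one another. As a rough sketch, assuming purely sinusoidal voltage and current (real household loads only approximate this), apparent, real and reactive power can be derived from RMS voltage, RMS current and power factor. The function name and example values are illustrative and are not taken from the data set.

```python
import math


def power_triangle(v_rms, i_rms, power_factor):
    """Return (apparent VA, real W, reactive var), assuming sinusoidal waveforms."""
    apparent = v_rms * i_rms                # S = V * I
    real = apparent * power_factor          # P = S * pf
    # Q = sqrt(S^2 - P^2); max() guards against tiny negative rounding errors
    reactive = math.sqrt(max(apparent ** 2 - real ** 2, 0.0))
    return apparent, real, reactive


# Hypothetical reading: 120 V, 10 A, power factor 0.8
s, p, q = power_triangle(120.0, 10.0, 0.8)
print(round(s, 1), round(p, 1), round(q, 1))
```

A relation like this can be handy as a sanity check when loading any data set that reports several of these features at once.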

The authors of the data set describe the collection process in more detail in the accompanying paper, which also reports benchmark results for a method based on independent time slice combinatorial optimisation:

Stephen Makonin, Fred Popowich, Lyn Bartram, Bob Gill, and Ivan V. Bajic, AMPds: A Public Dataset for Load Disaggregation and Eco-Feedback Research, in Electrical Power and Energy Conference (EPEC), The Annual, pp. 1-6, 2013.
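To give a feel for what combinatorial optimisation over independent time slices means, here is a minimal sketch. It is not the authors' implementation. For each aggregate reading taken on its own, it tries every combination of appliance states and keeps the one whose summed power is closest to the reading. The appliance names and power levels are made-up assumptions.

```python
from itertools import product


def disaggregate_slice(aggregate_w, appliance_states):
    """Choose one power level per appliance so the total is closest to the aggregate."""
    names = list(appliance_states)
    best_combo, best_error = None, float("inf")
    # Exhaustively enumerate every combination of appliance states
    for combo in product(*(appliance_states[name] for name in names)):
        error = abs(aggregate_w - sum(combo))
        if error < best_error:
            best_combo, best_error = combo, error
    return dict(zip(names, best_combo))


# Hypothetical appliance models: the power levels (W) each appliance can take
appliance_states = {
    "fridge": [0, 120],
    "kettle": [0, 2000],
    "oven": [0, 1200, 2400],
}

# Each reading is treated independently of its neighbours
readings = [125, 2110, 3300, 0]
estimates = [disaggregate_slice(r, appliance_states) for r in readings]
for reading, estimate in zip(readings, estimates):
    print(reading, estimate)
```

The search space grows exponentially with the number of appliances, and because every slice is solved in isolation, nothing prevents the estimate from flipping between combinations with similar totals from one reading to the next.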

Wednesday, 12 June 2013

Appliances are not as smart as you think

Designing suitable energy disaggregation algorithms requires a fair amount of knowledge about household appliances. It's important to understand which features make it possible to differentiate between appliances, such as average power demand or duration of use. However, collecting suitably varied appliance data takes a lot of time and resources, while using existing data sets often neglects the study of how the appliances were actually used. This post collects a few examples of appliances which have led me to conclude that many appliances are really not that smart.

Refrigerator


Originally, I expected a fridge's power demand to vary depending on its temperature set point, i.e. the cooler you set the temperature, the more power required to maintain that temperature. In fact, a fridge's cooling motor is simply switched on or off by a thermostat, and as a result always draws the same level of power while it is on. A lower set point therefore doesn't change the power level; it just means the fridge spends a larger proportion of each day cooling as opposed to idle.
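The thermostat behaviour can be illustrated with a toy simulation. This is only a sketch under strong assumptions: a single lumped temperature, heat leaking in at a rate proportional to the difference from ambient, a fixed cooling rate, and a simple hysteresis band. None of the parameter values are measured. The point is that the power level while running never changes, whereas the fraction of time spent running does.

```python
def simulate_fridge(set_point_c, minutes=24 * 60, ambient_c=20.0,
                    power_on_w=100.0, hysteresis_c=1.0,
                    leak_per_min=0.01, cooling_c_per_min=0.3):
    """Return a per-minute power trace (W) for a thermostatically controlled fridge."""
    temp = set_point_c
    compressor_on = False
    trace = []
    for _ in range(minutes):
        # Thermostat with hysteresis: switch on when too warm, off when too cold
        if temp > set_point_c + hysteresis_c:
            compressor_on = True
        elif temp < set_point_c - hysteresis_c:
            compressor_on = False
        # Heat leaks in from the room at a rate proportional to the difference
        temp += leak_per_min * (ambient_c - temp)
        # The compressor removes heat at a fixed rate while it runs
        if compressor_on:
            temp -= cooling_c_per_min
        trace.append(power_on_w if compressor_on else 0.0)
    return trace


def duty_cycle(trace):
    """Fraction of time steps during which the compressor was drawing power."""
    return sum(1 for p in trace if p > 0) / len(trace)


for set_point in (5.0, 2.0):
    trace = simulate_fridge(set_point)
    print(set_point, sorted(set(trace)), round(duty_cycle(trace), 2))
```

In this toy model the set of observed power levels is the same for both set points, while the colder set point gives the higher duty cycle.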

Kettle


Similar to the fridge, I expected the power demand of a kettle to vary depending on the amount of water in the kettle. However, the power demand is always constant while it's boiling, and it's actually the time taken to boil the water which increases with the volume of water.
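A back-of-the-envelope calculation shows why. Assuming a constant heating power and no heat losses (both simplifications), the energy needed is mass times specific heat capacity times temperature rise, and the boiling time is that energy divided by the power. The 3 kW rating and 20 °C starting temperature below are assumed values for illustration.

```python
SPECIFIC_HEAT_WATER = 4186.0  # J/(kg*K), approximate value for liquid water


def boil_time_seconds(litres, power_w=3000.0, start_c=20.0, boil_c=100.0):
    """Estimate the time to boil, assuming constant power and no heat loss."""
    mass_kg = litres * 1.0  # roughly 1 kg per litre of water
    energy_j = mass_kg * SPECIFIC_HEAT_WATER * (boil_c - start_c)
    return energy_j / power_w


# Power stays fixed at the assumed 3000 W; only the duration changes
for litres in (0.5, 1.0, 1.5):
    print(litres, round(boil_time_seconds(litres)))
```

Under these assumptions the duration scales linearly with the volume of water, which is the feature a disaggregation algorithm would actually see vary.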

Oven


Following the same trend, electric ovens are also thermostatically controlled. As a result, they draw the same power irrespective of the temperature set point. This means the heating element spends more time heating for higher temperatures, and leaves longer gaps between heating periods for lower temperatures.

Others


Spotted the trend yet? Other examples in this category include air conditioners, microwaves, electric hobs and irons. This is in addition to the even dumber category of completely manually controlled appliances, such as a lamp or fan.

Exceptions


Unfortunately, not all appliances are this dumb. Some appliances have a power demand that slowly ramps up or down at the start or end of their usage. One appliance which is quite the opposite of those described above is the plasma television. These appliances draw power in proportion to the brightness of the screen, with a fully white screen requiring the maximum power draw and a black screen requiring the minimum power draw. Such appliances are not only hard to disaggregate, but also significantly complicate the process of disaggregating other simpler appliances.

Thursday, 6 June 2013

Data set collected by UK EST, DECC and DEFRA

In 2012, the UK Energy Saving Trust, Department of Energy and Climate Change, and Department for Environment, Food and Rural Affairs published a 15 page report called Powering the Nation. This report summarises the full 600 page Household Electricity Use Study, which aimed to better understand how electricity is consumed in UK households. As part of this study, 251 owner-occupier households were monitored across England between April 2010 and April 2011. Of these households, 26 were monitored for 12 months, and 225 were monitored for 1 month. For each household, the energy consumption of 13-51 appliances was monitored at 2 minute intervals. A software portal is currently under development to provide access to the data set, although in the meantime the data can be requested individually from ICF International by contacting efficient.products@icfi.com.

Wednesday, 29 May 2013

Data set released by EDF Energy

I've only just come across this data set, despite it being released almost a year ago! I've also updated my post of public data sets.

EDF Energy released a data set in 2012 containing energy measurements made at a single household in France over a period of 4 years. Average measurements are available at 1 minute resolution of the household aggregate active power, reactive power, voltage and current, as well as the active power of 3 sub-metered circuits. Although each circuit contains a few appliances rather than just one, this is the largest data set in terms of duration of measurement. The complete data set is openly available from the UCI Machine Learning Repository.

Saturday, 25 May 2013

The pros and cons of using HMMs to model appliances

In the last few years, hidden Markov models (HMMs) have become a very popular mathematical representation for appliances (Zia et al. 2010, Kim et al. 2011, Kolter et al. 2012, Parson et al. 2012). As a result, I'm often asked whether I think HMMs are the future of disaggregation. However, I've yet to find an objective analysis of the advantages and disadvantages of such approaches, which is why I've done my best to list them here:

Advantages


  • The HMM is a well-studied probabilistic graphical model, for which algorithms are known for exact and approximate learning and inference
  • HMMs are able to represent the variance of appliances' power demands through probability distributions
  • HMMs capture the dependencies between consecutive measurements, which Hart defined as the switch continuity principle
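As a concrete illustration of these points, here is a minimal sketch of a two-state appliance HMM with Gaussian emissions, decoded with the Viterbi algorithm (a standard exact inference method for HMMs). All parameter values are invented for the example and are not taken from any of the cited papers.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(observations, states, log_start, log_trans, emission):
    """Return the most likely hidden state sequence for the observations."""
    # scores[s] = best log-probability of any path ending in state s
    scores = {s: log_start[s] + gaussian_log_pdf(observations[0], *emission[s])
              for s in states}
    back_pointers = []
    for x in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor, taking the transition probability into account
            prev = max(states, key=lambda r: scores[r] + log_trans[r][s])
            new_scores[s] = (scores[prev] + log_trans[prev][s]
                             + gaussian_log_pdf(x, *emission[s]))
            pointers[s] = prev
        scores = new_scores
        back_pointers.append(pointers)
    # Trace back from the best final state
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical fridge-like appliance: all parameters are assumptions
states = ["off", "on"]
log_start = {"off": math.log(0.9), "on": math.log(0.1)}
log_trans = {
    "off": {"off": math.log(0.95), "on": math.log(0.05)},
    "on": {"off": math.log(0.10), "on": math.log(0.90)},
}
emission = {"off": (2.0, 5.0), "on": (120.0, 10.0)}  # (mean W, std W)

readings = [1.0, 3.0, 118.0, 125.0, 60.0, 121.0, 4.0, 0.0]
path = viterbi(readings, states, log_start, log_trans, emission)
print(path)
```

The emission distributions absorb the noise around each state's typical power, while the high self-transition probabilities encode the expectation that an appliance usually stays in the same state from one reading to the next.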

Disadvantages


  • HMMs represent the behaviour of an appliance using a finite number of static distributions, and therefore fail to represent appliances with a continuously varying power demand
  • Due to their Markovian nature, they do not take into account the sequence of states leading into any given state
  • Again, due to their Markovian nature, the time spent in a given state is not captured explicitly. However, the hidden semi-Markov model does capture such behaviour
  • Features other than the observed power demand are not captured (e.g. time of day). However, the input-output HMM allows such additional features to be modelled
  • Any dependency between appliances cannot be represented. However, the conditional-HMM can capture such dependencies
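The point about state durations can be made concrete. In a basic HMM, the number of consecutive time slices spent in a state is implicitly geometrically distributed, governed only by that state's self-transition probability. The short sketch below (function names are illustrative) computes the implied distribution and its mean.

```python
def duration_pmf(self_transition_prob, d):
    """Probability of staying in a state for exactly d consecutive time slices."""
    # Stay for d - 1 transitions, then leave: a geometric distribution
    return self_transition_prob ** (d - 1) * (1.0 - self_transition_prob)


def expected_duration(self_transition_prob):
    """Mean number of consecutive time slices spent in the state."""
    return 1.0 / (1.0 - self_transition_prob)


a = 0.9  # assumed self-transition probability of an "on" state
print(expected_duration(a))
print([round(duration_pmf(a, d), 3) for d in (1, 5, 10, 20)])
```

Whatever the self-transition probability, the single most likely duration is always one time slice, which is a poor fit for appliances such as kettles whose cycles have a characteristic length. This is the behaviour the hidden semi-Markov model addresses by modelling durations explicitly.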

In summary, the basic HMM provides a useful model for many appliances. However, the appliances it can represent are limited by the intrinsic structure of the model. Many extensions exist that increase the representational power of the HMM, although the additional parameters required often complicate the learning and inference tasks.

Wednesday, 22 May 2013

AAAI 2012 Code Release

A while ago, I wrote a post stating that I was planning to release my NIALM code at the end of my PhD. I also mentioned in the post that I'd been happily giving out an archive of my code upon request. Since then, I've had far more requests than I'd expected, as well as quite a few technical questions regarding how to run it. As a result, I've decided to make the code from my AAAI 2012 paper available via my GitHub for anyone to clone or contribute to.

The reason why I hadn't previously uploaded my code is that I simply do not have time to provide documentation or tutorials for using my code. Therefore, my code is provided "as is", so apologies in advance if you don't find it easy to use!

Update 15.09.2015: updated link to point to GitHub