Wednesday, 31 July 2013

The academic reality gap

Over the past three years I've read a lot of academic papers on the topic of energy disaggregation. What frustrates me the most is some of the assumptions they make. Here are some of the most common:

Simulated houses


In the absence of actual household aggregate or individual appliance power data, some researchers test their disaggregation algorithms using synthetic data. While this might be useful for simulating a range of appliances and households, it's often hard to infer how the performance would map onto a real household.

Houses with < 10 appliances


Instrumenting a house with appliance sub-meters is intrusive and expensive, so often only a subset of appliances is monitored. In this case, an artificial aggregate is generally calculated by summing the power demand of each monitored appliance, and this sum is then used as the input to the disaggregation algorithm. However, since the difficulty of disaggregation increases with the number of appliances, disaggregating this artificial aggregate is generally much easier than disaggregating the household's true aggregate.
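To make the gap concrete, here is a minimal sketch in Python. The readings, appliance names and function names are all made up for illustration and don't come from any data set or paper. It sums three sub-metered appliances into an artificial aggregate and compares that with a hypothetical whole-house reading taken over the same samples.

```python
def artificial_aggregate(submetered):
    """Sum per-appliance power readings (watts), sample by sample."""
    return [sum(sample) for sample in zip(*submetered.values())]


def unmetered_fraction(true_aggregate, artificial):
    """Share of the household's total demand that no sub-meter recorded."""
    total = sum(true_aggregate)
    return (total - sum(artificial)) / total


# Hypothetical readings in watts, one value per sample period.
submetered = {
    "fridge": [90, 90, 0, 0],
    "kettle": [0, 2000, 2000, 0],
    "tv": [60, 60, 60, 60],
}
# Hypothetical whole-house meter readings for the same periods.
true_aggregate = [310, 2330, 2250, 190]

artificial = artificial_aggregate(submetered)

# Demand from appliances that were never sub-metered. An algorithm tested
# on the artificial aggregate never has to cope with this signal.
residual = [t - a for t, a in zip(true_aggregate, artificial)]

print(artificial)  # [150, 2150, 2060, 60]
print(residual)    # [160, 180, 190, 130]
print(round(unmetered_fraction(true_aggregate, artificial), 3))  # 0.13
```

In this toy example roughly 13% of the household's demand comes from appliances that were never sub-metered, and an algorithm evaluated on the artificial aggregate is never asked to deal with it.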

Training data


Unfortunately for those of us working on disaggregation algorithms, appliances of the same type can vary quite a lot from house to house. A lot of research sidesteps this problem by requiring training data from the house in which disaggregation will be performed. Training data normally comes in the form of sub-metered appliance data or a training phase in which appliances are operated sequentially and manually labelled. However, collecting training data in every house clearly will not scale at the same rate as smart meter deployments have.

Known appliance types


Even worse than not knowing what model of appliance is in each house is not knowing which appliance types are present in each house. Assuming the appliance types are known is the most reasonable of these four assumptions, since it is conceivable that a household's occupants might be required to enter this information if they're interested in seeing disaggregated data. However, I can't imagine many non-enthusiasts would be willing to (accurately) list all the electric appliances in their home.

I think the first two problems have largely been solved by the release of many public data sets over the past few years. The last two, however, are up to individual researchers, who need to ensure they're studying a realistic scenario. The field of energy disaggregation is growing at an incredible rate, and I think now is the time to tackle the complete problem rather than to only study its individual components.
