My name is Oliver Parson, and I'm currently employed as a Senior Data Scientist at Bulb. I'm interested in investigating the ways in which machine learning can be used to break down household energy consumption data into individual appliances, also known as Non-intrusive Appliance Load Monitoring (NILM) or energy disaggregation.
Thursday, 19 December 2013
Thesis defence complete!
Earlier this week I had to defend my PhD thesis against the critique of both an internal and external examiner in what is referred to as a viva. Despite my best efforts to prepare, I was still pretty nervous going into the exam. However, it turned out to be a lot more enjoyable than I had expected. My examiners managed to tease out the finer details of my thesis without being at all aggressive, which I think had a really positive effect on the atmosphere of the viva. The result of the viva was a number of minor corrections which I'll need to make to my thesis over the next couple of months. This means I'll hopefully make the final version of my thesis available online by the end of February 2014, but potentially sooner depending on how busy I am at the start of next year.
Saturday, 14 December 2013
Energy disaggregation research at MLSUST 2013 workshop
The 2013 workshop on Machine Learning for Sustainability was recently held at the NIPS conference in Lake Tahoe, NV, USA. The workshop was organised by Edwin Bonilla (NICTA and ANU), Tom Dietterich (Oregon State University), Theodoros Damoulas (NYU CUSP and NYU-Poly) and Andreas Krause (ETH Zurich). The workshop invited papers which propose and apply machine learning algorithms to solve sustainability problems such as climate change, energy management and biodiversity monitoring. The workshop featured two poster sessions, in which the authors of the accepted papers were invited to present their work. Both poster sessions featured a paper on energy disaggregation, which I have briefly summarised below.
Interleaved Factorial Non-Homogeneous Hidden Markov Models for Energy Disaggregation. Mingjun Zhong, Nigel Goddard, Charles Sutton.
This paper proposes a method for disaggregating 2 minute energy consumption data into individual appliances. The approach is based upon an extension of the factorial hidden Markov model (FHMM), in which the appliance transition probabilities are dependent upon the time of day (non-homogeneous), and also the appliances are constrained such that only one appliance can change state per time slice (interleaved). The authors evaluate their approach on 100 homes from the Household Electricity Study, in which 20-30 days of sub-metered data from each household is used for training, while 5-10 days of data is held out for testing. The results show that both the interleaved and non-homogeneous extensions individually provide better performance than the basic FHMM, while a combination of the two provides the best performance. Finally, the authors identify a key finding in that the disaggregation accuracy varies greatly across different households, and raise this as an open problem for the NIALM community.
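To make the two extensions concrete, here is a minimal sketch of how a joint transition score could be computed under both ideas. This is not the authors' implementation: the function names, the two-state appliances, the evening hours and all probabilities are assumptions chosen purely for illustration, and the score is left unnormalised.

```python
def hourly_matrix(p_on):
    """2x2 transition matrix; rows are the current state (0=off, 1=on)."""
    return [[1 - p_on, p_on], [0.1, 0.9]]


def make_trans(day_p_on, evening_p_on):
    """One matrix per hour of the day, so transitions are non-homogeneous."""
    return [hourly_matrix(evening_p_on if 17 <= h < 22 else day_p_on)
            for h in range(24)]


def changes(prev_states, next_states):
    """Number of appliances whose state differs between two time slices."""
    return sum(1 for a, b in zip(prev_states, next_states) if a != b)


def joint_transition_score(prev_states, next_states, hour, trans):
    """Unnormalised joint transition score.

    trans[i][hour][a][b] is the assumed probability that appliance i
    moves from state a to state b at the given hour.
    """
    if changes(prev_states, next_states) > 1:
        return 0.0  # interleaved: at most one appliance changes per slice
    score = 1.0
    for i, (a, b) in enumerate(zip(prev_states, next_states)):
        score *= trans[i][hour][a][b]
    return score


# Hypothetical appliances: one used mostly in the evening, one time-independent.
trans = [make_trans(0.01, 0.2), make_trans(0.05, 0.05)]

evening = joint_transition_score((0, 0), (1, 0), hour=18, trans=trans)
night = joint_transition_score((0, 0), (1, 0), hour=3, trans=trans)
both_change = joint_transition_score((0, 0), (1, 1), hour=18, trans=trans)
```

In this toy setup the same switch-on is scored higher in the evening than at night, and any transition in which both appliances change state at once is ruled out.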
Using Step Variant Convolutional Neural Networks for Energy Disaggregation. Bingsheng Wang, Haili Dong, Chang-Tien Lu.
This paper proposes a method for disaggregating 15 minute interval aggregate energy data into individual appliances. The approach is based on Step Variant Convolutional Neural Networks (SVCNN), which use the aggregate energy consumption in the intervals t-2, t-1, t, t+1, t+2 to predict the energy consumption of each individual appliance in interval t. The authors evaluate their approach via cross validation using REDD, in which 3 houses are used to train the model while 2 other houses are used to test the performance. The results show that the SVCNN model achieves greater accuracy than both discriminative sparse coding models and factorial hidden Markov models. However, the results still show a relatively high whole home normalised disaggregation error of approximately 0.8, confirming the difficulty of the disaggregation of 15 minute energy data.
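The windowing described above can be sketched as follows. This covers only the construction of the inputs (the five aggregate readings centred on interval t), not the network itself, and the function name and example readings are hypothetical.

```python
def centred_windows(aggregate, half_width=2):
    """Return (t, window) pairs, where window holds the aggregate readings
    for intervals t-half_width .. t+half_width.

    Intervals too close to either end of the series are skipped because
    they do not have a full window.
    """
    return [(t, aggregate[t - half_width:t + half_width + 1])
            for t in range(half_width, len(aggregate) - half_width)]


# Hypothetical 15 minute aggregate energy readings (kWh).
aggregate = [0.2, 0.3, 1.1, 1.0, 0.4, 0.2, 0.9]
windows = centred_windows(aggregate)
```

Each window would then be paired with the sub-metered appliance energy in interval t to form one training example.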
Further details on both the REDD and HES data sets are available in my post summarising the publicly available NIALM data sets.
Friday, 6 December 2013
Accuracy metrics for NIALM
Accuracy metrics are essential when evaluating the performance of an energy disaggregation algorithm in a given scenario. However, each paper seems to use a different metric when comparing their proposed approach to the state of the art. As a result, it is impossible to compare numerical results between papers. I don't believe this is the fault of the authors, since there is no single accuracy metric which is unquestionably better than all other accuracy metrics. Instead, the relevance of each metric depends largely on the intended use of the disaggregated data.
For example, if the disaggregated data is to be used to provide a breakdown of the energy consumption in a home, an accuracy metric which allows errors to cancel out over time would be suitable. However, if the disaggregated data is to be used to suggest appliance loads to be deferred to a different time of day, a less forgiving accuracy metric would be required.
Therefore, inspired by discussions at the EPRI NILM 2013 workshop and by my recent involvement in the foundation of an open source disaggregation toolkit, I have decided to collect and categorise a list of commonly used accuracy metrics as shown below.
Event based metrics
Event based metrics assess how well a disaggregation algorithm detects appliance change events (e.g. washing machine turns on). However, it is not trivial to determine appliance events from sub-metered power data to be used as ground truth, and as a result this often involves some subjective judgement (e.g. should a washing machine changing state mid-cycle from spin to drain constitute an event?). Furthermore, deciding whether a detected event matches a ground truth event is also not trivial (e.g. should a detected event that is 1 second apart from a ground truth event be matched?).
- True positives, false positives, false negatives, true negatives
- Confusion matrices
- True positive rate, false negative rate, precision, recall, F-score
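As a rough sketch of how the matching question affects these metrics, the code below pairs detected events with ground truth events using a time tolerance and then computes precision, recall and F-score. The greedy nearest-event matching rule, the function names and the event times are all assumptions made for illustration, not a standard taken from any particular paper.

```python
def match_events(detected, truth, tolerance=1.0):
    """Greedily pair each ground truth event time with the nearest unused
    detected event within `tolerance` seconds.

    Returns (true_positives, false_positives, false_negatives).
    """
    unused = sorted(detected)
    tp = 0
    for t in sorted(truth):
        candidates = [d for d in unused if abs(d - t) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda d: abs(d - t))
            unused.remove(best)
            tp += 1
    fp = len(unused)          # detections that matched nothing
    fn = len(truth) - tp      # ground truth events that were missed
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Standard definitions, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical switch event times in seconds.
truth = [10.0, 50.0, 120.0]
detected = [10.5, 52.0, 119.8, 200.0]

strict = match_events(detected, truth, tolerance=1.0)
lenient = match_events(detected, truth, tolerance=3.0)
```

With a 1 second tolerance the detection at 52.0 s counts as both a false positive and a missed event, whereas a 3 second tolerance turns it into a true positive, so the same output can score quite differently depending on the matching rule.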
Non-event based metrics
Non-event based metrics assess how well a disaggregation algorithm is able to infer the power demand of individual appliances over time. As such, they are highly dependent upon the sampling rate of the sub-metered appliance data used as the ground truth. Such metrics have the advantage that sub-metered appliance data is easily collected by hardware installations, and requires little subjective judgement. However, non-event based metrics suffer from the disadvantage that disaggregation algorithms can score very highly by predicting all appliances to always draw zero power. This occurs because most appliances remain off for the majority of each day, and therefore the disaggregation algorithm is able to correctly predict each appliance's power for the majority of each day.
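The zero-power problem is easy to reproduce with a toy example. The sketch below uses a hypothetical per-sample accuracy (the fraction of samples predicted to within a tolerance) and an invented appliance that draws 2000 W for one hour per day; neither the metric definition nor the numbers come from a specific paper.

```python
def sample_accuracy(predicted, truth, tolerance_w=10.0):
    """Fraction of samples whose predicted power is within tolerance_w
    watts of the ground truth."""
    hits = sum(1 for p, t in zip(predicted, truth)
               if abs(p - t) <= tolerance_w)
    return hits / len(truth)


def on_sample_recall(predicted, truth, on_threshold_w=10.0):
    """Fraction of truly-on samples that were also predicted as on."""
    on_indices = [i for i, t in enumerate(truth) if t > on_threshold_w]
    if not on_indices:
        return 0.0
    found = sum(1 for i in on_indices if predicted[i] > on_threshold_w)
    return found / len(on_indices)


# One day at 1 minute resolution: the appliance is on from 08:00 to 09:00.
truth = [0.0] * 1440
for minute in range(480, 540):
    truth[minute] = 2000.0

always_off = [0.0] * 1440

score = sample_accuracy(always_off, truth)    # 1380 of 1440 samples correct
recall = on_sample_recall(always_off, truth)  # no on-samples were found
```

Here the always-off prediction is correct for about 96% of samples despite never detecting the appliance, which is why such a score is worth reporting alongside something like the on-sample recall.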
Overall metrics
Overall metrics assess how well a disaggregation algorithm is able to infer the total energy consumed by individual appliances over a period of time. Such metrics are often the most intuitive, since they directly correspond to the pie chart of household energy consumption (e.g. as provided by Neurio). However, overall metrics allow errors to cancel out over time (e.g. an appliance's power is overestimated on day 1, while it is underestimated on day 2, resulting in the algorithm being assigned 100% accuracy since these errors cancel each other out).
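The cancellation effect can be shown with two days of made-up data. In this sketch the first function is an overall metric in the sense above, while the second is a hypothetical stricter alternative that counts each day's error separately; both names and the energy values are assumptions for illustration.

```python
def total_energy_error(predicted, truth):
    """Relative error in total energy over the whole period.
    Over-estimates and under-estimates cancel each other out."""
    return abs(sum(predicted) - sum(truth)) / sum(truth)


def summed_absolute_error(predicted, truth):
    """Relative error in which every interval's error counts,
    so errors cannot cancel."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / sum(truth)


# Hypothetical daily energy for one appliance (kWh): day 1, day 2.
truth_kwh = [2.0, 2.0]
predicted_kwh = [3.0, 1.0]  # over-estimated on day 1, under-estimated on day 2

overall = total_energy_error(predicted_kwh, truth_kwh)
per_day = summed_absolute_error(predicted_kwh, truth_kwh)
```

The overall error is zero even though each day is wrong by 1 kWh, while the per-day version reports an error of 50% of the total energy.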
This list is mostly intended as a starting point for discussion regarding accuracy metrics, so please leave a comment if you notice any metrics I've left out!
Monday, 2 December 2013
EPRI NILM 2013 Workshop in Palo Alto
I recently attended the two day 2013 NILM workshop hosted by EPRI at their Palo Alto office. The workshop brought together mostly utilities and vendors from the USA, although there were also attendees from government departments, non-profits and also some academics. The agenda was centred around discussions of data collection, use cases and future collaboration. Unfortunately, this meant there wasn't any discussion of algorithmic detail, given that vendors generally prefer to keep such information private.
In terms of outcomes of the workshop, two working groups were formed: one to study the performance metrics required to assess the accuracy of NIALM approaches, and a second to define a set of data output standards to ensure interoperability between multiple NIALM systems. As yet, I don't have any further information regarding either working group, but if you leave a comment on this post I'd be happy to forward your information on to the group leaders.
From my perspective, the most interesting thing I learned from this workshop was about the smart meter deployments in the states of California and Texas. It turns out that both deployments are already complete, and the existing infrastructure is capable of reporting the household power demand over the home area network at roughly 10 second intervals. However, in order to activate this functionality, each household must request their utility to remotely flick a software switch to start the smart meter communicating with any compatible devices. This situation is particularly interesting to me since it shares the same data rates and availability with the smart meters due to be deployed in the UK by 2020.
Tuesday, 12 November 2013
Belkin Energy Disaggregation Competition - Completed
A while ago, I wrote a post about the Belkin Energy Disaggregation Competition, which concluded a couple of weeks ago. The competition drew entries from 165 participating teams, who each provided up to 169 submissions. The top 3 participants shared prizes from a combined pot of $24,000, while a separate data visualisation competition had a prize of $1,000.
It seemed like most entrants were regular Kagglers, with little participation from NIALM companies or academics within the NIALM field. Although I understand that many companies are likely unwilling to participate to prevent their secrets being divulged, I wonder if greater participation from the existing NIALM academic field could have been achieved by hosting the competition at a relevant conference or workshop. I would love to see the winners of the competition invited to give a talk about their approaches!
The data used in the competition consisted of a public (training) and a private (test) data set, collected from 4 households. The training data included both household aggregate data and individual appliance sub-metered data. However, only the aggregate test data was released, while the sub-metered data was kept private for the evaluation of each submission. As a result, although a cross-validation evaluation technique was used, the participants were crucially not required to generalise to new households since sub-metered data was available from each test household for training.
With each submission, the public leaderboard was updated showing the best performance for each user over half of the private test data, while their undisclosed performance over the other half of the test data was used to calculate the final standings. Interestingly, the winner of the competition shown by the final standings was actually only ranked 6th on the public leaderboard. This suggests that many participants might have been overfitting their algorithms to the half of the test data for which the performance was disclosed, while the competition winner had not optimised their approach in such a way.
An interesting forum thread seems to show that most successful participants used an approach based on only low-frequency data, despite the fact that high-frequency data was also provided. This seems to contradict most academic research, which generally shows that high-frequency based approaches will outperform low-frequency methods. A reason for this could be that, although high-frequency based approaches perform well in laboratory test environments, their features do not generalise well over time, and as a result algorithm training quickly becomes outdated. However, another reason could have been that the processing of the high-frequency features was simply too time consuming, and better performance could be achieved by concentrating on the low-frequency data given the deadline of the competition.
Overall, I think the competition was very successful in provoking interest in energy disaggregation from a new community, and I hope that any follow up competitions follow a similar format. Furthermore, I think that hosting a prize giving and presentation forum at a relevant conference or workshop would inspire greater participation from academics already working in the field of NIALM.
Tuesday, 5 November 2013
Thesis Finished!
Today I finally finished my thesis titled: 'Unsupervised Training Methods for Non-intrusive Appliance Load Monitoring from Smart Meter Data'. My viva (defence) is scheduled for December, and I plan to upload a final version upon addressing any revisions it brings up. In the meantime, I've included the thesis abstract below:
Non-intrusive appliance load monitoring (NIALM) is the process of disaggregating a household's total electricity consumption into its contributing appliances. Smart meters are currently being deployed on national scales, providing a platform to collect aggregate household electricity consumption data. Existing approaches to NIALM require a manual training phase in which either sub-metered appliance data is collected or appliance usage is manually labelled. This training data is used to build models of the household appliances, which are subsequently used to disaggregate the household's electricity data. Due to the requirement of such a training phase, existing approaches do not scale automatically to the national scales of smart meter data currently being collected.
In this thesis we propose an unsupervised training method which, unlike existing approaches, does not require a manual training phase. Instead, our approach combines general appliance knowledge with just aggregate smart meter data from the household to perform disaggregation. To do so, we address the following three problems: (i) how to generalise the behaviour of multiple appliances of the same type, (ii) how to tune general knowledge of appliances to the specific appliances within a single household using only smart meter data, and (iii) how to provide actionable energy saving advice based on the tuned appliance knowledge.
First, we propose an approach to the appliance generalisation problem, which uses the Tracebase data set to build probabilistic models of household appliances. We take a Bayesian approach to modelling appliances using hidden Markov models, and empirically evaluate the extent to which they generalise to previously unseen appliances through cross validation. We show that learning using multiple appliances vastly outperforms learning from a single appliance by 61-99% when attempting to generalise to a previously unseen appliance, and furthermore that such general models can be learned from only 2-6 appliances.
Second, we propose an unsupervised solution to the model tuning problem, which uses only smart meter data to learn the behaviour of the specific appliances in a given household. Our approach uses general appliance models to extract appliance signatures from a household's smart meter data, which are then used to refine the general appliance models. We evaluate the benefit of this process using the Reference Energy Disaggregation Data set, and show that the tuned appliance models more accurately represent the energy consumption behaviour of a given household's appliances compared to when general appliance models are used, and furthermore that such general models can perform comparably to when sub-metered data is used for model training. We also show that our tuning approach outperforms the current state of the art, which uses a factorial hidden Markov model to tune the general appliance models.
Third, we apply both of these approaches to infer the energy efficiency of refrigerators and freezers in a data set of 117 households. We evaluate the accuracy of our approach, and show that it is able to successfully infer the energy efficiency of combined fridge freezers. We then propose an extension to our model tuning process using factorial hidden semi-Markov models to model households with a separate fridge and freezer. Finally, we show that through this extension our approach is able to simultaneously tune the appliance models of both appliances.
The above contributions provide a solution which satisfies the requirements of a NIALM training method which is both unsupervised (no manual interaction required during training) and uses only smart meter data (no installation of additional hardware is required). When combined, the contributions presented in this thesis represent an advancement in the state of the art in the field of non-intrusive appliance load monitoring, and a step towards increasing the efficiency of energy consumption within households.
Monday, 21 October 2013
Neurio - a new energy disaggregation product looking for Kickstarter funding
Jack Kelly recently linked me to an exciting new company called Energy Aware. Energy Aware are currently seeking Kickstarter funding to develop their electricity disaggregation technology Neurio, which consists of:
- a hardware sensor featuring two CT clamps, capable of reporting voltage, current, real power and power factor at 1 second intervals
- a set of cloud-based disaggregation algorithms which break down your household electricity usage into individual appliances
Two things strike me that set Neurio apart from the competition:
- Interconnectivity with third party services - Neurio are keen to connect the detected appliance switch events (e.g. lights turned on) to any third party services through their open RESTful API.
- Real-time notifications - The company will also provide a mobile app, with the aim of notifying their users when an appliance (e.g. oven) is left on.
Monday, 23 September 2013
BuildSys 2013 Interesting Papers
The 5th ACM Workshop On Embedded Systems For Energy-Efficient Buildings (BuildSys) is coming up in November and I wanted to share a few papers which have recently been accepted there. Although the camera ready submissions weren't due at the time of writing, I've managed to get hold of a pre-print of a few interesting papers, which the authors are happy for me to share.
- Towards a Smart Home Framework, Muddasser Alam, Alper T. Alan, Alex Rogers, and Sarvapali D. Ramchurn. Agents, Interaction and Complexity Research Group, University of Southampton, UK.
- This paper presents the Smart Home Framework simulation platform for modelling smart homes. The platform provides extendable building blocks for smart households, such as micro-generation, energy storage, in addition to the components of more traditional homes, such as household electronics and heating. The framework has been designed to easily enable the simulation of different household environments in order to test the potential for different smart technologies.
- It’s Different: Insights into home energy consumption in India, Nipun Batra, Manoj Gulati, Amarjeet Singh, Mani B. Srivastava. Indraprastha Institute of Information Technology, Delhi, India & University of California Los Angeles, United States.
- This paper presents a new data set called Home Deployment, collected from a single household in Delhi. The authors describe many factors which distinguish the data set from other data sets collected from developed countries, such as the unreliability of the electrical grid and Internet connectivity. The data spans 73 days, collected from household-level, circuit-level, and appliance-level meters.
- A Scalable Low-Cost Solution to Provide Personalised Home Heating Advice to Households, Alex Rogers, Siddhartha Ghosh, Reuben Wilcock and Nicholas R. Jennings. University of Southampton, UK.
- This paper presents MyJoulo, a low-cost hardware solution which provides personalised home heating advice to households. The system consists of a single temperature logger which is placed on top of a household's thermostat, and which is able to learn the thermal properties of the household. The thermal model is then used to provide feedback to the household occupants by comparing their learned thermostat set point to the national average, in addition to estimating the potential savings should the set point be reduced or the timer settings changed.
Thursday, 29 August 2013
Disaggregation in the UK
I was recently asked about smart meter legislation in the UK, and its direction with respect to energy disaggregation. Here was my response:
As far as I'm aware, UK smart meters are not required to provide appliance specific electricity breakdowns. Instead, they're being installed primarily for automatic billing purposes, but also to provide total household consumption information and potentially real-time pricing data via in home displays. Some general information about UK smart meters is available from the UK Government, and the latest technical specification for smart meters is the Smart Metering Equipment Technical Specification v2.
In the UK, most consumers are signed up to electricity contracts with one of the big 6 energy suppliers. However, British Gas, the largest of these suppliers, recently started running a TV advert in which their smart meters were shown to break down their households' electricity usage into heating and lighting. In my opinion, this shows that although appliance specific breakdowns are not required by government directives, energy suppliers are keen to provide such services in order to incentivise consumers. I think breaking down electricity usage into heating and lighting is only the beginning of what electricity disaggregation can offer to consumers, and we're likely to see some interesting competition in this domain between some of the major players in the UK energy sector.
Friday, 23 August 2013
Incorporating general appliance knowledge into disaggregation algorithms
I've recently been thinking a lot about how to incorporate prior knowledge into disaggregation training algorithms. By prior knowledge, I mean general information about how appliance types operate, i.e. what makes a fridge different from a washing machine. However, this prior knowledge should not be specific to a single household, and should therefore generalise to previously unseen households. Clearly, such prior knowledge is required if a NIALM system is to operate without manual intervention. To date, I have seen two categories of approaches which incorporate prior information into the learning process, which I have described below.
Sequential learning and labelling
This approach first aims to identify the characteristics of the appliances in a household without any prior knowledge. This produces a list of appliances (e.g. appliance 1, appliance 2) with their corresponding behaviour. Next, this category of approaches uses a second step to assign labels to the learned appliances (e.g. appliance 1 = fridge, appliance 2 = washing machine). A diagram of this approach is given below:
This approach has been adopted in some recent state-of-the-art disaggregation papers. Kim et al. (2011) and Kolter et al. (2012) both use an unsupervised learning approach for the first step, while the second step requires the learned appliance models to be labelled manually.
Simultaneous learning and labelling
This approach aims to use both aggregate data and prior knowledge in order to simultaneously learn appliance models, as shown below:
This seems to be a more principled approach, in that prior knowledge is not ignored when the learning algorithm identifies distinct appliances within the aggregate load. We demonstrated such an approach in a recent paper (Parson et al., 2012). Furthermore, this type of approach lends itself well to Bayesian learning, whereby the learned appliance models constitute a weighted combination of information extracted from aggregate data and the general appliance models. Such an approach is detailed in Johnson and Willsky (2013).
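To make the "weighted combination" idea concrete, here is a minimal sketch of a conjugate normal update for a single parameter: the mean power draw of one appliance. It is not the method of any of the papers cited above. The function name, the assumption of a known noise variance, and all the numbers are hypothetical and chosen purely for illustration.

```python
def posterior_mean_power(prior_mean, prior_var, readings, noise_var):
    """Combine a general appliance prior with household-specific readings.

    Assumes a normal prior on the appliance's mean power and normally
    distributed readings with known variance (a simplifying assumption).
    Returns the posterior mean and variance of the mean power.
    """
    n = len(readings)
    if n == 0:
        return prior_mean, prior_var
    sample_mean = sum(readings) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the sample mean
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var


# Hypothetical numbers: a generic fridge prior of 100 W (variance 400 W^2)
# and four readings attributed to the fridge in one household.
mean, var = posterior_mean_power(100.0, 400.0, [118.0, 122.0, 120.0, 120.0], 100.0)
print(round(mean, 2), round(var, 2))
```

The posterior mean sits between the general model (100 W) and the household data (120 W), and moves further towards the data as more readings arrive.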
In summary, I believe it is essential to incorporate such general appliance knowledge into the learning algorithms of energy disaggregation systems to allow them to scale to large numbers of previously unseen households. Only in the past few years has published work come close to Hart's vision of an Automatic Setup NIALM (Hart, 1992), but the problem is still far from solved.
Friday, 9 August 2013
Outstanding student paper award at IJCAI-13
A while ago, I wrote a blog post about our paper which aims to detect the New Forest Cicada from smartphone audio recordings. Davide Zilli recently presented this work in Beijing, and I'm very happy to announce that the paper was awarded the outstanding student paper award. Although we're yet to rediscover the cicada native to the UK, the app has collected thousands of audio recordings worldwide and located similar species of cicada in nearby countries. There are still a few more days left of the cicada season in the UK, so check out the New Forest Cicada Project website for more details if you're thinking of heading to the New Forest.
The New Forest Cicada
Monday, 5 August 2013
IJCAI-13 tutorial on Topics in Computational Sustainability
Carla Gomes and Zico Kolter very recently organised a computational sustainability tutorial at IJCAI-13. The tutorial was centred around three sustainability problems: energy generation and demand forecasting, energy disaggregation, and control of power networks. I found it really interesting to see these AI and sustainability problems all grouped together, as a question I've discussed with Jack before is: "what problem should I work on in order to have the greatest impact on sustainability?" Furthermore, Zico has made some of the data and MATLAB code used for some of the demonstrations available online, which would be a great starting point for a dissertation project.
Wednesday, 31 July 2013
The academic reality gap
Over the past three years I've read a lot of academic papers on the topic of energy disaggregation. However, what frustrates me the most is some of the assumptions that are made. Here are some of the most common:
Simulated houses
In the absence of actual household aggregate or individual appliance power data, some researchers test their disaggregation algorithms using synthetic data. While this might be useful for simulating a range of appliances and households, it's often hard to infer how the performance would map onto a real household.
Houses with < 10 appliances
Instrumenting a house with appliance sub-meters is intrusive and expensive, so often only a subset of appliances are monitored. Generally in this case, an artificial aggregate is then calculated by summing the power demand of each appliance, which is subsequently used as the input to the disaggregation algorithm. However, since the difficulty of disaggregation increases with the number of appliances, disaggregating this artificial aggregate is generally much easier than disaggregating the household's true aggregate.
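The gap between an artificial aggregate and a true aggregate can be sketched in a few lines. The appliance names and wattages below are made up for illustration; the point is only that whatever is not sub-metered never appears in the artificial aggregate.

```python
def artificial_aggregate(submetered):
    """Sum sub-metered appliance power series (watts) sample by sample."""
    series = list(submetered.values())
    if not series:
        return []
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all series must have the same length")
    return [sum(samples) for samples in zip(*series)]


# Hypothetical readings in watts for two monitored appliances
submetered = {"fridge": [90, 90, 0, 0], "kettle": [0, 2000, 2000, 0]}
# Hypothetical reading from the household's main meter over the same period
true_aggregate = [310, 2295, 2180, 205]

artificial = artificial_aggregate(submetered)
# Everything the sub-meters missed: unmonitored appliances and standby loads
unmetered = [t - a for t, a in zip(true_aggregate, artificial)]
print(artificial, unmetered)
```

An algorithm tested on `artificial` never has to cope with the `unmetered` remainder, which is exactly what makes the artificial problem easier.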
Training data
Unfortunately for those of us working on disaggregation algorithms, appliances of the same type can vary quite a lot from house to house. As a result, a lot of research sidesteps this problem by requiring training data from the house in which disaggregation will be performed. Training data normally comes in the form of sub-metered appliance data or a training phase in which appliances are operated sequentially and manually labelled. However, collecting training data clearly will not scale at the same rate as smart meter deployments.
Known appliance types
Even worse than not knowing what model of appliance is in each house is not knowing which appliance types are present in each house. Assuming known appliance types is the most reasonable of these four assumptions, since it is conceivable that a household's occupants might be required to enter this information if they're interested in seeing disaggregated data. However, I can't imagine many non-enthusiasts would be willing to (accurately) list all the electric appliances in their home.
I think the first two problems have largely been solved by the release of many public data sets over the past few years. However, I think the last two problems are up to each individual researcher to ensure they're studying a realistic scenario. The field of energy disaggregation is growing at an incredible rate, and I think now is the time to tackle the complete problem rather than to only study its individual components.
Tuesday, 23 July 2013
Yetu and Verdigris added to list of disaggregation companies
I've recently come across Yetu and Verdigris, two companies aiming to make homes smarter. Yetu aim to wirelessly connect various components of smart homes together, such as electricity storage and micro-generation, as well as normal household appliances. Verdigris focus upon metering hardware and real-time appliance fault diagnosis. Both companies' products are built around core software disaggregation technology. I've also updated my ever-growing list of disaggregation companies with these two newcomers.
Monday, 22 July 2013
Would you replace your fridge for £82 per year?
A couple of weeks ago I gave a presentation to my research group giving an overview of my PhD research. I concluded with a slide showing some advice that could be provided to a real household from our most recent deployment. The slide showed that this household could save £82 per year by replacing their old inefficient fridge freezer with a new energy-efficient appliance. This was one of the least efficient fridge freezers we had come across, so I was pretty happy with the financial incentive for replacing it. However, the first question I received after the presentation was:
"The reward for replacing the appliance seems quite small, so how can you still motivate people to save energy? I know that wouldn't make me replace my fridge."
This really made me question the kind of feedback which disaggregation research aims to provide. My approach has always been to quantify the reward of some action, thereby empowering a household's occupants to make an informed decision, rather than making the decision for them. The reason for this is that only the occupants can weigh up the inconvenience against the financial reward. I would guess that some people would replace their fridge given the same savings, and others would not. Personally, I'd replace my fridge for £82 per year, but would you?
Friday, 12 July 2013
Belkin Energy Disaggregation Competition
I've just come across an energy disaggregation competition set up by Belkin on the Kaggle platform. The competition focuses on the disaggregation of high frequency data, from which Belkin provide the following features:
- Spectrogram of high frequency noise
- Fundamental and first 5 current harmonics on each phase
- Fundamental and first 5 voltage harmonics on each phase
The competition supplies this data in two sets:
- Training set - includes both aggregate features and appliance ground truth
- Test set - includes only aggregate features
The idea is for participants to train their disaggregation algorithms on the training set, and upload the output of their algorithms on the test set. Participants will then receive a score reflecting the accuracy of their algorithm's output. The deadline for the competition is 30 October 2013, and the top prize is $14,000, so get busy disaggregating!
Saturday, 6 July 2013
Crowdsourcing gas leaks with iSmellGas
I've recently become interested in crowdsourcing platforms, and how such methodologies can be applied to sustainability issues. One thing I noticed was how often I smelt gas while walking around cities. I spoke to my friends, who also seemed to have smelt gas leaks, although no one had got around to reporting them. This inspired a friend and me to create iSmellGas, a platform for crowdsourcing the location and strength of gas leaks. The idea is that gas leaks can be reported via the Android mobile app, and collected online via our web app.
Please take a look at our website and share it in any way you like, and if you have an Android phone, download our app and start reporting!
Friday, 5 July 2013
Global map of energy disaggregation research
Keeping track of who is working within a research field is a tough task, especially when it's expanding at the rate of the energy disaggregation domain. A while ago, I created a map of the hits to my blog grouped by individual countries. However, I can't imagine the map was particularly useful, given that it wasn't possible to drill down beyond the country level. For this reason, I decided to set up a Google map to allow anyone to add their own institution or company to the list. The map is editable by anyone, so please feel free to add whatever information about your research you wish.
Thursday, 20 June 2013
Comparison of public disaggregation data sets
Over the last year I've maintained a list of the public data sets which have been useful in my disaggregation research. However, I've found it's still quite time consuming to compare the finer details of the data sets, and I often end up trawling through papers or sifting through the data itself. For this reason, I've attempted to build a table which allows easy comparison between important attributes of each data set. As always, please leave a comment if you notice any errors!
Data set | Institution | Location | Duration | # houses | # sub-meters per house | Features | Resolution |
REDD | MIT | Boston, MA, USA | 3-19 days | 6 | 9-24 | V, P aggregate, P sub-metered | 15 kHz aggregate, 3 second sub-metered |
BLUED | CMU | Pittsburgh, PA, USA | 8 days | 1 | 0 (manual switch event labels available) | I, V | 12 kHz |
Smart* | UMass | Western Massachusetts, MA, USA | 3 months | 1 | 25 circuits, 29 appliance monitors | P, S circuits, P appliance monitors | 1 second circuits, various appliance monitors |
Tracebase | Darmstadt | Germany | N/A | N/A | N/A | P | 1-10 second |
Sample data set | Pecan Street | Austin, TX, USA | 7 days | 10 | 12 | S | 1 minute |
IHEPCDS | EDF R&D | France | 4 years | 1 | 3 | I, V, P, Q | 1 minute |
HES | UK DECC | UK | 1 month - 1 year | 250 | 13-51 | P | 2 minute |
AMPds | Simon Fraser University | Greater Vancouver, BC, Canada | 1 year | 1 | 19 | I, V, pf, F, P, Q, S | 1 minute |
N.B. I don't maintain this table of comparison, so please see this post for an up-to-date list!
Sunday, 16 June 2013
AMPds data set released
Stephen Makonin recently released the first version of the Almanac of Minutely Power Data set. The data set contains 1 minute aggregate meter readings as well as sub-metered readings from 19 individual circuits. Each reading includes measurements of voltage, current, frequency, power factor, real power, reactive power and apparent power. Furthermore, the aggregate gas and water consumption was also measured at 1 minute intervals, in addition to 1 individual usage for each utility. The data set spans an entire year from April 2012 to March 2013 from a single household in the greater Vancouver area, BC, Canada. The data set is available to anyone for free, although the authors require a username and password to be requested for the purposes of usage tracking.
The authors of the data set have described the collection process in more detail in the accompanying paper, as well as showing benchmark results of a method based on independent time slice combinatorial optimisation:
Stephen Makonin, Fred Popowich, Lyn Bartram, Bob Gill, and Ivan V. Bajic, AMPds: A Public Dataset for Load Disaggregation and Eco-Feedback Research, in Electrical Power and Energy Conference (EPEC), The Annual, pp. 1-6, 2013.
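For readers unfamiliar with the idea, the sketch below shows what a generic combinatorial optimisation over independent time slices can look like. It is not the implementation from the paper above. The appliance names, state power levels and aggregate readings are hypothetical.

```python
from itertools import product


def disaggregate_slice(aggregate_power, appliance_states):
    """Pick the combination of appliance states whose total power is
    closest to a single aggregate reading (all values in watts)."""
    names = list(appliance_states)
    best_combo, best_error = None, float("inf")
    # Exhaustively try every combination of one state per appliance
    for combo in product(*(appliance_states[name] for name in names)):
        error = abs(aggregate_power - sum(combo))
        if error < best_error:
            best_combo, best_error = combo, error
    return dict(zip(names, best_combo))


# Hypothetical state power levels for three appliances
appliance_states = {"fridge": [0, 120], "kettle": [0, 2000], "tv": [0, 150]}
# Hypothetical aggregate readings; each time slice is solved independently
aggregate = [125, 2110, 260, 5]
estimates = [disaggregate_slice(p, appliance_states) for p in aggregate]
for reading, estimate in zip(aggregate, estimates):
    print(reading, estimate)
```

Because each slice is treated independently, nothing stops an appliance from flickering on and off between consecutive readings, and the number of combinations grows exponentially with the number of appliances.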
Wednesday, 12 June 2013
Appliances are not as smart as you think
Designing suitable energy disaggregation algorithms requires a fair amount of knowledge about household appliances. It's important to understand which features make it possible to differentiate between appliances, such as average power demand or duration of use. However, collecting suitably varied appliance data takes a lot of time and resources, while using existing data sets often neglects the study of how the appliances were actually used. This post collects a few examples of appliances which have led me to conclude that many appliances are really not that smart.
Refrigerator
Originally, I expected a fridge's power demand to vary depending on its temperature set point, i.e. the cooler you set the temperature, the more power required to maintain that temperature. In fact, a fridge's cooling motor is only turned on or off according to a thermostat, and as a result it always draws the same level of power while it is on. A cooler set point therefore means the fridge spends a larger proportion of each day cooling as opposed to idle.
Kettle
Similar to the fridge, I expected the power demand of a kettle to vary depending on the amount of water in the kettle. However, the power demand is always constant while it's boiling, and it's actually the time taken to boil the water which increases with the volume of water.
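A quick back-of-the-envelope check makes this concrete: at constant power, the boiling time scales linearly with the mass of water. The sketch below ignores heat losses and the kettle's own thermal mass, and the 3 kW rating and 20 °C starting temperature are assumptions rather than measurements.

```python
SPECIFIC_HEAT_WATER = 4186.0  # J/(kg*K), standard textbook value
WATER_DENSITY = 1.0           # kg per litre (approximate)


def boil_time_seconds(litres, power_watts, start_temp_c=20.0, end_temp_c=100.0):
    """Idealised time to heat water at a constant power draw."""
    mass_kg = litres * WATER_DENSITY
    energy_joules = mass_kg * SPECIFIC_HEAT_WATER * (end_temp_c - start_temp_c)
    return energy_joules / power_watts


# Hypothetical 3 kW kettle: the power is fixed, only the duration changes
for litres in (0.5, 1.0, 1.5):
    print(litres, "litres:", round(boil_time_seconds(litres, 3000.0), 1), "seconds")
```

From a disaggregation point of view, this means the kettle's signature is a fixed power step whose duration, not height, reveals how much water was boiled.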
Oven
Following the same trend, electric ovens are also thermostatically controlled. As a result, they draw the same power irrespective of the temperature set point. This means the heating element spends more time heating for higher temperatures, and leaves longer gaps between heating periods for lower temperatures.
Others
Spotted the trend yet? Other examples in this category include air conditioners, microwaves, electric hobs and irons. This is in addition to the even dumber category of completely manually controlled appliances, such as a lamp or fan.
Exceptions
Unfortunately, not all appliances are this dumb. Some appliances have a power demand that slowly ramps up or down at the start or end of their usage. One appliance which is quite the opposite of those described above is the plasma television. These appliances have a power demand which is proportional to the brightness of the screen, with a fully white screen requiring the maximum power draw and a black screen requiring the minimum power draw. Such appliances are not only hard to disaggregate, but also significantly complicate the process of disaggregating other simpler appliances.
Thursday, 6 June 2013
Data set collected by UK EST, DECC and DEFRA
In 2012, the UK Energy Saving Trust, Department of Energy and Climate Change, and Department for Environment, Food and Rural Affairs published a 15 page report called Powering the Nation. This report summarises the full 600 page Household Electricity Use Study, which aimed to better understand how electricity is consumed in UK households. As part of this study, 251 owner-occupier households were monitored across England between April 2010 and April 2011. Of these households, 26 were monitored for 12 months, and 225 were monitored for 1 month. For each household, the energy consumption of 13-51 appliances was monitored at 2 minute intervals. A software portal is currently under development to provide access to the data set, although in the meantime the data can be requested individually from ICF International by contacting efficient.products@icfi.com.
Wednesday, 29 May 2013
Data set released by EDF Energy
I've only just come across this data set, despite it being released almost a year ago! I've also updated my post of public data sets.
EDF Energy released a data set in 2012 containing energy measurements made at a single household in France for a duration of 4 years. Average measurements are available at 1 minute resolution of the household aggregate active power, reactive power, voltage and current, as well as the active power of 3 sub-metered circuits. Although each circuit contains a few appliances, this is the largest data set in terms of duration of measurement. The complete data set is openly available from the UCI Machine Learning Repository.
Saturday, 25 May 2013
The pros and cons of using HMMs to model appliances
In the last few years, hidden Markov models (HMMs) have become a very popular mathematical representation for appliances (Zia et al. 2010, Kim et al. 2011, Kolter et al. 2012, Parson et al. 2012). As a result, I'm often asked whether I think HMMs are the future of disaggregation. However, I'm yet to find an objective analysis of the advantages and disadvantages of such approaches, which is why I've done my best to list them here:
Advantages
- The HMM is a well studied probabilistic graphical model, for which algorithms are known for exact and approximate learning and inference
- HMMs are able to represent the variance of appliances' power demands through probability distributions
- HMMs capture the dependencies between consecutive measurements, which Hart defined as the switch continuity principle
Disadvantages
- HMMs represent the behaviour of an appliance using a finite number of static distributions, and therefore fail to represent appliances with a continuously varying power demand
- Due to their Markovian nature, they do not take into account the sequence of states leading into any given state
- Again, due to their Markovian nature, the time spent in a given state is not captured explicitly. However, the hidden semi-Markov model does capture such behaviour
- Features other than the observed power demand are not captured (e.g. time of day). However, the input-output HMM allows such features to be modelled
- Any dependency between appliances cannot be represented. However, the conditional-HMM can capture such dependencies
In summary, the basic HMM provides a useful model for many appliances. However, the appliances it can represent are limited by the intrinsic structure of the model. Many extensions exist that increase the representational power of the HMM, although the additional parameters required often complicate the learning and inference tasks.
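As a rough illustration of the basic model, the sketch below decodes the most likely on/off sequence of a single two-state appliance using the Viterbi algorithm with Gaussian emissions. It is not the code from any of the papers cited above. The transition probabilities, power levels and readings are all hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(observations, states, log_start, log_trans, emission_params):
    """Most likely state sequence for an HMM with Gaussian emissions."""
    # best[s] = log-probability of the best path so far that ends in state s
    best = {s: log_start[s] + gaussian_logpdf(observations[0], *emission_params[s])
            for s in states}
    backpointers = []
    for x in observations[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in states:
            prev = max(states, key=lambda p: prev_best[p] + log_trans[p][s])
            pointers[s] = prev
            best[s] = (prev_best[prev] + log_trans[prev][s]
                       + gaussian_logpdf(x, *emission_params[s]))
        backpointers.append(pointers)
    # Trace the best path backwards from the most likely final state
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical fridge model: (mean, standard deviation) of power in watts
states = ["off", "on"]
emission_params = {"off": (0.0, 10.0), "on": (120.0, 10.0)}
log_start = {"off": math.log(0.5), "on": math.log(0.5)}
# "Sticky" transitions encode the tendency to remain in the current state
log_trans = {"off": {"off": math.log(0.9), "on": math.log(0.1)},
             "on": {"off": math.log(0.1), "on": math.log(0.9)}}

readings = [2.0, 1.0, 118.0, 65.0, 121.0, 3.0]
path = viterbi(readings, states, log_start, log_trans, emission_params)
print(path)
```

The ambiguous 65 W reading is decoded as "on" because both the emission model and the sticky transitions favour staying in the same state. The fixed per-step transition probabilities also imply geometrically distributed state durations, which is the limitation the hidden semi-Markov model addresses.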
Wednesday, 22 May 2013
AAAI 2012 Code Release
A while ago, I wrote a post stating that I was planning to release my NIALM code at the end of my PhD. I also mentioned in the post that I'd been happily giving out an archive of my code upon request. Since then, I've had far more requests than I'd expected, as well as quite a few technical questions regarding how to run it. As a result, I've decided to make my code from my AAAI 2012 paper available via my github for anyone to clone or contribute to.
The reason why I hadn't previously uploaded my code is that I simply do not have time to provide documentation or tutorials for using my code. Therefore, my code is provided "as is", so apologies in advance if you don't find it easy to use!
Update 15.09.2015: updated link to point to github
Friday, 3 May 2013
Trip to Minnesota, New York and North Carolina
I'll be visiting the US states of Minnesota, New York and North Carolina in the coming weeks, so please give me a shout if you're nearby and would like to talk disaggregation!
Thursday, 25 April 2013
DECC meeting on disaggregating UK smart meter data
Last week I attended an expert panel meeting for the Department of Energy and Climate Change to discuss how smart meter data could be used to better understand household energy use. The meeting was organised by Cambridge Architectural Research Ltd, and brought together a wide range of stakeholders from government, industry and academia. Among the many potential projects and barriers which were discussed, I've categorised what I believe to be the important facts:
Data availability
UK smart meters will only automatically upload 30 min data for billing purposes. However, 10 second data will also be available to Consumer Access Devices (CAD), via short range WiFi. This creates two possibilities for disaggregation from 10 second data:
- A disaggregation system could be installed in each household as a CAD
- A CAD could upload data to cloud storage via the home broadband connection
The second option seems the more realistic to me, given the intrinsic opt-in nature of disaggregation and the benefits of performing disaggregation in the cloud.
Data granularity
The most recent smart meter specification (SMETS v2 2013) states that only 10 second apparent power data will be available to CADs. However, it would theoretically be possible to increase the reporting rate up to 1 second data through a smart meter firmware upgrade. This rate of 1 report per second is the theoretical maximum rate of smart meters as a result of hardware limitations. Furthermore, current, voltage, harmonics, reactive power etc. will not be reported by smart meters at any sub-10 second rate.
Appliance database
Another topic discussed was the potential for a UK appliance database, similar to the Tracebase database, or a disaggregation test set, similar to the REDD data set. One potential source of data is the Powering the Nation database, which DECC/DEFRA plan to release in the near future. The study collected data from 250 homes which were monitored for either 1 month or 1 year to investigate domestic energy consumption habits.
Friday, 19 April 2013
New data set released by Pecan Street Research Institute
Pecan Street Research Institute recently announced the release of a new data set designed specifically to enable the evaluation of electricity disaggregation technology. A free sample data set is available to members of its research consortium, which has now been opened up to university researchers. The sample data set contains 7 days of data from 10 houses in Austin, TX, USA, for which both aggregate and circuit-level power readings are available at 1 minute intervals. In addition to common household loads, 2 of the houses also have photovoltaic systems and 1 house also has an electric vehicle.
Wednesday, 17 April 2013
NIALM helpful terminology
When discussing related research, most papers group existing disaggregation approaches into distinct categories. As a result, many taxonomies have emerged, and unfortunately they are not always well defined before they are used. I therefore decided to compile the following list of the terminology which I've seen in recent years:
Intrusive vs non-intrusive monitoring
- Intrusive metering refers to the deployment of one monitor per appliance. This is clearly intrusive since it requires access to each appliance to install such equipment. This has the benefit that the only uncertainty in such monitoring is due to inaccuracies in the metering hardware.
- Non-intrusive metering refers to the deployment of one (or sometimes two) meters per household. This is clearly less intrusive, since it does not cause any inconvenience beyond the installation of government mandated smart meters. However, this has the disadvantage that the disaggregation process is likely to introduce further inaccuracies.
Supervised vs unsupervised training
- Supervised training (a.k.a manual setup) refers to performing disaggregation with the aid of labelled appliance data, generally from the same home in which disaggregation is performed. The training data normally consists of sub-metered appliance power data, or a phase in which appliances are turned on one by one, and labelled manually.
- Unsupervised training (a.k.a automatic setup) refers to performing disaggregation without any training data from the household in which disaggregation is being performed. However, without any notion of what appliances exist or how they behave, at best a system can only identify distinct appliances (e.g. appliance 1, appliance 2), and cannot label them with an appliance name (e.g. refrigerator or washing machine).
Event-based vs non event-based disaggregation
- Event-based disaggregation refers to methods which have distinct event detection (e.g. something switched on at 12pm) and event classification (e.g. it was the washing machine). These approaches are often identifiable by a sequential pipeline of algorithms (data collection -> smoothing -> edge detection -> classification). A core advantage of event-based approaches is that decisions are made sequentially, and therefore can easily be deployed as a real-time system.
- Non event-based disaggregation refers to methods which combine event detection and classification into a single process, in which both are inferred simultaneously. These are often identifiable by their use of time series models, which are able to reason over a sequence of data. The advantage of non event-based approaches is that high confidence decisions can inform the decisions that surround them (e.g. a refrigerator is likely to turn on 30 minutes after its last cycle ended).
High frequency vs low frequency sampling
- High frequency sampling generally refers to meters which sample the current and voltage of a wire at a rate in the order of thousands of times per second (kHz). At this rate, information such as reactive power and current harmonics can be calculated, which are useful features for classification. However, few smart meters are likely to report data at this granularity.
- Low frequency sampling generally refers to meters which sample at between once per second and once per hour. When reported at this rate, active power is indistinguishable from reactive power, and no harmonic content is available. This is the reporting rate of most smart meters.
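To make the high frequency case concrete, here is a minimal sketch of how active and reactive power can be derived from raw voltage and current samples. The 50 Hz mains frequency, the 5 kHz sampling rate, the synthetic 230 V / 5 A load with a 60 degree lag and all function names are assumptions chosen for illustration, and the reactive power formula only holds for purely sinusoidal waveforms.

```python
import math

MAINS_HZ = 50.0      # assumed UK mains frequency
SAMPLE_HZ = 5000.0   # assumed sampling rate, i.e. 100 samples per mains cycle
N_SAMPLES = 1000     # exactly 10 mains cycles


def synthesise(v_rms, i_rms, lag_degrees):
    """Return sampled sinusoidal voltage and current, with the current lagging."""
    lag = math.radians(lag_degrees)
    voltage, current = [], []
    for n in range(N_SAMPLES):
        angle = 2 * math.pi * MAINS_HZ * n / SAMPLE_HZ
        voltage.append(math.sqrt(2) * v_rms * math.sin(angle))
        current.append(math.sqrt(2) * i_rms * math.sin(angle - lag))
    return voltage, current


def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def powers(voltage, current):
    """Active (W), apparent (VA) and reactive (var) power over whole cycles."""
    active = sum(v * i for v, i in zip(voltage, current)) / len(voltage)
    apparent = rms(voltage) * rms(current)
    # Valid for sinusoidal waveforms only; harmonics would need extra terms.
    reactive = math.sqrt(max(apparent ** 2 - active ** 2, 0.0))
    return active, apparent, reactive


voltage, current = synthesise(v_rms=230.0, i_rms=5.0, lag_degrees=60.0)
active, apparent, reactive = powers(voltage, current)
print(round(active), round(apparent), round(reactive))
```

For this synthetic load the active power comes out at about 575 W and the reactive power at roughly 996 var. A meter that only reports one averaged power reading per second or slower has already discarded the waveform needed for this calculation.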
Steady-state vs transient-state analysis
- Steady-state analysis divides a power series into periods of constant power during which no appliances change state. The differences between these levels of constant power are then used to infer which state change(s) had taken place.
- Transient-state analysis uses the patterns between steady states to classify appliance state changes. However, it is necessary to sample at a high frequency in order to extract transient features for most appliances.
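As a rough illustration of steady-state analysis, the sketch below splits a low frequency power series into runs of near-constant power and reports the step changes between them. The tolerance, the minimum run length, the function names and the example readings are all hypothetical choices made for this example, not values taken from any particular paper.

```python
def steady_states(power, tolerance=15.0, min_length=3):
    """Split a power series (watts) into runs of near-constant power.

    Returns a list of (start_index, end_index_exclusive, mean_power).
    """
    segments = []
    start = 0
    for i in range(1, len(power) + 1):
        # Close the current run at the end of the data, or when the reading
        # jumps by more than `tolerance` watts from the previous one.
        if i == len(power) or abs(power[i] - power[i - 1]) > tolerance:
            if i - start >= min_length:
                run = power[start:i]
                segments.append((start, i, sum(run) / len(run)))
            start = i
    return segments


def step_changes(segments):
    """Differences in mean power between consecutive steady states."""
    return [round(b[2] - a[2], 1) for a, b in zip(segments, segments[1:])]


# Hypothetical aggregate readings: a ~100 W baseload, a ~2000 W appliance
# switching on, a ~150 W appliance switching on, then both switching off.
aggregate = [100, 102, 99, 101, 2100, 2103, 2098, 2250, 2252, 2249,
             251, 250, 252, 100, 101, 99]
segments = steady_states(aggregate)
changes = step_changes(segments)
print(changes)
```

The step changes (roughly +2000 W, +150 W, -2000 W and -150 W here) are the features that a steady-state approach would then try to match against known appliance state changes.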
As always, please leave a comment if you come across any terminology you think should be in this list.
Thursday, 11 April 2013
Is REDD representative of actual electricity data?
Since its release in 2011, the REDD data set has revolutionised energy disaggregation research. To date, it's been cited 33 times in under two years, and has quickly become the standard data set upon which new approaches are benchmarked. As such, it's now hard to believe any results which are tested on simulated data or a proprietary data set instead.
However, I've often heard people mention the data is unrepresentative. Here are some of the reasons:
- Location
- Since the data was collected only from households in the Massachusetts area, there's very little variation due to location. As a result, it's a little hard to justify that results based on the REDD data set generalise to other countries or continents. In fact, I even wrote a blog post on how different the disaggregation problem is between the US and the UK.
- Accuracy
- The REDD data set contains a huge amount of data describing the household aggregate power demand. Even with the low resolution data, in which the power is down-sampled to one reading per second, the accuracy is still far greater than that of many off-the-shelf electricity monitors. Clearly high accuracy isn't a bad thing, but I doubt many researchers will voluntarily add noise to this data.
- Reliability
- A common problem with off-the-shelf electricity monitors is their unreliability. Building a disaggregation system that is robust to such random missing readings can be a real challenge. Although this is not a characteristic the REDD data set suffers from, it does contain some long gaps, as I reported in a post on REDD statistics.
- Circuits
- Despite the description in the SustKDD paper, the data set only contains household and circuit level recordings, and not plug level data. As a result, it's only possible to test the disaggregation of appliances which appear on their own on a circuit, and are labelled as such.
Despite these issues, REDD is by far the most comprehensive data set for evaluating non-intrusive load monitoring systems. However, there must be other data sets that can offer insight for energy disaggregation systems, even if they only contain aggregate data. If you know of any such data sets, I'd be very interested to hear from you!
Friday, 5 April 2013
Why NIALM shouldn't be modelled as the knapsack or subset-sum problem
In his seminal work, Hart (1992) highlighted the similarities between the appliance disaggregation problem and some well-studied combinatorial optimisation problems. He stated that if the power demand of each appliance is known, the disaggregation problem can be modelled as an instance of the subset sum problem. This approach selects the set of appliances whose power demands sum to the household aggregate power, applied independently to each aggregate power reading. However, Hart identified a core problem of this approach:
- Small fluctuations in the aggregate power lead to solutions with unrealistically high numbers of appliance switch events.
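The problem is easy to reproduce with a toy example. The sketch below is a brute-force illustration of the per-reading subset sum formulation, not Hart's implementation; the appliance names, their power demands and the aggregate readings are hypothetical.

```python
from itertools import combinations

# Hypothetical appliance power demands in watts, assumed to be known.
APPLIANCES = {"fridge": 100, "lamp": 60, "tv": 110, "kettle": 2000}


def best_subset(reading):
    """Brute-force subset sum: the appliance set whose total is closest to the reading."""
    names = sorted(APPLIANCES)
    best, best_error = frozenset(), abs(reading)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            error = abs(reading - sum(APPLIANCES[n] for n in combo))
            if error < best_error:
                best, best_error = frozenset(combo), error
    return best


def switch_events(readings):
    """Count appliance on/off changes implied by solving each reading independently."""
    states = [best_subset(r) for r in readings]
    return sum(len(a ^ b) for a, b in zip(states, states[1:]))


# The kettle and fridge are on throughout (2100 W); only the noise changes.
readings = [2100, 2107, 2099, 2075, 2101]
print([sorted(best_subset(r)) for r in readings])
print(switch_events(readings))
```

Although no appliance actually changes state, the independent per-reading solutions swap the fridge for the TV or the lamp whenever the noise happens to favour them, so the sketch reports 8 switch events for these five readings.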
More recently Egarter et al. (2013) extended this model, stating that if both the power demand and duration of usage of each appliance are known, the disaggregation problem can be modelled as a two-dimensional knapsack problem. This approach is similar to Hart's proposed method, with the exception that each aggregate measurement is not considered as an individual optimisation problem. Instead, appliance operations must persist for the exact given duration. However, I see two major problems with this approach:
- Optimal solutions might give unrealistic results. As the baseload power increases, I'd expect the number of solutions (those that produce a small difference between the sum of appliance power and the aggregate power) to increase rapidly. As a result, I seriously doubt that the actual optimal solution would give the best disaggregation accuracy.
- Appliance durations are often highly variable. Unlike power demand, their duration of usage is often determined by the user. However, extending this formalism such that appliances could have a variable duration would hugely increase the size of the solution space.
Although these approaches show promise in simple scenarios, I don't believe they capture the flexibility required by the energy disaggregation scenario. Instead, I believe probabilistic approaches are more appropriate, given their ability to reason over different solutions by estimating the likelihood of different event sequences.
Tuesday, 2 April 2013
Paper accepted at IJCAI on Biodiversity Monitoring
We recently had a paper accepted at the International Joint Conference on Artificial Intelligence 2013 based on the detection of the New Forest Cicada from audio recordings.
Abstract:
Automated acoustic recognition of species aims to provide a cost-effective method for biodiversity monitoring. This is particularly appealing for detecting endangered animals with a distinctive call, such as the New Forest cicada. To this end, we pursue a crowdsourcing approach, whereby the millions of visitors to the New Forest will help to monitor the presence of this cicada by means of a smartphone app that can detect its mating call. However, current systems for acoustic insect classification are aimed at batch processing and not suited to a real-time approach as required by this system, because they are too computationally expensive and not robust to environmental noise. To address this shortcoming we propose a novel insect detection algorithm based on a hidden Markov model to which we feed as a single feature vector the ratio of two key frequencies extracted through the Goertzel algorithm. Our results show that this novel approach, compared to the state of the art for batch insect classification, is much more robust to noise while also reducing the computational cost.
Reference:
Davide Zilli, Oliver Parson, Geoff V Merrett, Alex Rogers. A Hidden Markov Model-Based Acoustic Cicada Detector for Crowdsourced Smartphone Biodiversity Monitoring. In: 23rd International Joint Conference on Artificial Intelligence. Beijing, China. 2013.
Friday, 29 March 2013
Releasing code used in academic publications
In academia, it's good scientific practice to release code after producing a publication. This allows other researchers to replicate the publication's findings, benchmark against the published approach, and even extend the work in new directions. However, it's actually quite rare to find a given paper's code online, and even more so to find the documentation required to make use of the code. Although having said this, many academics are happy to provide their code when requested, especially if they're no longer pursuing that field of research.
Unfortunately, I seem to have fallen into the habit of providing my code upon request, instead of releasing it online. I guess the reason for this is that the task of releasing code always seems to be superseded by other upcoming deadlines, and consequently remains at the bottom of my to do list. As a result, I end up giving out an undocumented archive of my code, which I can't believe is particularly useful to many people.
Therefore, I've decided to release all my code at the end of my PhD. I'm hoping that after submitting my thesis, I'll have a window of time to tidy up these kinds of loose ends. However, if I get to the end of the year (2013) without releasing anything, please remind me of this post!
Monday, 25 March 2013
Is energy disaggregation a solved problem?
Every week I seem to read about a new start-up company, or a new academic paper, or a new patent, which claims to solve the problem of energy disaggregation. This is probably what makes it so exciting to be working in this field, given its recent explosion in size and attraction of much funding across the world. However, as a lonely PhD student, it's also a little intimidating to try to make an impact in this increasingly crowded field. I therefore decided to write this post which will hopefully persuade others that there are still a lot more problems just waiting to be tackled.
The best way to decide whether the problem has been solved is clearly to first unravel what we mean by the term solved. The two most important factors in my opinion have got to be the scenario and the accuracy, which I've picked apart below.
Scenario
The most compelling scenarios are the ones in which the software does as much of the hard work as possible, allowing the hardware to remain simple and inexpensive. Furthermore, the installation process should also be free from any dedicated training phase in which the temporary control of appliances is required or additional monitoring equipment is deployed. This style of approach clearly affords the maximum scalability, which is essential if disaggregated data is ever to reach the masses. While approaches that don't address this scenario are still being proposed, there's clearly quite a way to go.
Accuracy
It's very easy to get carried away when reading in papers how one approach has achieved X% accuracy, while another achieved Y% accuracy. However, I'm fairly sure that aiming for 100% disaggregation accuracy is not only unrealistic, but also unnecessary. I think it's far more important that energy disaggregation is used as a platform from which personalised, actionable, energy-saving suggestions can be derived. Therefore, disaggregation only needs to be accurate enough to convincingly determine the bigger picture, and any effort beyond this is likely to be a waste of time.
On a final note, even if there exists an approach that's managed to address the above scenario to a suitable level of accuracy, this is still only one piece of the disaggregation puzzle. The diversity of electricity consumption means that different countries and buildings require different approaches, and there will always be a niche sub-area of disaggregation just waiting to be found.
Wednesday, 13 March 2013
NIALM companies in France
I've recently come across two (fairly) new French companies working on energy disaggregation, so I thought I'd share a short summary of each. I've also updated my post on NIALM in industry.
Fludia, Paris, France
Fludia is a French company specialising in energy management, who aim to provide their customers with technology to increase the energy efficiency of their homes. They have developed a device to retrofit non-smart meters, called Fludiameter, such that 1 minute resolution energy data can be collected without installing a whole new meter. Fludia also provide a tool to break down electricity consumption into its end uses, called Beluso, which makes use of household aggregate data and also information entered from a household survey.
Wattseeker, Nice, France
Wattseeker offer a datalogger, which includes a number of current clamps and the ability to upload data via 3G, Ethernet or Wi-Fi. The device presumably samples the current and voltage at a kHz rate, since real and reactive power are reported, along with harmonics etc. However, the installation does require a short shut down of the building's power. Their disaggregation system, LYNX, then disaggregates the electricity consumption to provide actionable energy saving suggestions. Their website indicates that each current clamp can disaggregate up to 12 appliances, with an accuracy of +/- 2%.
Thanks to Jack Kelly for the link to Fludia!
Sunday, 10 March 2013
Differences between disaggregation in the UK and US
I was asked recently what I thought the differences were between disaggregation in UK households in comparison to those in the US. My first reaction was that a UK household is just a simplification of a US household, but I've been thinking it over since and come up with quite a few key differences. This is my attempt to categorise them:
Load diversity
There are quite a few differences in the loads that can be found in American households compared to their British counterparts. The most significant is the presence of heating, ventilation and air-conditioning (HVAC) systems in American households. In comparison, UK households are far more likely to be heated by gas, and not require air-conditioning at all. Since the HVAC system often constitutes the largest electrical load in American households, the problem of disaggregating the remaining appliances is clearly simpler for UK households. Furthermore, in my experience, American households contain not only a wider range of loads (pool pumps, etc.) but also more duplicate appliances (2 or 3 fridge/freezers, etc.). This again contributes to a harder disaggregation problem for US households.
Split-phase power
American and British households also vary in the way that electricity is supplied to the properties. UK households typically receive single-phase (1 electricity input wire) power at 230 volts, while US households receive split-phase (2 electricity input wires) power at 120 volts. The case of split-phase power provides a convenient opportunity to install two current clamps instead of one. Through this small additional installation cost, the complexity of the disaggregation is more than halved, since most appliances are connected to only one of the input cables.
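A quick back-of-the-envelope count shows why the saving can be much larger than a factor of two. The sketch assumes every appliance is a simple on/off load attached to exactly one input wire, which ignores 240 volt appliances that span both wires; the function name and the counts of 10 and 5 appliances are purely illustrative.

```python
def search_space(appliances_per_wire):
    """Total on/off combinations to consider when each wire is solved separately."""
    return sum(2 ** n for n in appliances_per_wire)


single_clamp = search_space([10])    # all 10 appliances seen through one clamp
two_clamps = search_space([5, 5])    # the same appliances split evenly over two wires
print(single_clamp, two_clamps)
```

With these assumptions a single clamp has to distinguish 1024 combinations, while two clamps face 32 combinations each, or 64 in total.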
Smart meters
To the best of my knowledge, the capabilities of smart meters in both countries are yet to be finalised. However, the UK provides an interesting extension to the government mandated smart meter rollout, in which households will also receive an in-home display (IHD). IHDs will primarily provide a household's occupants with real-time information, such as their current power demand and cost of electricity. However, since these devices are likely to have access to electricity data at a higher granularity than is transmitted to the energy provider, they provide a convenient platform on which disaggregation can be performed. Furthermore, performing disaggregation on higher granularity data within each household even circumvents any privacy concerns related to transmitting private data outside of the home. However, IHDs will clearly have limited resources in terms of processing power etc., which raises the interesting field of resource constrained energy disaggregation.
Tuesday, 5 March 2013
Postdoc position through Doctoral Prize award
Recently I've been thinking a lot about my career beyond my PhD, for which the funding runs out at the end of this year. Having experienced energy disaggregation from both sides of the academia/industry fence, I've learned a lot about both getting algorithms to scale to huge amounts of data and presenting my work at international venues. However, as I entered the final year of my PhD, I was enjoying my PhD so much that I decided to apply for the Doctoral Prize award; a grant for a one year postdoc position designed specifically to increase the impact of technology developed during a PhD.
Today I'm excited to announce that my application has been accepted, and assuming that all goes to plan while writing up my thesis, I'll start work as a postdoc in the new year. I'm hoping this position will give me the opportunity to put my PhD work into the hands of real users and show that energy disaggregation really can provide actionable suggestions and lead to real energy savings. I'm excited about the new challenges this will bring and the insight it will provide to this growing community.
Of course this also means that I'll continue to blog about energy disaggregation for at least another year.
Tuesday, 19 February 2013
NIALM and limited sub-metering
I've read a couple of papers recently about using minimal sub-metering to enhance the accuracy of NIALM. They are:
- Yongcai Wang, Xiaohong Hao, Lei Song, Chenye Wu, Yuexuan Wang, Changjian Hu, and Lu Yu. Tracking states of massive electrical appliances by lightweight metering and sequence decoding. In Proceedings of the Sixth International Workshop on Knowledge Discovery from Sensor Data (SensorKDD '12). ACM, New York, NY, USA, 34-42, 2012.
- Nambi, S. N., Akshay, U., Papaioannou, Thanasis G., Chakraborty, Dipanjan, Aberer, Karl. Sustainable Energy Consumption Monitoring in Residential Settings. 2nd IEEE INFOCOM Workshop on Communications and Control for Smart Energy Systems, Turin, Italy, April 14-19, 2013.
However, I can't help but feel that any intrusion beyond the installation of a household-level monitor prevents the scalability of this technology. I accept that 100% disaggregation accuracy for a previously unseen household with a single aggregate meter is likely to be impossible. However, I find software solutions far more compelling than hardware solutions, due to their ability to scale to any data set containing a single stream of aggregate readings. Furthermore, I think the state of the art in unsupervised NIALM is still far from the best that can be achieved, and there are still many algorithmic improvements to be made.
Tuesday, 12 February 2013
Residential vs Commercial/Industrial NIALM
I was recently asked how energy disaggregation in the residential sector compares to the commercial and industrial sectors, since the majority of academic research focuses on domestic energy consumption. This got me thinking about the major differences, and why academic research focuses on disaggregating household appliances, so I decided to collect my thoughts here.
Total energy consumption
As a general rule of thumb, I think of a commercial premises (e.g. a shop, office or restaurant) as consuming roughly an order of magnitude more energy than an average household. Clearly there's a huge variability here, but I think it's a fair assumption given that, in general, there are more people spending more time in these premises. I think industrial premises are bound to consume even more energy than commercial premises, given that a single type of machinery could quickly consume more energy than a household or shop could in a day. This would surely make energy disaggregation more attractive in these settings, given the greater potential for energy savings.
Variation of loads
I'd expect commercial and industrial premises to contain many more electrical loads compared to a household. As a result, it's likely that more loads will be run simultaneously, and also more duplicate appliances will exist within a single premises. This clearly increases the complexity of the disaggregation problem. However, I'd also expect these types of premises to show a higher correlation over daily and weekly cycles, and further similarities are likely to exist between chains of commercial premises. These differences clearly change the focus of the disaggregation problem, and would likely require different approaches to feature selection and parameter learning.
Cost/benefit ratio
Given the greater potential for energy savings, the scenario becomes quite different to that of household energy disaggregation. There's clearly an argument for greater investment in monitoring equipment, such as higher frequency sampling or sub-metering. It might even be financially viable to monitor each load individually in an industrial environment, since the cost of the energy monitor is much lower in comparison to the size of the load. This raises the question of whether NIALM is even necessary in such situations.
Motivation and access
In my opinion, the domestic scenario is made far more compelling by the smart meter deployments mandated by many governments around the world. This means that household electricity data will soon be available at huge scales, and therefore there will be huge potential for advances in disaggregation algorithms. Conversely, commercial and industrial premises owners probably care relatively little about the electricity costs, as a result of the small cost to their business compared to other factors. I think these points help to explain why academic research has so far been concentrated around energy disaggregation in residential settings.
Friday, 1 February 2013
NIALM interest worldwide
Without attending an international conference, it can be hard to get an idea for how localised the interest in a technology might be. To this end, I've shared the following map of the visits to this blog over the last two years:
where dark green countries represent the most hits and white countries represent no hits.
The two darkest countries are the United States and the United Kingdom by a long way, which doesn't particularly surprise me. The next band of interest seems to be from western Europe, Canada, Russia, India and Australia. I think this represents the nationalities of most of the people I've met and that of authors of papers I've read fairly well.
I know that colouring countries isn't an ideal representation of interest over a geographical area, but I didn't want this to turn into an exercise in statistics.
Monday, 21 January 2013
NIALM papers at IECON 2012
The 38th Annual Conference of the IEEE Industrial Electronics Society was held in Montréal, Canada, a few months ago and I've only just had the chance to go through the proceedings. There were quite a few papers related to NIALM, so I thought I'd list, link and summarise them here:
- Anderson, J., Sadhanala, A., & Cox, R. (2012). Using smart meters for load monitoring and active power-factor correction. IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society (pp. 4872–4876)
- This paper describes a novel application of NIALM: detecting reactive loads for active power-factor correction. The proposed approach uses a NIALM system to detect large reactive loads (e.g. an air conditioning unit) and issue a control signal to the active power filter. The filter compensates for the load by reducing the reactive power to 0, therefore correcting the aggregate power factor. The NIALM approach utilises steady-state and transient-state features which are extracted from the real power, reactive power and spectral envelopes, which in turn are calculated from high frequency measurements of aggregate current and voltage.
- Anderson, K. D., Berges, M. E., Ocneanu, A., Benitez, D., & Moura, J. M. F. (2012). Event detection for Non Intrusive load monitoring. IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society (pp. 3312–3317)
- This paper focuses on the problem of event detection: detecting the times at which one or more appliances have turned on or off. The authors start by summarising state-of-the-art event detection algorithms and metrics. They then compare the performance of an event detection method based on the generalised likelihood ratio when its parameters are optimised according to a number of different metrics. The authors conclude that a metric based on the total power changes yields the best performance, and attribute its high performance to its weighting of the importance of appliances by their energy consumption.
- Bier, T., Abdeslam, D. O., Merckle, J., & Benyoucef, D. (2012). Smart meter systems detection & classification using artificial neural networks. IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society (pp. 3324–3329)
- This paper describes a method for disaggregating the refrigerator from a household's aggregate load using artificial neural networks. The authors use a similar training method to my AAAI paper, in which the refrigerator model is trained during the overnight period. To detect appliance switch events, the authors adopt the method proposed by Hart with slightly tuned parameters. The performance of the approach is demonstrated using 50 days of data collected from a single household, and the accuracy of turn-on classifications is compared to the method proposed by Hart.
- Du, L., Yang, Y., He, D., Harley, R. G., Habetler, T. G., & Lu, B. (2012). Support vector machine based methods for non-intrusive identification of miscellaneous electric loads. IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society (pp. 4866–4871)
- The contribution of this paper is to the problem of appliance identification (classify the type of a single appliance given its power demand) as opposed to disaggregation (determine the power demand of each appliance type given the aggregate power demand). The authors collect high frequency samples (kHz) of each load's current and voltage, from which a number of features corresponding to electrical characteristics are extracted. The authors propose the use of a hybrid supervised self-organising map and support vector machine approach to classify loads, and show that the hybrid outperforms both approaches individually.
- Sinha, M., Desai, B., & Cox, R. (2012). Using smart meters for diagnostics and model-based control in thermal comfort systems. IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society (pp. 3600–3605)
- The authors contribute a method for the thermal modelling of buildings using a combination of NIALM and additional sensor measurements. Such modelling allows both HVAC problem diagnosis and the formulation of schedules for HVAC systems with variable speed compressors. The approach learns the thermal properties of a building by using the HVAC operating schedule (as determined by the NIALM from smart meter data), the external temperature, the internal temperature and solar irradiance (as measured by individual sensors). The authors test their diagnostic system by removing some coolant from the HVAC system and observing the change in performance, and also show the performance of their predictive control model for variable speed HVAC systems.
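To make the event detection problem from the Anderson et al. summary above a little more concrete, here is a minimal sketch of a detector in the spirit of the generalised likelihood ratio. It is not the authors' algorithm: it assumes Gaussian noise with a known standard deviation, compares the mean power in two adjacent windows of equal length, and flags local peaks in the resulting statistic. The function names, window length, noise level and threshold are all my own illustrative choices.

```python
def glr_mean_shift(power, window=5, sigma=10.0):
    """Score each sample for a change in mean power between adjacent windows.

    Assumes Gaussian noise with known standard deviation sigma (in watts).
    Samples too close to either end of the series keep a score of 0.
    """
    n = len(power)
    stats = [0.0] * n
    for t in range(window, n - window + 1):
        before = sum(power[t - window:t]) / window
        after = sum(power[t:t + window]) / window
        # Log-likelihood ratio of "two different means" against "one shared
        # mean" for two windows of equal length and known variance.
        stats[t] = window * (after - before) ** 2 / (4.0 * sigma ** 2)
    return stats


def detect_events(power, window=5, sigma=10.0, threshold=20.0):
    """Return the sample indices where the statistic peaks above threshold."""
    stats = glr_mean_shift(power, window, sigma)
    events = []
    for t in range(1, len(stats) - 1):
        is_peak = stats[t] >= stats[t - 1] and stats[t] > stats[t + 1]
        if stats[t] > threshold and is_peak:
            events.append(t)
    return events


# Hypothetical aggregate signal: a 1 kW appliance switches on, then off.
signal = [100.0] * 20 + [1100.0] * 20 + [100.0] * 20
print(detect_events(signal))
```

The threshold and window length play the role of the parameters that the paper tunes against different metrics. In this sketch they are fixed by hand, so the detector treats a 50 W step and a 2 kW step as equally important once both exceed the threshold.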
Tuesday, 15 January 2013
2 Papers accepted at AAMAS and IUI on AgentSwitch
As part of the ORCHID project, we recently had two papers accepted at international conferences based on AgentSwitch, a domestic energy recommendation platform. AgentSwitch utilises electricity usage data collected from households over a period of time to provide a range of smart energy-related recommendations on energy tariffs, load detection and usage shifting. I've been responsible for the load detection module of AgentSwitch over the past few months, and am looking forward to improving it over the final year of my PhD. The following two papers provide two different perspectives on the project. The first gives the algorithmic detail and accuracy evaluations of the individual system modules. The second describes a user evaluation of AgentSwitch, which reveals the strengths and weaknesses of the system as an energy-related recommender system.
It seems that papers follow the same rule as British buses: you wait ages for one and then two arrive at the same time.
- Sarvapali Ramchurn, Michael Osborne, Oliver Parson, Sasan Maleki, Talal Rahwan, Trung Dong Huynh, Steve Reece, Muddasser Alam, Joel Fischer, Greg Hines, Enrico Costanza, Luc Moreau, Tom Rodden. AgentSwitch: Towards Smart Energy Tariff Selection. In: 12th International Conference on Autonomous Agents and Multi-Agent Systems. Saint Paul, Minnesota, USA. 2013.
- Joel Fischer, Sarvapali Ramchurn, Michael Osborne, Oliver Parson, Trung Dong Huynh, Muddasser Alam, Nadia Pantidi, Stuart Moran, Kaled Bachour, Steve Reece, Enrico Costanza, Tom Rodden, Nicholas Jennings. Recommending Energy Tariffs and Load Shifting Based on Smart Household Usage Profiling. In: International Conference on Intelligent User Interfaces. Santa Monica, CA, USA. 2013.
Monday, 7 January 2013
Top papers of 2012 for Non-Intrusive Appliance Load Monitoring (NIALM)
A little over a year ago I posted a list of my top 10 papers on non-intrusive appliance load monitoring. It quickly became my blog's most popular post, and also inspired more comments than any other post. Collecting and summarising academic literature is clearly useful to the community, so I decided to collect the papers published during 2012 that I found most useful. There were many more papers published than I have time to describe here, but a more comprehensive list can be found in my online Mendeley reference library. I hope you find this list useful, and as always feel free to leave a comment!
- Armel, K. C., Gupta, A., Shrimali, G., & Albert, A. (2013). Is disaggregation the holy grail of energy efficiency? The case of electricity. Energy Policy, 52, 213–234.
- The authors provide an argument for applying disaggregation algorithms to smart meter data. They focus on many practical problems which are often ignored in academia, such as data availability, transmission capabilities and deployment costs. The authors conclude with a set of recommendations for both disaggregation algorithms and smart meter deployments, which if followed will aid the deployment of such technology at national scales.
- Kolter, J. Z., & Jaakkola, T. (2012). Approximate Inference in Additive Factorial HMMs with Application to Energy Disaggregation. International Conference on Artificial Intelligence and Statistics (pp. 1472–1482). La Palma, Canary Islands.
- This paper describes an algorithm for efficiently disaggregating appliances by modelling the problem as a factorial hidden Markov model. In such a model, sudden increases or decreases in meter measurements are used to identify appliances turning on or off. The authors extend the model to include an additional component which means that not all appliances within the household are required to be modelled. The proposed model and inference algorithm are evaluated using both simulated data and the REDD data set.
- Parson, O., Ghosh, S., Weal, M., & Rogers, A. (2012). Non-intrusive load monitoring using prior models of general appliance types. Twenty-Sixth Conference on Artificial Intelligence (AAAI-12).
- This paper contributes an unsupervised training method for NIALM systems. The approach uses prior appliance models which describe generalisable appliance behaviour (e.g. behaviour of all refrigerators), which are tuned to match a specific appliance instance (e.g. one particular refrigerator) using only aggregate data. The approach is benchmarked against two different training methods: a variant in which the prior models are not tuned, and a variant in which the prior models are tuned using sub-metered data.
- Reinhardt, A., Baumann, P., Burgstahler, D., Hollick, M., Chonov, H., Werner, M., & Steinmetz, R. (2012). On the Accuracy of Appliance Identification Based on Distributed Load Metering Data. 2nd IFIP Conference on Sustainable Internet and ICT for Sustainability (SustainIT).
- The core contribution of this paper is an approach to solving the appliance identification problem. However, I've included it here since I believe the tracebase data set released with this paper is highly relevant to the NIALM community. To the best of my knowledge, the data set contains the largest public collection of appliance power data, as described in my previous blog post. As a result, it provides the potential for households to be simulated by summing arbitrary combinations of actual appliance loads to produce artificial aggregate loads.
- Wang, Y., Hao, X., Song, L., Wu, C., Wang, Y., Hu, C., & Yu, L. (2012). Tracking states of massive electrical appliances by lightweight metering and sequence decoding. Proceedings of the Sixth International Workshop on Knowledge Discovery from Sensor Data (pp. 34–42). New York, NY, USA.
- The authors of this paper address two problems in NIALM. First, they present an algorithm to perform efficient inference in factorial hidden Markov models for appliance disaggregation by forgetting unlikely state transitions. Second, they present an approach to determine the number and positions of additional circuit-level meters so as to ensure a minimum accuracy of disaggregation. They evaluate their approaches using both simulated data and data from Stanford's Powernet data set.
- Zoha, A., Gluhak, A., Imran, M. A., & Rajasegarar, S. (2012). Non-Intrusive Load Monitoring Approaches for Disaggregated Energy Sensing: A Survey. Sensors, 12, 16838–16866.
- This overview paper gives a description of the current state of the art in academia. In addition, a list of accuracy metrics and publicly available data sets is given. The authors highlight some limitations of the field, such as the invasiveness of manual training processes. They conclude with a set of directions to advance the field, including a suggestion to replace the manual construction of appliance databases with unsupervised training methods.
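The idea mentioned in the Reinhardt et al. summary above, simulating a household by summing individual appliance loads, can be sketched in a few lines. This is only an illustration under my own assumptions: the traces are hand-written lists of power readings in watts, already aligned to the same sampling instants, and nothing here reflects the actual tracebase file format. The function name and the optional noise term are hypothetical.

```python
import random


def simulate_aggregate(appliance_traces, noise_std=0.0, seed=0):
    """Sum aligned appliance power traces (watts) into an artificial aggregate.

    appliance_traces maps an appliance name to a list of power readings.
    All traces must have the same length. Optional Gaussian noise stands in
    for measurement error and unmodelled loads.
    """
    traces = list(appliance_traces.values())
    if not traces:
        return []
    length = len(traces[0])
    if any(len(trace) != length for trace in traces):
        raise ValueError("all appliance traces must have the same length")
    rng = random.Random(seed)
    aggregate = []
    for samples in zip(*traces):
        total = float(sum(samples))
        if noise_std > 0:
            total += rng.gauss(0.0, noise_std)
        aggregate.append(total)
    return aggregate


# Hypothetical traces, not taken from any real data set.
household = {
    "refrigerator": [0, 90, 90, 0],
    "kettle": [0, 0, 2000, 2000],
}
print(simulate_aggregate(household))
```

Because the individual traces are kept alongside the aggregate, the ground truth for each appliance is known exactly, which is what makes such simulated households convenient for evaluating disaggregation algorithms.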