Monday, 10 April 2017

Jack's NILM Competition Survey

Jack Kelly recently published the results of a survey which he designed to assess the appetite for a NILM competition. The survey covers a range of topics, from technical questions regarding the sample rate and required features, to practicalities such as where algorithms should run and how often the competition should take place. However, the survey highlights two key issues that make the design of such a competition quite tricky:

Data - As Jack explained in a recent blog post, the collection of a large enough data set is expensive and there is no clear business case for a single organisation to pick up the cost. This is due to the sheer number of sensors per house, duration of data collection and number of houses required. Furthermore, the data set cannot be reused once it has been released, or even once the accuracy of successive runs of an algorithm has been made available.

Requirements - Almost by definition, every disaggregation company and researcher is tackling the problem from a slightly different angle. Some use a unique sensor, while others require a unique training procedure. Even beyond the differences in disaggregation solutions, there's no clear consensus on issues such as where the competition should run or what training data should be provided.

However, all hope is not lost, given Pecan Street's demonstration that collecting sub-metered data at scale is possible, and also the precedent set by Belkin's competition back in 2013. Furthermore, most participants agreed on a few issues, such as 1 Hz active power being a reasonable place to start.

Jack's post-doc has now come to an end, which means he won't continue working towards running such a competition. If you fancy picking up the challenge, I'd encourage you to post on the Google Group and get involved!

Tuesday, 3 January 2017

COOLL dataset released

The COOLL dataset was recently released by researchers at the PRISME laboratory at the University of Orléans. It contains high-frequency data from 12 different types of appliances. Similar to the tracebase and PLAID datasets, multiple instances of each type were measured, and each instance was measured throughout 20 operations. During each controlled operation, current and voltage data were collected at a sample rate of 100 kHz. The dataset is summarised in an academic paper, and can be downloaded from GitHub after filling in a registration form.

Friday, 4 November 2016

Energy Futures Lab talk at Imperial College London

I gave a talk at the Energy Futures Lab at Imperial College London this afternoon, which covered some of the data products which my team at Connected Home is responsible for providing to the rest of the business. Below you can find a summary of my talk:

Smart meters will be installed in 26 million UK homes over the next few years in an effort to achieve the country’s carbon emission reduction targets. Such smart meters will conform to the SMETS2 specification, which allows customers to choose whether to upload daily or half-hourly data to their supplier over the cellular network for billing purposes.

At Centrica Connected Home, we developed the My Energy dashboard for British Gas. This dashboard aims not only to visualise energy consumption, but also to extract meaningful insight from the consumption data. The dashboard offers a comparison of the customer’s consumption against similar homes, and also a monthly breakdown of their consumption into six categories: heating, hot water, lighting, entertainment, cooking and other appliances.

For the similar homes comparison, we rephrased this problem as the following question: “can we predict daily consumption given only the customer’s location and answers to a short survey?” We then built an algorithm to answer this question, and optimised the accuracy of this prediction given the huge dataset of all our customers’ actual consumptions. However, we realised that it was also important to balance single-day accuracy against the day-to-day stability of a single customer’s predicted consumption.
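One way to picture the trade-off between single-day accuracy and day-to-day stability is as a single score that penalises both. The sketch below is purely illustrative and is not the method used in the dashboard: the function name, the use of mean absolute error and the `stability_weight` parameter are all assumptions made for the example.

```python
from typing import Sequence


def accuracy_stability_score(
    actual_kwh: Sequence[float],
    predicted_kwh: Sequence[float],
    stability_weight: float = 0.5,  # hypothetical weighting, not from the talk
) -> float:
    """Lower is better: mean absolute error plus a penalty on day-to-day jumps."""
    n = len(actual_kwh)
    if n != len(predicted_kwh) or n < 2:
        raise ValueError("need two equal-length series covering at least 2 days")

    # Single-day accuracy: average absolute error across all days.
    error = sum(abs(a - p) for a, p in zip(actual_kwh, predicted_kwh)) / n

    # Stability: average absolute change in the prediction between adjacent days.
    jitter = sum(
        abs(predicted_kwh[i] - predicted_kwh[i - 1]) for i in range(1, n)
    ) / (n - 1)

    return error + stability_weight * jitter
```

With a weight of zero the score reduces to plain accuracy, while a larger weight favours predictions that change little from one day to the next, even at the cost of some single-day error.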

For the energy breakdown, we again treat the problem as a prediction problem, in which individual blocks of energy are detected from half-hourly data and assigned to one of the six categories based on a range of features, such as magnitude, duration and time of day. We then optimised the accuracy of the algorithm using data collected from the Household Electricity Survey, which measured the consumption of individual appliances in addition to the total household’s consumption.
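To make the idea of detecting blocks and assigning them to categories concrete, here is a minimal sketch. It is not the algorithm behind the dashboard: the fixed baseline, the `Block` type and every threshold in `assign_category` are hypothetical values chosen for illustration, and only three of the six categories are covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    start_slot: int      # first half-hour slot of the block (0 = 00:00-00:30)
    duration_slots: int  # number of consecutive half-hour slots
    energy_kwh: float    # energy above the baseline, summed over the block


def detect_blocks(readings_kwh: List[float], baseline_kwh: float) -> List[Block]:
    """Group consecutive half-hourly readings above a baseline into blocks."""
    blocks: List[Block] = []
    start = None
    energy = 0.0
    for i, value in enumerate(readings_kwh):
        if value > baseline_kwh:
            if start is None:
                start, energy = i, 0.0
            energy += value - baseline_kwh
        elif start is not None:
            blocks.append(Block(start, i - start, energy))
            start = None
    if start is not None:  # block still open at the end of the series
        blocks.append(Block(start, len(readings_kwh) - start, energy))
    return blocks


def assign_category(block: Block) -> str:
    """Toy rules using magnitude, duration and time of day (all hypothetical)."""
    hour = (block.start_slot % 48) / 2
    mean_kwh = block.energy_kwh / block.duration_slots
    if 17 <= hour < 20 and block.duration_slots <= 2 and mean_kwh >= 0.5:
        return "cooking"
    if hour >= 18 and mean_kwh < 0.2:
        return "lighting"
    return "other appliances"
```

In practice, hand-written rules like these would be replaced by a model fitted to sub-metered data, such as the Household Electricity Survey mentioned above.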

In addition to My Energy, Connected Home is probably best known for developing the Hive ecosystem of products, including Active Heating, Lighting, Motion Sensors, Door & Window Sensors, Smart Plugs and Boiler IQ. I’ve chosen to focus on Hive Active Heating in the rest of this talk, given that it’s the product that I’ve spent most time working on.

Hive Active Heating is a connected thermostat that allows customers to control their heating from their phone. However, Hive doesn’t instrument the boiler directly, but instead sends control signals to the boiler based on the ambient temperature of the home and the customer’s desired temperature. We’ve recently been experimenting with the possibility of detecting boiler failure from this limited set of features. Such a failure might consist of Hive asking the boiler to heat the home, followed by a decrease in the ambient temperature (rather than the expected increase in temperature). While this algorithm is still in its early stages of development, it illustrates a clear possibility to turn a connected product into a truly smart device.
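The failure pattern described above (heat requested, yet the ambient temperature falls) can be sketched as a simple sliding-window check. This is only an illustration under stated assumptions, not the algorithm under development: the function name, the `window` length and the `min_drop_c` threshold are all hypothetical, and the inputs are assumed to be evenly sampled.

```python
from typing import List


def find_suspected_failures(
    heat_demand: List[bool],      # True when the thermostat is calling for heat
    ambient_temp_c: List[float],  # ambient temperature at the same timestamps
    window: int = 4,              # hypothetical: samples of sustained demand
    min_drop_c: float = 0.5,      # hypothetical: temperature drop that raises a flag
) -> List[int]:
    """Return start indices of windows where heat was demanded throughout
    but the ambient temperature fell by at least `min_drop_c`."""
    if len(heat_demand) != len(ambient_temp_c):
        raise ValueError("inputs must have equal length")

    flagged = []
    for start in range(len(heat_demand) - window + 1):
        end = start + window - 1
        demanded_throughout = all(heat_demand[start : end + 1])
        drop = ambient_temp_c[start] - ambient_temp_c[end]
        if demanded_throughout and drop >= min_drop_c:
            flagged.append(start)
    return flagged
```

A real detector would also need to rule out other explanations for a falling temperature, such as an open window or a very cold day, which this sketch ignores.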

In conclusion, I believe that smart meters offer huge potential to give customers insight into their energy consumption. Furthermore, I see real potential in the Internet of Things market, not only in connecting everyday appliances to the Internet, but also by enabling the insight and automated control which transforms them into smart appliances.

Monday, 31 October 2016

Machine learning & the connected home: useful data or meaningless noise?

Later this week I'll be giving a talk at the Energy Futures Lab on the recent work we've been doing around the data collected from smart meters and the Hive ecosystem. Here are the important details:

  • When? - Friday 4th November 2016 at 13:00
  • Where? - Room 116, South Kensington Campus, Imperial College London
  • Registration - Free, via EventBrite


The ongoing national rollout aims to deploy one smart meter to every home in the country by the end of 2020. Furthermore, the market for connected household devices, such as automatic and remote controlled heating and lighting, continues to grow each year. As a result, more and more data is being collected about how we live our lives. This talk explores how machine learning can be used to extract value from this data to improve our lives, but also highlights the importance of customer consent and keeping this data secure.


The seminar will be held in Room 116 of Electrical and Electronic Engineering (building 16 on the campus map). The room is known as the Energy Futures Lab teaching area (or The Bunker). If you are entering the building from Dalby Court/through the building's main entrance the room is down a flight of stairs, through the double doors on your left hand side, turn right at the end of the corridor and it is the second door on your right.

Thursday, 20 October 2016

EPRI EU NILM 2016 review

"NILM is a challenging problem. But it’s not impossible." This was one of Prof Bin Yang's five concluding messages of his keynote talk at EPRI EU NILM 2016. I think this sums up pretty much every NILM researcher's feelings, but let's face it, there wouldn't be much need to hold such a conference if NILM were an easy problem. In his talk, Bin started with the fundamentals of source separation, and went on to relate energy disaggregation to a number of other separation problems, including image segmentation and speaker diarisation. Our other keynote talk was given by Chris Holmes of EPRI, who described their motivation for co-organising the workshop, as well as their recent work in evaluating a large number of NILM vendors in both North America and Europe.

In addition to the two keynote talks, the workshop featured 25 other talks from NILM vendors, academics and utilities. The workshop was attended by roughly 100 people across the two days, from countries far beyond Europe, including Japan and Korea. While we did our best to live stream the event, bandwidth limitations affected the audio and video quality on the first day, but the stream made a miraculous recovery for the second day. A playlist of (most of the) individual presentations is available on YouTube:

I personally really enjoyed the event, and given the attendance numbers and feedback we've received so far, I think it confirms the demand for NILM-focused events on both sides of the Atlantic. Having said that, I hope to see you all next year, and don't forget to bring your NILM bingo card ;)

Update 21.10.2016: replaced live stream videos with playlist of individual presentations.

Monday, 17 October 2016

EPRI EU NILM 2016 livestream

The stream of day 1 of the EPRI EU NILM workshop is now live:

Update 21.10.2016: A playlist of individual presentations is now available on YouTube:

Unfortunately, the lack of bandwidth at the venue meant only 19 of the 27 talks were of acceptable quality. Apologies to the other 8 speakers!

Tuesday, 27 September 2016

A competition for energy disaggregation algorithms

Cross-posted from Jack Kelly's blog:

Now that I've (finally!) submitted my PhD thesis, I can focus on designing and implementing a competition for energy disaggregation algorithms. EDF Energy have kindly given me post-doc funding from now until the end of December 2016 to work on the NILM competition.

The broad plan is to first consult with the NILM community and create a specification for the NILM competition which works for everyone. Then I plan to implement a web application which can run the NILM competition.

Right now, I'm writing a survey on the design of a competition for energy disaggregation algorithms. The aim of the survey is to systematically collect feedback about the design of the competition. I plan to launch the survey on the morning of Friday (30th September). Before Friday, I'm really eager to hear feedback on the survey itself. For example: is the survey missing any vital questions? Do some questions not provide sufficient options? Do some questions not make sense?!

Please note that, prior to Friday, the aim is to get feedback on the design of the survey itself. So please don't actually submit any answers yet! I'll write another blog post when the survey is ready to accept answers.

It's probably best to provide feedback about the survey in public on the relevant thread on the Energy Disaggregation Google Group. If you want your feedback to be private then, by all means, email me directly!

And please do get in touch if you have feedback on any aspect of the proposed NILM competition.