Monday, 10 April 2017
Jack Kelly recently published the results of a survey which he designed to assess the appetite for a NILM competition. The survey covers a range of topics, from technical questions regarding the sample rate and required features, to practicalities such as where algorithms should run and how often the competition should take place. However, the survey highlights two key issues that make the design of such a competition quite tricky:
Data - As Jack explained in a recent blog post, collecting a large enough data set is expensive, and there is no clear business case for a single organisation to pick up the cost. This is due to the sheer number of sensors per house, the duration of data collection and the number of houses required. Furthermore, the data set cannot be reused once it has been released, or even once the accuracy of successive runs of an algorithm has been made available.
Requirements - Almost by definition, every disaggregation company and researcher is tackling the problem from a slightly different angle. Some use a unique sensor, while others require a unique training procedure. Even beyond the differences in disaggregation solutions, there's no clear consensus on issues such as where the competition should run or what training data should be provided.
However, all hope is not lost, given Pecan Street's demonstration that collecting sub-metered data at scale is possible, and also the precedent set by Belkin's competition back in 2013. Furthermore, most participants agreed on a few issues, such as 1 Hz active power being a reasonable place to start.
Jack's post-doc has now come to an end, which means he won't continue working towards running such a competition. If you fancy picking up the challenge, I'd encourage you to post on the Google group and get involved!
Posted by Oliver Parson at 09:39
Tuesday, 3 January 2017
The COOLL dataset was recently released by researchers at the PRISME laboratory at the University of Orléans. It contains high-frequency data from 12 different types of appliances. Similar to the tracebase and PLAID datasets, multiple instances of each type were measured, and each instance was measured over 20 operations. During each controlled operation, current and voltage data was collected at a sample rate of 100 kHz. The dataset is summarised in an academic paper, and can be downloaded from GitHub after filling in a registration form.
Posted by Oliver Parson at 13:52