Monday 19 December 2011

NIPS - Day 6

This was the final day of the NIPS conference and the day of the workshop on Machine Learning for Sustainability, the workshop I submitted my paper to. It's a bit of a shame it came on the day when everyone was most tired, but I think that's just a reality of post-conference workshops. The workshop covered three main topics: climate modelling, energy and environmental management.

For me, the best part of this workshop was the chance to meet other researchers working on the problem of energy disaggregation. The workshop was organised by Zico Kolter, with an invited talk given by Mario Bergés. Both have published in this area, and their papers have ended up on my reading list.

The morning invited talks were:

  • Mario Bergés - Machine Learning Challenges in Building Energy Management: Energy Disaggregation as an Example
  • Andreas Krause - Dynamic Resource Allocation in Conservation Planning

Before the midday break were the spotlights and poster session. These sessions were my first chance to present my work to an external academic community and receive feedback, which was an invaluable opportunity. I tried my best to balance my time between presenting my poster and discussing the other posters with their authors. I was really impressed with the quality of the accepted papers, and hope this is indicative of the future of the MLSUST workshop.

The afternoon invited talks were:

  • Drew Purves - Enabling Intelligent Management of the Biosphere
  • Claire Monteleoni - Climate Informatics
  • Kevin Swersky - Machine Learning for Hydrology, Water Monitoring and Environmental Sustainability
  • Alex Rogers - Putting the “Smarts” into the Smart Grid: A Grand Challenge for Artificial Intelligence

The concluding session of the workshop was a panel discussion regarding the future of MLSUST. Much of the discussion centred on the naming of the workshop: whether sustainability represented the variety of topics covered, and whether the term encouraged researchers from other fields to submit and attend. Other ideas to promote interest in the workshop were raised, namely well-defined problems, competitions and data sets.

For me, the workshop was a brilliant venue to receive feedback on my PhD work, and for that I owe the organisers thanks.

Friday 16 December 2011

NIPS - Day 5

I woke up on the first day of workshops to a beautiful sunrise in Sierra Nevada. I'm not sure what it is about mountain weather that causes the sky to turn so pink, but I'm definitely not complaining.

The NIPS organisers seem to have set it in stone that each workshop will have a 5-hour break in the middle of the day for sleeping/skiing. As much as the 7.30am workshop starts hurt, I have to say I thoroughly enjoyed the opportunity to ski. I'm not sure that the schedule helped to sync the body clock with Spanish eating times, but I think a lot of people are jet lagged enough for that not to matter.

I attended the workshop on Decision Making with Multiple Imperfect Decision Makers. The two talks I enjoyed the most were:

  • David Leslie - Random Belief Learning
  • Stephen Roberts - Bayesian Combination of Multiple, Imperfect Classifiers

I also spoke to Edwin during the coffee breaks about how crowdsourcing is used to aggregate multiple data sources of differing quality, reliability etc. This prompted the thought that something similar could be used in NIALM to collect large sets of appliance signatures. In my opinion, such a data set would be far more useful for building generalisable appliance models than for evaluating the performance of NIALM approaches.

NIPS - Day 4

Today was the final day of the main track, with only a morning of talks scheduled. Apart from some interesting but less relevant talks about the brain, I thought this one was pretty good:

After the morning sessions was the NIPS organised tour of the Alhambra. Moving over 1000 people around in coaches was always going to be a logistical nightmare, so it was no surprise loading/unloading took longer than planned. The tour of the Alhambra was excellent, and I've put a few photos at the bottom of this post to prove it. After the tour we set off up the mountains towards Sierra Nevada for two days of workshops. As the gradient increased our bus started to struggle, and when the road widened to two lanes the other buses stormed past us. However, the bus did its job, and we arrived in Sierra Nevada not too long after the others.

Wednesday 14 December 2011

NIPS - Day 3

Today was another great day at NIPS, although I think the long days are starting to take their toll. The sessions were pretty cool, with some great invited talks. I even noticed a few crowdsourcing papers making their way into NIPS, which I've already passed on to people in my group who might be interested.

Instead of the standard restaurant lunch, I decided to try to make the most of my last full day in Granada and go on a walk around the old town. There was so much I would have missed had I not strayed far from the conference venue, so I'm really glad I made the effort. I took a few photos on my phone which I've stuck at the bottom of this post.

In the poster session I found one paper that was particularly relevant to what I do:

I think the method used for deciding whether to predict outcomes in financial markets is similar to the way in which our approach ignores observations from other appliances described in our paper. Their method of expanding single states of the Markov process to add more detail was also really interesting. I'll have to read their full paper when I get a little more time.

Tomorrow morning contains the last of the sessions at the main conference, followed by a visit to Granada's famous Alhambra. We then make the bus journey up the mountain to the workshops in Sierra Nevada.

Tuesday 13 December 2011

NIPS - Day 2

Today was the first full day of NIPS, consisting of many spotlight and oral sessions. This has been my first experience of spotlight sessions, and I've been really impressed at how many new techniques I'm introduced to and how many thoughts about my own work it triggers. I have to say I much prefer it to the topic sessions I experienced at IJCAI.

I also learnt today that there are three identical (I think) Machine Learning Summer Schools (MLSS) held in different parts of the world. I attended the European Agent Systems Summer School (EASSS) 2011, and although it was really helpful in understanding other research in my group, it wasn't too relevant to my PhD. I think I'll look into MLSS to see whether it will be beneficial for me to attend it in 2012.

After attending the poster session I've also learnt the following:
  • A0 is the best poster size (at least at NIPS). A1 posters are too small by comparison, while bigger posters require kneeling to read the bottom.
  • People (or at least just myself) are most likely to view your poster if both of the following are true:
    • a) something in your title interests them and
    • b) your abstract confirms this. This might seem obvious, but it's interesting when you consider this along with acceptance likelihood and post publishing usefulness when writing your title and abstract.

Apart from this general conference waffle, I also came across the following interesting posters:

Monday 12 December 2011

NIPS - Day 1

After a long day of travelling yesterday, I arrived in Granada for NIPS2011. NIPS stands for Neural Information Processing Systems, and is widely regarded as the top machine learning conference. I'm here to present my paper, although that's not until the sustainability workshop on Saturday.

Today I attended 3 tutorials:

I really enjoyed each one; the quality of presentation made for some really interesting and accessible tutorials.

I also went for lunch with a bunch of PhD students from the machine learning labs at Oxford, Cambridge and UCL. It was great to meet them and get a feel for what problems they're studying. We ate some interesting paella, but I don't think it was quite good enough to warrant a return trip tomorrow.

After the welcome talk, there was a tapas reception and the first poster session. I spoke to many, many people who were using HMMs or similar graphical models in their work, which might even lead to some extensions to the paper I'm currently writing...

Thursday 1 December 2011

REDD Statistics

To allow the empirical comparison of different approaches to NIALM, a team at MIT built the Reference Energy Disaggregation Data set (REDD):
Described in the paper:
Kolter JZ, Johnson MJ. REDD: A Public Data Set for Energy Disaggregation Research. In: Workshop on Data Mining Applications in Sustainability (SIGKDD). San Diego, CA; 2011.

A few months ago I wrote a post giving some high level details about the data set, and since then, I've spent some time using it to benchmark various approaches. However, I've made many mistakes along the way, which were often due to lack of understanding of the data set. Due to the vastness of the data it's often hard to understand simple results, such as which appliances consume the most energy, or how long each house has been monitored. To tackle such confusion, I set about calculating a bunch of statistics and generating visualisations which I decided to share with the world through this post. For all the statistics below, I used the low frequency data (1 reading per second).


The data set contains 6 houses, for each of which I generated the following statistics:

House  Up Time (days)  Reliability (%)  Average Energy Consumption (kWh/day)
1      18              99.97            9.22
2      14              99.98            5.53
3      16              99.87            10.7
4      19              99.99            8.42
5      3               99.86            14.9
6      10              97.50            11.4

  • Up time - duration for which mains power measurements are available at 1 second intervals
  • Reliability - percentage of readings which are available at 1 second intervals
  • Average energy consumption - the average energy consumed by each house's mains circuits
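
The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact calculation behind the table: the function name `house_statistics`, the `max_gap` threshold and the rule that gaps longer than `max_gap` seconds count as the monitor being down (rather than as missing readings) are all hypothetical choices.

```python
def house_statistics(timestamps, watts, max_gap=60):
    """Return (up_time_days, reliability_percent, avg_kwh_per_day).

    timestamps: sorted Unix times in whole seconds, one per mains reading.
    watts: mains power in watts for each timestamp.
    max_gap: assumed threshold (seconds); longer gaps are treated as the
             monitor being down rather than as missing readings.
    """
    if not timestamps or len(timestamps) != len(watts):
        raise ValueError("need equally sized, non-empty inputs")

    expected = 1  # seconds in which a reading was expected (first one included)
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        # Short gap: the monitor was up, so every second in it was expected.
        # Long gap: treated as an outage, so only the new reading counts.
        expected += gap if gap <= max_gap else 1

    up_time_days = expected / 86400
    reliability = 100.0 * len(timestamps) / expected
    # Mean power (W) sustained for 24 hours, converted from Wh to kWh.
    avg_kwh_per_day = sum(watts) / len(watts) * 24 / 1000
    return up_time_days, reliability, avg_kwh_per_day


# Toy example: one missing second (t=3) and one long outage (t=5 to t=1000).
ts = [0, 1, 2, 4, 5, 1000, 1001]
up, rel, kwh = house_statistics(ts, [1000.0] * len(ts))
print(round(up * 86400), round(rel, 1), round(kwh, 1))  # 8 87.5 24.0
```

With a constant 1 kW load the average comes out at 24 kWh/day, which is a quick sanity check on the unit conversion.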

These statistics were calculated to give an overall view of the data collected for each house. However, I noticed that each house had data collected over different periods. Below is a plot of the up time of each house's mains circuit monitor:


Each house contains between 11 and 26 circuit meters, each of which monitors one of the two phases of the electricity input or one of the house's internal circuits. In my opinion, the importance of disaggregating an appliance varies according to its energy consumption, and therefore it's helpful to know how much energy each appliance type generally consumes when it is present in a house. This led me to generate the following statistics:

Appliance         Number of Houses Present In  Average Energy Consumption (kWh/day)  Percentage of Household Energy
Air conditioning  2                            1.62                                  0.5
Bathroom GFI      5                            0.15                                  5.4
Dishwasher        6                            0.24                                  4.5
Disposal          3                            0.00                                  8.6
Electric heater   3                            0.62                                  1.3
Electronics       3                            1.00                                  9.0
Furnace           3                            1.32                                  8.6
Kitchen outlets   6                            0.68                                  4.5
Lighting          6                            2.02                                  4.5
Microwave         4                            0.33                                  6.5
Miscellaneous     1                            0.03                                  0.0
Outdoor outlets   1                            0.00                                  2.7
Outlets unknown   4                            0.37                                  6.7
Oven              1                            0.27                                  0.0
Refrigerator      5                            1.58                                  5.4
Smoke alarms      2                            0.01                                  11.6
Stove             4                            0.09                                  0.3
Subpanel          1                            1.52                                  2.7
Washer/dryer      6                            0.52                                  4.5

A problem with the average energy consumption and percentage of household energy columns is that although an appliance might be present in a house, it might not be used. This has a major effect on the averages, given that we only have 6 houses.
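
To make this concrete, here is a minimal Python sketch of how a single barely used appliance shifts a six-house average. The function name `mean_when_used` and the 0.01 kWh/day cut-off are hypothetical, chosen only for illustration; the input is the dishwasher row of the per-house figures given in the next section.

```python
from statistics import mean


def mean_when_used(kwh_per_day, min_kwh_per_day=0.0):
    """Average an appliance's consumption over the houses that use it.

    kwh_per_day: one value per house in which the appliance is metered.
    min_kwh_per_day: hypothetical cut-off below which a house is treated
                     as not really using the appliance.
    Returns (number_of_houses_counted, mean), with mean None if no house counts.
    """
    used = [v for v in kwh_per_day if v >= min_kwh_per_day]
    return len(used), (mean(used) if used else None)


# Dishwasher consumption (kWh/day) in houses 1 to 6.
dishwasher = [0.586979, 0.214069, 0.169982, 0.186737, 0.260919, 0.005581]

print(mean_when_used(dishwasher))        # average over all six houses
print(mean_when_used(dishwasher, 0.01))  # ignoring the barely used dishwasher
```

Dropping the one house where the dishwasher is hardly used raises the average from roughly 0.24 to roughly 0.28 kWh/day, which shows how sensitive these figures are with so few houses.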

Houses and Appliances

A more reliable approach is to consider each house individually and examine how much energy each appliance consumes on average.

Average Energy Consumption (kWh/day)

Appliance         House (1 to 6, left to right; houses without a value omitted)
Air conditioning
Bathroom GFI      0.155818 0.406498 0.026811 0.083836 0.094664
Dishwasher        0.586979 0.214069 0.169982 0.186737 0.260919 0.005581
Disposal          0.001954 0.002056
Electric heater   0.002869 1.342452 0.500762
Electronics       0.410446 0.116638
Furnace           0.354939 2.368171 1.248476
Kitchen outlets   1.397504 0.394875 0.348328 1.639216 0.147725 0.181511
Lighting          1.850006 0.636832 2.222016 1.474698 3.235403 2.713484
Microwave         0.504392 0.367756 0.196976
Outdoor outlets
Outlets unknown   0.160546 0.014136 0.295022 1.010813
Refrigerator      1.360893 1.912154 1.126372 1.661399 1.845720
Smoke alarms      0.022473 0.000646
Stove             0.012405 0.034468
Washer/dryer      0.972989 0.051357 1.876719 0.142617 0.006250 0.063892

I know this is mostly just raw statistics, so I'll look into some visualisations of this soon.