## Thursday, 1 December 2011

### REDD Statistics

To allow the empirical comparison of different approaches to NIALM, a team at MIT built the Reference Energy Disaggregation Data set (REDD):
http://redd.csail.mit.edu/
Described in the paper:
Kolter JZ, Johnson MJ. REDD: A Public Data Set for Energy Disaggregation Research. In: Workshop on Data Mining Applications in Sustainability (SIGKDD). San Diego, CA; 2011.

A few months ago I wrote a post giving some high-level details about the data set, and since then I've spent some time using it to benchmark various approaches. However, I've made many mistakes along the way, often because I didn't understand the data set well enough. Because the data set is so large, it's often hard to answer simple questions, such as which appliances consume the most energy or how long each house was monitored. To tackle such confusion, I calculated a set of statistics and generated some visualisations, which I decided to share through this post. For all the statistics below, I used the low frequency data (1 reading per second).

#### Houses

The data set contains 6 houses, for each of which I generated the following statistics:

| House | Up Time (days) | Reliability (%) | Average Energy Consumption (kWh/day) |
| --- | --- | --- | --- |
| 1 | 18 | 99.97 | 9.22 |
| 2 | 14 | 99.98 | 5.53 |
| 3 | 16 | 99.87 | 10.7 |
| 4 | 19 | 99.99 | 8.42 |
| 5 | 3 | 99.86 | 14.9 |
| 6 | 10 | 97.50 | 11.4 |

• Up time - duration for which mains power measurements are available at 1 second intervals
• Reliability - percentage of readings which are available at 1 second intervals
• Average energy consumption - the average energy consumed by each house's mains circuits
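
As a rough illustration of the three definitions above, here is a minimal Python sketch. It is only one plausible reading of them, not necessarily how the figures in the table were produced. The function name `house_statistics`, the `max_gap` parameter, and the input format (a time-sorted list of `(unix_timestamp_seconds, power_watts)` pairs) are all assumptions made for the example.

```python
from typing import List, Tuple

SECONDS_PER_DAY = 86_400
JOULES_PER_KWH = 3_600_000


def house_statistics(readings: List[Tuple[int, float]], max_gap: int = 1):
    """Return (up_time_days, reliability_percent, avg_kwh_per_day).

    readings: (unix_timestamp_seconds, power_watts) pairs, sorted by time.
    max_gap:  largest spacing in seconds that still counts as continuous
              data (an assumption; 1 matches the 1 second mains interval).
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    span_seconds = readings[-1][0] - readings[0][0]
    if span_seconds <= 0:
        raise ValueError("readings must cover a positive time span")

    up_seconds = 0
    energy_joules = 0.0
    # Walk over consecutive pairs and keep only intervals without a gap.
    for (t0, p0), (t1, _) in zip(readings, readings[1:]):
        dt = t1 - t0
        if 0 < dt <= max_gap:
            up_seconds += dt
            energy_joules += p0 * dt  # watts * seconds = joules

    up_days = up_seconds / SECONDS_PER_DAY
    # Assumed definition: share of the monitored span covered by gap-free data.
    reliability = 100.0 * up_seconds / span_seconds
    avg_kwh_per_day = (energy_joules / JOULES_PER_KWH) / up_days if up_seconds else 0.0
    return up_days, reliability, avg_kwh_per_day
```

For example, a constant 1000 W load measured without gaps works out at 24 kWh/day, which is a handy sanity check on the unit conversions.
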
These statistics were calculated to give an overall view of the data collected for each house. However, I noticed that each house had data collected over different periods. Below is a plot of the up time of each house's mains circuit monitor:

#### Appliances

Each house contains between 11 and 26 circuit meters, each of which monitors one of the two phases of electricity input or one of the house's internal circuits. In my opinion, the importance of disaggregating an appliance varies according to its energy consumption, and therefore it's helpful to know how much energy each appliance type generally consumes when it is present in a house. This led me to generate the following statistics:

| Appliance | Number of Houses Present In | Average Energy Consumption (kWh/day) | Percentage of Household Energy |
| --- | --- | --- | --- |
| Air conditioning | 2 | 1.62 | 0.5 |
| Bathroom GFI | 5 | 0.15 | 5.4 |
| Dishwasher | 6 | 0.24 | 4.5 |
| Disposal | 3 | 0.00 | 8.6 |
| Electric heater | 3 | 0.62 | 1.3 |
| Electronics | 3 | 1.00 | 9.0 |
| Furnace | 3 | 1.32 | 8.6 |
| Kitchen outlets | 6 | 0.68 | 4.5 |
| Lighting | 6 | 2.02 | 4.5 |
| Microwave | 4 | 0.33 | 6.5 |
| Miscellaneous | 1 | 0.03 | 0.0 |
| Outdoor outlets | 1 | 0.00 | 2.7 |
| Outlets unknown | 4 | 0.37 | 6.7 |
| Oven | 1 | 0.27 | 0.0 |
| Refrigerator | 5 | 1.58 | 5.4 |
| Smoke alarms | 2 | 0.01 | 11.6 |
| Stove | 4 | 0.09 | 0.3 |
| Subpanel | 1 | 1.52 | 2.7 |
| Washer/dryer | 6 | 0.52 | 4.5 |

A problem with the average energy consumption and percentage of household energy columns is that although an appliance might be present in a house, it might not be used. This has a major effect on the averages, given that we only have 6 houses.
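
To make this concrete, the sketch below recomputes the average for three appliances from the per-house figures listed in the per-house table in this post, once over every house where the appliance is present and once over only the houses that appear to use it. The 0.01 kWh/day cut-off for "unused" and the helper name `summarise` are assumptions made for illustration, not part of the original analysis.

```python
from statistics import mean

# Per-house averages (kWh/day) copied from the per-house table in this post.
per_house_kwh_per_day = {
    "Air conditioning": [0.004023, 3.236165],
    "Electric heater": [0.002869, 1.342452, 0.500762],
    "Furnace": [0.354939, 2.368171, 1.248476],
}

# Hypothetical cut-off: below this, treat the appliance as present but unused.
MIN_USED_KWH_PER_DAY = 0.01


def summarise(values, min_used=MIN_USED_KWH_PER_DAY):
    """Return (houses present in, mean over all houses, mean over houses using it)."""
    used = [v for v in values if v >= min_used]
    mean_used = round(mean(used), 2) if used else None
    return len(values), round(mean(values), 2), mean_used


summary = {name: summarise(v) for name, v in per_house_kwh_per_day.items()}
for name, (n, mean_all, mean_used) in summary.items():
    print(f"{name}: {n} houses, mean {mean_all} kWh/day, "
          f"mean when used {mean_used} kWh/day")
```

The means over all houses come out at 1.62, 0.62 and 1.32 kWh/day, matching the table above. Dropping the houses below the cut-off doubles the air conditioning figure to 3.24 kWh/day, which shows how much a single unused appliance can skew an average over so few houses.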

#### Houses and Appliances

A more reliable approach is to consider each house individually and examine how much energy each appliance consumes on average.

Average Energy Consumption (kWh/day). Values in each row are listed in ascending house order (houses 1 to 6); houses without a figure for that appliance are omitted.

| Appliance | Average energy consumption per house (kWh/day) |
| --- | --- |
| Air conditioning | 0.004023, 3.236165 |
| Bathroom GFI | 0.155818, 0.406498, 0.026811, 0.083836, 0.094664 |
| Dishwasher | 0.586979, 0.214069, 0.169982, 0.186737, 0.260919, 0.005581 |
| Disposal | 0.001954, 0.002056, 0.001381 |
| Electric heater | 0.002869, 1.342452, 0.500762 |
| Electronics | 2.478051, 0.410446, 0.116638 |
| Furnace | 0.354939, 2.368171, 1.248476 |
| Kitchen outlets | 1.397504, 0.394875, 0.348328, 1.639216, 0.147725, 0.181511 |
| Lighting | 1.850006, 0.636832, 2.222016, 1.474698, 3.235403, 2.713484 |
| Microwave | 0.504392, 0.367756, 0.196976, 0.241313 |
| Miscellaneous | 0.028333 |
| Outdoor outlets | 0.000882 |
| Outlets unknown | 0.160546, 0.014136, 0.295022, 1.010813 |
| Oven | 0.266246 |
| Refrigerator | 1.360893, 1.912154, 1.126372, 1.661399, 1.845720 |
| Smoke alarms | 0.022473, 0.000646 |
| Stove | 0.012405, 0.034468, 0.204273, 0.103160 |
| Subpanel | 1.524733 |
| Washer/dryer | 0.972989, 0.051357, 1.876719, 0.142617, 0.006250, 0.063892 |

I know this is mostly just raw statistics, so I'll look into some visualisations of this soon.