## Wednesday, 29 June 2011

### Using a single hidden Markov model to model multiple appliances

Previous posts have shown how a single appliance's states can be determined from its power readings using a hidden Markov model. This post shows how a single hidden Markov model can be used to determine the states of multiple appliances from their aggregate power readings.

Model definition

The combination of multiple latent Markov chains (representing appliance states) and a single observed variable (aggregate power readings) can be modelled as a factorial hidden Markov model:
A factorial hidden Markov model can actually be transformed to an equivalent standard HMM. Such a model would contain a single chain of latent variables, each of which has as many discrete states as the number of combinations of each appliance's states. For instance, for a FHMM with 3 latent chains of 2, 3 and 4 states each, the equivalent HMM will have 1 latent chain of 24 states. The standard HMM is shown below:

Learning

Before any inference can take place, the model parameters for the combined hidden Markov model must be specified:
• pi - the probability of each appliance state in time slice 1
• A - appliance transition probability matrix
• phi - emission density (probability density of observed variable given a hidden variable)
For training purposes, we assume to have training data for each individual appliance. The model parameters for each individual appliance can be learnt using the Expectation-Maximisation algorithm. The model parameters for each individual appliance must then be combined to a single set of model parameters for the combined hidden Markov model. The model parameters for the combined model are calculated:
• pi - the product of each combination of individual initial probability
• A - the product of each combination of transition probability
• phi - the sum of each combination of emission density
Once these model parameters have been calculated, the model is then ready to perform inference.

Inference

The Viterbi algorithm can be used to infer the most probable chain of latent variables in a HMM given a chain of observations. This is different to sequentially classifying variables according to their maximum probability. This is because the sequential classification method is unable to classify a state which is individually sub-optimal, but leads to a sequence of states which are jointly more probable. Because such classifications are not revisited sequential classification does not guarantee to find the jointly optimal sequence of states.

The complexity of the Viterbi algorithm is linear in the length of the Markov chain. However, its complexity is exponential in the number of appliances modelled. This is because the probability of each transition must be evaluated at each time slice. Since the variables correspond to each combination of all the appliance's states, the number of transitions is exponential in the number of appliances. This is acceptable for small numbers of appliances, but would become intractable for households containing many more than 20 appliances.