Simulated houses
Houses with < 10 appliances
Instrumenting a house with appliance sub-meters is intrusive and expensive, so often only a subset of appliances is monitored. In this case, an artificial aggregate is usually calculated by summing the power demand of each monitored appliance, and this sum is then used as the input to the disaggregation algorithm. However, since the difficulty of disaggregation increases with the number of appliances, disaggregating this artificial aggregate is generally much easier than disaggregating the household's true aggregate, which also contains every unmonitored load.
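To make the gap between the two signals concrete, here is a minimal sketch of how an artificial aggregate is built and how much of the true aggregate it leaves out. The function names and the readings (in watts) are made up for illustration; they don't come from any particular data set or toolkit, and the sketch assumes all series are already aligned to the same timestamps.

```python
from typing import Dict, List


def artificial_aggregate(submetered: Dict[str, List[float]]) -> List[float]:
    """Sum the sub-metered appliance readings sample by sample."""
    series = list(submetered.values())
    if not series:
        return []
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all appliance series must have the same length")
    return [sum(samples) for samples in zip(*series)]


def unmetered_residual(true_aggregate: List[float],
                       submetered: Dict[str, List[float]]) -> List[float]:
    """Power in the true aggregate that no sub-meter accounts for."""
    artificial = artificial_aggregate(submetered)
    if len(artificial) != len(true_aggregate):
        raise ValueError("aggregate and sub-meter series must be aligned")
    return [total - part for total, part in zip(true_aggregate, artificial)]


# Hypothetical readings in watts, one value per sample period.
submetered = {
    "fridge": [90, 90, 0, 90],
    "kettle": [0, 2000, 2000, 0],
}
true_aggregate = [250, 2300, 2150, 240]  # whole-house meter (made-up values)

print(artificial_aggregate(submetered))
print(unmetered_residual(true_aggregate, submetered))
```

The residual is everything an algorithm evaluated on the artificial aggregate never has to cope with: unmonitored appliances, standby loads and metering noise.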
Training data
Unfortunately for those of us working on disaggregation algorithms, appliances of the same type can vary considerably from house to house. As a result, a lot of research sidesteps this problem by requiring training data from the house in which disaggregation will be performed. Training data normally comes in the form of sub-metered appliance data or a training phase in which appliances are operated one at a time and manually labelled. However, collecting training data in every house clearly will not scale at the same rate as smart meter deployments.
Known appliance types
Even worse than not knowing which model of appliance is in each house is not knowing which appliance types are present at all. Assuming the types are known is the most reasonable of these four assumptions, since it is conceivable that a household's occupants might be required to enter this information if they're interested in seeing disaggregated data. However, I can't imagine many non-enthusiasts would be willing to (accurately) list all the electric appliances in their home.
I think the first two problems have largely been solved by the release of many public data sets over the past few years. The last two, however, are down to individual researchers, who need to make sure they're studying a realistic scenario. The field of energy disaggregation is growing at an incredible rate, and I think now is the time to tackle the complete problem rather than only studying its individual components.