I suppose that may depend on how exactly all the objects are marked up; for instance, all the objects that make up the dialogue of a film could simply be bundled together as "dialogue" so you control them as one.
Or they could all be separate.
And yes, one of the things that DTS talked about was the ability for extra streams to be sourced from elsewhere and synchronised with the main video, so I suppose it might be possible to do something like that.
More likely, I suspect, would be just swapping all the dialogue for the dubbed version.
(As far as I remember, the specific example I chatted to them about for this was the Eurovision Song Contest; a system like this could technically allow people to choose the commentary of their home country, no matter where they were watching it.)
Of course, while the technical side of the whole thing is clever - especially the ability to effectively downmix to match whatever your speaker combination is, as well as adjust relative levels etc - how much of it you'll actually get in any given implementation remains to be seen.
In that screenshot, for example, one of the items has a padlock and was locked out; something that DTS talked about was the ability for certain things to be made available depending on what the operator wants. So, for instance, everyone might get the ability to alter the relative levels of some things in a sports game, but only people who've paid a premium get to hear the coach, or the goalkeeper.
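To make the grouping and locking idea a bit more concrete, here's a toy Python sketch of how I picture it. To be clear, this is not DTS's actual format or API; the class, the field names (group, premium) and both functions are made up purely for illustration, and a real system would presumably carry this sort of thing as metadata in the stream.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    # Hypothetical model of one audio object; not a real DTS structure.
    name: str
    group: str              # objects sharing a group are controlled as one
    gain_db: float = 0.0    # listener-adjustable relative level
    premium: bool = False   # operator-locked unless the listener is entitled


def set_group_gain(objects, group, gain_db, entitled=False):
    """Set the level of every object in a group.

    Returns False (and changes nothing) if the group contains a locked
    object and the listener is not entitled to it.
    """
    members = [o for o in objects if o.group == group]
    if any(o.premium and not entitled for o in members):
        return False
    for o in members:
        o.gain_db = gain_db
    return True


def audible(objects, entitled=False):
    """Names of the objects this listener is allowed to hear."""
    return [o.name for o in objects if entitled or not o.premium]


# Made-up sports broadcast: two dialogue objects bundled as one group,
# a crowd bed, and a padlocked coach microphone.
objs = [
    AudioObject("commentary", "dialogue"),
    AudioObject("pitch_side_reporter", "dialogue"),
    AudioObject("crowd", "ambience"),
    AudioObject("coach_mic", "coach", premium=True),
]

set_group_gain(objs, "dialogue", -6.0)  # anyone can turn the dialogue down
set_group_gain(objs, "coach", 3.0)      # refused for a non-paying listener
```

Whether the dialogue is one bundled group or lots of separately controllable objects is then just a question of how the operator fills in that group field, which is really the "how is it marked up" point above.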