Most digital CCTV systems nowadays compress the video before storing it (even for realtime viewing, the image is generally compressed, saved and then sent over the network to the operator). This means that the "live" image is of quite poor quality. Couple that with low-quality lenses, a wide field of view (i.e. each camera covering a large area), etc., and you get a difficult area to work in as far as automated tracking goes.
Now, before you all think that I'm talking through my arse, I work in the Surveillance Department of one of the largest casinos in Europe, and have done for many years. I've been jointly responsible for implementing the digital system in our casino, so I do know what a difficult task the blob-tracking programmers have.
In our casino, we have as close to a perfect environment as we can get - lighting controlled, access to premises controlled, almost ideal placement of cameras, etc. However, we still find it incredibly difficult to track a player from one area to another - MPEG artefacts, large crowds of people in the shot, etc. (Imagine, if you will, a field of heads and hairdos, as everybody's too close together for you to identify clothing, and you'll begin to get an idea of how hard it is to track a badly focused blob across a crowded room with artefacting).
At least two things need to be done to make this a 'success': 1 - Higher-resolution digital imaging and recording. 2 - Systems that can re-acquire a 'blob' after losing it, either by checking the cameras adjacent to the one where the 'blob' was last seen, or by predicting likely paths from previous data. Because, trust me on this, they will lose the person being tracked.
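To make point 2 a bit more concrete, here's a rough sketch of how the two re-acquisition ideas could be combined: only look at cameras adjacent to the last sighting, and check them in order of how often people took that route in previous data. Everything in it is made up for illustration - the camera names, the `ADJACENT` layout, the function names and the `is_match` appearance check are all assumptions, not how any real CCTV product does it.

```python
from collections import Counter, defaultdict

# Hypothetical floor layout: which cameras cover neighbouring areas.
ADJACENT = {
    "entrance": ["lobby"],
    "lobby": ["entrance", "slots", "tables"],
    "slots": ["lobby", "tables"],
    "tables": ["lobby", "slots", "cashier"],
    "cashier": ["tables"],
}


def transition_counts(past_paths):
    """Count how often people moved from one camera to the next in old data."""
    counts = defaultdict(Counter)
    for path in past_paths:
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return counts


def search_order(last_camera, counts):
    """Adjacent cameras to check, most frequently taken route first."""
    seen = counts.get(last_camera, Counter())
    # sorted() is stable, so routes never seen before keep their listed order.
    return sorted(ADJACENT[last_camera], key=lambda cam: -seen[cam])


def reacquire(last_camera, counts, detections, is_match):
    """Return (camera, blob) for the first matching blob, or None.

    detections maps camera name -> list of blobs currently in view;
    is_match is an assumed appearance check supplied by the caller.
    """
    for cam in search_order(last_camera, counts):
        for blob in detections.get(cam, []):
            if is_match(blob):
                return cam, blob
    return None


# Example: three previously recorded paths through the floor.
history = transition_counts([
    ["entrance", "lobby", "tables"],
    ["lobby", "tables", "cashier"],
    ["lobby", "slots"],
])
print(search_order("lobby", history))
```

With that history, a blob lost in the lobby would be looked for on the tables camera first, then slots, then the entrance. The hard part in practice is `is_match`: with compression artefacts and a crowd of near-identical heads, deciding that a blob on the next camera is the same person is exactly where it falls over.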
Btw, an easy way to fox this sort of system - electrochromic or thermochromic clothing (or a nice, double-sided jacket, to be swapped in a darkened, camera-free room).