
Hmmm
A few of the potential problems with this sort of network traffic analysis:
1) Where do you capture data in a resilient multi-path enterprise network?
2) How do you ensure timestamps are precise enough to correlate a connection's packets on different in/out paths?
3) How do you correlate multiple parallel TCP connections belonging to the same client application/browser transaction?
4) How do you ensure correct interpretation of retries or load-balanced out-of-order packets?
5) How do you distinguish HTTP requests that worked from those that were abandoned by the browser?
6) How do you "render" client displays composed of very dynamic content?
7) How do you recognise the application on browser HTTPS connections?
It would be interesting to know what confidence factor this achieves in real-world enterprise use.
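
To make the retries/out-of-order question a bit more concrete, here is a rough Python sketch of what an analyser has to do for a single TCP connection: order segments by sequence number rather than by capture time, drop bytes that were already seen (retransmissions), and note any gaps it never captured. The `Segment` type and `reassemble` function are hypothetical names made up for illustration, not taken from any product, and the sketch assumes one direction of one connection with no sequence-number wraparound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes  # captured payload bytes
    ts: float       # capture timestamp in seconds (may come from a different tap)


def reassemble(segments, initial_seq):
    """Rebuild a byte stream from segments captured in arbitrary order.

    Returns (stream, retransmits, gaps) where gaps is a list of
    (start_seq, end_seq) ranges that were never captured.
    Assumption: sequence numbers do not wrap within the capture.
    """
    stream = bytearray()
    next_seq = initial_seq
    retransmits = 0
    gaps = []
    # Sort by sequence number; capture time only breaks ties, because
    # timestamps from different paths cannot be trusted for ordering.
    for seg in sorted(segments, key=lambda s: (s.seq, s.ts)):
        end = seg.seq + len(seg.payload)
        if end <= next_seq:
            # Every byte of this segment was already seen: a retransmission.
            retransmits += 1
            continue
        if seg.seq > next_seq:
            # Bytes between next_seq and seg.seq were never captured.
            gaps.append((next_seq, seg.seq))
            next_seq = seg.seq
        # Append only the bytes not already seen (handles partial overlap).
        stream += seg.payload[next_seq - seg.seq:]
        next_seq = end
    return bytes(stream), retransmits, gaps
```

Even this toy version has to decide what to do about gaps and overlapping retransmissions, and that is before combining captures from several paths or several parallel connections.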