Re: "abundant data by itself solves nothing."
I work for a company that deals in data mining of clinical systems. Basically, depending on what data you have (and what data you're willing to export to our system [and drop the $$ for accordingly]), you'll be able to write queries against data that may come from completely independent systems, letting you connect dots that you would never see in each disparate system on its own. In theory it sounds cool - and when it works correctly, it is.
- HOWEVER -
This requires quite a few dots to be connected internally before such queries can be written:
(i) The source system(s) must be capable of providing data to our systems (i.e. Hello, IoT) - Your systems have to be producing data for us to slurp up in the first place, be it the hospital EMR, heart/blood pressure/oxygen/etc monitors, barcode scanners, and so on.
(ii) The source system(s) must be administered by people who are empowered to effect change in the organization - If your hospital's IT department can't get the C-suite to approve a simple maintenance contract for your critical literally-life-or-death systems, good luck getting them to agree to the cost or effort required to set up an effective "big data" measure.
(iii) The source system(s) must have users who can be considered subject-matter experts (SMEs) on how the systems work and what the data contained in them ~means~ - These are the people who are going to be able to validate that the data is moving from the source systems into the data mining solution correctly. This is where experience with the existing systems and user workflows comes into play.
(iv) The source system(s) must be capable of providing the data to be analyzed - If you don't tell me when a patient enters or leaves a hospital, I can't tell you how many people came and went through admissions in a month. Sadly, this may be the most straightforward of the points being made.
(v) The destination system must be capable of handling the data in a meaningful way - Let's face it, most of the data hoovered up by these big data systems is useless noise, loosely structured at best. Those familiar with healthcare systems will have at least heard of the HL7 standard, which suffers from the same level of bloat that all standards tend towards - systems that are built to handle HL7 messages are unwieldy, disgusting monsters by nature due to how loosely the data is structured. Put simply - the destination system has to know how to interpret the garbage being fire-hosed at it in a way that provides some level of value.
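To make point (iv) concrete: if the source system emits admit/discharge events with timestamps, the monthly-admissions question is trivial to answer; without those events, no amount of other data helps. A toy sketch (the record layout and field names here are invented for illustration, not any real EMR feed):

```python
# Hypothetical admit/discharge event records -- field names are made up.
from collections import Counter
from datetime import datetime

events = [
    {"patient_id": "P001", "type": "ADMIT",     "timestamp": "2023-01-05T08:30:00"},
    {"patient_id": "P002", "type": "ADMIT",     "timestamp": "2023-01-12T14:00:00"},
    {"patient_id": "P001", "type": "DISCHARGE", "timestamp": "2023-01-09T11:15:00"},
    {"patient_id": "P003", "type": "ADMIT",     "timestamp": "2023-02-02T09:45:00"},
]

def admissions_per_month(events):
    """Count ADMIT events bucketed by year-month."""
    counts = Counter()
    for e in events:
        if e["type"] == "ADMIT":
            month = datetime.fromisoformat(e["timestamp"]).strftime("%Y-%m")
            counts[month] += 1
    return dict(counts)

print(admissions_per_month(events))  # {'2023-01': 2, '2023-02': 1}
```

The point is that the analysis is the easy part - the hard part is getting the source system to reliably emit those events at all.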
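For those who haven't had the pleasure, here's roughly what an HL7 v2 message looks like and why point (v) is painful: pipe-delimited segments with positional fields, where every site deviates in its own way. This sample ADT message and the parsing are illustrative only - a real interface engine has to tolerate custom Z-segments, encoding characters, repeats, and per-site quirks far beyond this:

```python
# A minimal, made-up HL7 v2 ADT message: each segment is one line,
# fields are pipe-delimited, and components within a field use ^.
raw = "\r".join([
    "MSH|^~\\&|SENDING_APP|HOSPITAL|RECEIVER|HIE|20230105083000||ADT^A01|MSG00001|P|2.5",
    "PID|1||123456^^^HOSPITAL^MR||DOE^JOHN||19700101|M",
    "PV1|1|I|ICU^101^A",
])

def parse_segments(message):
    """Naive split of an HL7 v2 message into {segment_name: [fields]}."""
    segments = {}
    for line in message.split("\r"):
        fields = line.split("|")
        segments[fields[0]] = fields
    return segments

msg = parse_segments(raw)
print(msg["MSH"][8])  # ADT^A01  (message type)
print(msg["PID"][5])  # DOE^JOHN (patient name, itself ^-delimited)
```

Even this toy parser shows the problem: everything is positional, half the fields are optional or empty, and the "meaning" lives in a spec plus whatever the sending site decided to do with it.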
So assuming you have technical requirements (i) - (v) met, you now have more problems:
(a) Unless you know what you're looking for, all that data is useless - If you've ever needed an example of the phrase "the more you know, the less clear things become", then this situation fits perfectly. You need to bring the aforementioned SMEs together at this step to interpret the data being collected, and build a data model that can actually provide some level of value for the people footing the bill for the project. If they can't, shelve the project and re-evaluate the objectives - reaching this point without a clear idea of whether the data collection put in place will do what the big data system was brought in for is a huge red flag.
(b) Likewise, the culture around the system should be collecting ONLY what is useful, and discarding the rest - extraneous data only increases the chances of false positives and degrades the signal/noise ratio in general. Demographics are an example - you may care about whether a subset of the population has a particular affinity for dill vs. sweet pickles, but odds are you won't. Don't collect it if it's meaningless -now-, even if it *might* be useful later.
(c) The people receiving the reports filled with these data points need to care - As was (once again) mentioned before, the people who are getting the output of these projects need to have some level of buy-in into the systems. The results of a big data project need to paint a clear picture of how it will impact the organization, otherwise they'll just shrug their shoulders and all the work will have been for naught.
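Point (b) can be enforced mechanically at ingestion time with a field whitelist: keep only what the agreed data model uses and drop everything else before it ever lands. A trivial sketch (the field names here are invented for illustration):

```python
# Hypothetical whitelist of fields the data model actually uses.
KEEP = {"patient_id", "admit_time", "discharge_time", "unit"}

def scrub(record):
    """Drop any field not explicitly whitelisted before ingestion."""
    return {k: v for k, v in record.items() if k in KEEP}

raw = {
    "patient_id": "P001",
    "admit_time": "2023-01-05",
    "unit": "ICU",
    "pickle_preference": "dill",   # meaningless *now* -- don't store it
    "favorite_color": "blue",
}
print(scrub(raw))  # {'patient_id': 'P001', 'admit_time': '2023-01-05', 'unit': 'ICU'}
```

The discipline matters more than the code: someone has to own the whitelist and say no when people want "just one more field, in case we need it later."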
At the end of the day, these systems need to be connected to actual delivered value (i.e. an increase in the bottom line) in order to be considered anywhere near a success (and thus justify the expense) where it counts. Getting all these ducks in a row is unlikely enough in an ideal scenario; once you roll the theory out into reality, it's easy to see why big data projects aren't really taking off or providing value the way the market promised they would.