Lies and statistics
Oh, I am sure they will get better, like they have at detecting some cancers, but it's a bit like those self-driving Teslas: you just never know when they might decide to kill you.
Neural networks that analyse electrocardiograms can be easily fooled, mistaking your normal heartbeat reading as irregular or vice versa, researchers warn in a paper published in Nature Medicine. ECG sensors are becoming more widespread, embedded in wearable devices like smartwatches, while machine learning software is being …
But how much of the cancer-detection improvement is down to better samples curated by experts, and how much is cleverly chosen results used to market it? How much genuinely independent validation has there been?
Also, how are future experts to be trained if we hand such things over entirely to big corporations touting their software?
I had a CT scan last year. I was told that, due to a shortage of people qualified to interpret the images, I would not get the result for three months. ISTM that a computer analysis done within minutes of the scan might well cause less harm through false negatives than waiting three months for a more accurate diagnosis does.
So, they trained a CNN that is easy to fool, then used that to say that other algorithms might be susceptible, without testing those other algorithms?
Don't get me wrong, the CNN they tested was a good one, but it's potentially not what is in products. And given that a lot of additional signal processing is often required when picking up an ECG from somewhere like the wrist rather than across the chest, it's dangerous to extrapolate the results to current products.
(AC, as the day job is for a company that makes such a device...)
I wouldn't rely on any piece of kit doing a clinical job that is produced as a consumer item.
There is a conflict between producing a marketable item that makes a profit and producing something that delivers highly accurate results with guarantees on performance.
Are there any clinical standards written into law for smart wearables that can supposedly monitor your health?
Because if there aren't, there should be.
Eventually, production of such things is going to be so routine that something of clinical quality will become consumer kit. It happened with thermometers; I don't see why it shouldn't happen with other monitoring devices (although I guess having fewer electrodes, placed for convenience rather than effectiveness, will limit wrist ECGs; but the problem won't be production quality or price).
(Likewise, in a non-clinical area, GPS has gone from being specialist equipment to everyday consumer kit.)
"I wouldn't rely on any piece of kit doing a clinical job that is produced as a consumer item."
I understand the thought. But it's not at all clear to me that consumer-grade thermometers, sphygmomanometers (blood pressure measuring devices), blood glucose meters, blood urate meters, etc. are much, if any, less reliable overall than their clinical equivalents. It's true that the consumer products may have some corners cut in design. On the other hand, most of the things being measured depend on time of day, health, when and what the last meal was, and so on. A home user is typically much better equipped to provide a consistent test environment than a physician's office, where readings are likely to depend on when the appointment is, how stressful the trip to the physician's office is, who makes the measurement, and in some cases how they make it.
In any case consumer grade measurements are orders of magnitude cheaper than lab measurements and are likely to reveal patterns of physiological behavior that won't be available to physicians unless they slap you in a well staffed, well equipped Intensive Care Unit for a few days. They're likely to be the first health screen that most folks get -- especially in developing countries and developed countries like the US with dysfunctional healthcare systems. So I suspect that consumer grade devices are always going to be an important element in healthcare. It's important that they work reasonably reliably and that their limitations are well understood.
Clinical standards exist *if* the manufacturer wants to make medical claims. In most cases, like with Apple, they don't want to go through the hassle of clinical approval, because it takes years, so they just say it is non-clinical.
The big issue with getting regulatory approval to make medical claims is that every time a software, firmware, or hardware change is made, it has to go back through testing and certification, which can add months to your release cycle.
These are mostly good points, but a few things need to be taken into account when considering how this study applies to technology being used in real life:
1. They managed to trick their own model. They don't know how to trick the models actually used in these devices, which were probably trained on more samples. Given how neural networks work, it's probably not difficult to trick those models too, but it remains undemonstrated.
2. Even though they were able to trick the model, they did so by passing data directly to it. How would a real attacker feed misleading readings to a monitor that is physically on your body without you noticing?
3. What are the risks of devices using neural networks and being fed improper information? For consumer devices (mostly watches), the risk is that they call emergency services. I believe they do alert the user before doing so as well, so the user could cancel that.
4. What motivation is there for a malicious party to fake an ECG reading? It might be an interesting way to make a murder look like death from natural causes, but I doubt it's easy to kill someone and have it look like a heart attack, short of certain poisons that effectively cause a real heart attack, in which case you wouldn't need to fake the device.
So while the tech could be fooled, it is probably neither as easy nor as dangerous as it sounds. The real issue is how likely these devices are to produce a false result without anyone malicious fiddling with the data: flagging a heart attack that isn't happening, or, more likely, missing one that is. I don't know how likely that is--I don't have such a device--but that's the metric that will tell us how dangerous or unreliable these devices really are.
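For what it's worth, the "trick" in point 1 can be illustrated with a toy example. This is not the paper's method or any real product's model; it's a made-up linear classifier in Python showing why a small, weight-aligned nudge to the samples can flip a decision while barely changing the trace.

```python
# Toy sketch only: a hypothetical linear "irregular vs normal" scorer.
# The weights and trace values are invented for illustration.

def classify(trace, weights):
    """Linear score: positive means 'irregular', negative means 'normal'."""
    return sum(w * x for w, x in zip(weights, trace))

weights = [0.5, -1.2, 0.8, -0.3]   # stand-in for learned model weights
trace = [0.9, 1.0, 0.2, 0.4]       # stand-in for a short ECG window

clean_score = classify(trace, weights)        # negative: reads 'normal'

# FGSM-style step: nudge every sample by epsilon in the direction of
# its weight's sign, pushing the score toward 'irregular'.
epsilon = 0.3
adversarial = [x + epsilon * (1 if w > 0 else -1)
               for x, w in zip(trace, weights)]

adv_score = classify(adversarial, weights)    # positive: reads 'irregular'
```

The point of the sketch is that the attacker needs access to the raw samples and to the model's internals (here, the weights); point 2 above is exactly the question of whether anyone could get that access in practice.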
I don't know about the UK, but in the US the Food and Drug Administration certifies "medical devices" that are marketed for home use. As a result, the Apple Watch is the only certified ECG, although others like Samsung are going through the process. No other real or rumoured watch functions have achieved certification that I know of.
Many diabetes sufferers rely on certified home blood sugar monitors which, for some, are life and death decisions.
While this particular instance seems a bit over hyped (like many), I would not be too quick to discount AI in medical devices.
As someone with an anomalous heart condition (Brugada Syndrome, or BrS, also known as sudden unexplained nocturnal death syndrome - scary, I know), I can say that I would never just rely on a cheap device sensing in an unreliable position. When I go for checkups, the monitors they use are located on my chest and are far more reliable. I will keep an eye out, though; if the monitors do become more reliable I may go for one as an added check. Perhaps, and it's a big perhaps, one could have saved my uncle, who died as a result of BrS.
It could only be useful if you could get the data out of the unit in a usable form, and if that form was reasonably close to a standard EKG output. It might be a nice additional piece of telemetry when used along with a Holter monitor or the like, but not as a primary data source.
Though, heart attack as listed in the title and a-fib? Kinda different things, though there's some crossover with thrombus formation tying them together.
"It could only be useful if you could get the data out of the unit in a usable form, and if that form was reasonably close to a standard EKG output."
Totally agree - that's why I'll just keep an eye out and still never depend on it. The idea of an over-cautious alarm that goes off if my heart stops in the middle of the night does kind of appeal, though. Read up on BrS - it's a bit of a worry. There have been villages in SE Asia where multiple people seemingly died overnight for no reason. It's these stories that inspired Wes Craven to write A Nightmare on Elm St.
The idea of an over-cautious alarm that goes off if my heart stops in the middle of the night for no reason does kind of appeal.
Although surely only of any use if there is someone nearby who can do CPR as soon as the alarm goes off? Otherwise its only benefit would seem to be to allow the coroner to put a more accurate time on your death certificate ...
In this case, is there any point? I can confidently, instantly, and correctly identify the examples given from across the room. The ECG is a millivolt signal which in the real world always has some perturbations - we are used to this. This particular example is a disastrous performance for the AI.
I have to take an ECG every other year as part of my job, and after about 10 seconds the machine spits out a trace along with an automated "diagnosis".
The effect that relatively minor things can have on this wretched device is remarkable. So far the list of stuff that screws with the measurement includes my phone; a Garmin Fenix watch; fluorescent lights; LED lights; trains going past; cars starting their engines... in the process the machine spits out guesses which include long QT syndrome, atrial enlargement, ventricle enlargement, various blocks, ectopic beats - you name it, this stupid thing thinks I've got it.
Fortunately the doctor administering the tests is a retired cardiac expert and he just keeps hitting "Retry" until he gets one that he's happy with. Superficially though they all look identical to me, but it doesn't help relax you when you see these things popping up on the screen.
"Is this really a problem, or a proof of concept with very limited applications and concerns?"
Somewhere in between, perhaps. Some physiological measurements are simple and straightforward: temperature, for example, or pulse rate. Others are judgment calls. Systolic blood pressure (the first number in BP measurements), for example. I'm told that some folks don't genuinely have any well-defined systolic cut-off. Electrocardiograms definitely seem to fall into the judgment-call category. Providing me with a chart of my ECG wouldn't do much good; I don't have the slightest idea how to interpret it, and neither do most other people, I'm pretty sure. I actually read up on that once, and concluded that it'd take a lot more training than I have any intention of getting for me to make sense of ECGs. So having a cheap device that can check it might be useful. But the device has to work properly for most people most of the time, and it'd help if it knew when its readings are unreliable.
Peak fitting is a great example of something that humans can do incredibly well, looks like it should be simple to program, and yet somehow is extremely difficult to actually get a computer to do. We deal with a similar problem in my work, with particle physics rather than biology. If you have a good idea of where a single peak will be it's not too hard to find it, but given a trace with an unknown number of peaks in unknown locations, it can be almost impossible to find them reliably. It's especially annoying when you have a bunch of traces that look virtually identical, and the computer can fit a couple just fine while failing horribly at the rest.
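The failure mode described above is easy to reproduce with a toy example. This is illustrative Python of my own, not the commenter's tools: a naive local-maximum counter that works fine on a clean trace and falls apart when a small ripple, much smaller than the peak itself, is added.

```python
def count_peaks(trace, threshold):
    """Count samples that are local maxima above a threshold."""
    return sum(1 for i in range(1, len(trace) - 1)
               if trace[i] > threshold
               and trace[i] > trace[i - 1]
               and trace[i] >= trace[i + 1])

# One broad, clean peak...
clean = [0, 0.2, 0.4, 0.6, 0.8, 1.0, 0.8, 0.6, 0.4, 0.2, 0]
# ...plus a small alternating ripple on top of it.
ripple = [0, 0.3, 0, 0.3, 0, 0.3, 0, 0.3, 0, 0.3, 0]
noisy = [c + r for c, r in zip(clean, ripple)]

count_peaks(clean, threshold=0.4)   # 1: finds the single real peak
count_peaks(noisy, threshold=0.4)   # 5: the ripple creates four phantom peaks
```

Real fitting methods do far better than this strawman, of course, but the underlying problem is the same: every fix (smoothing, prominence thresholds, minimum spacing) trades away sensitivity to real features somewhere else.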
I don't know that I'd expect a neural network to do a better job than traditional fitting methods, but the fact it can be fooled in this manner doesn't say it's any worse either. The attack they use here is exactly the sort of thing that could cause our own tools to fall over as well. The main concern I'd have is the usual one that comes up with machine learning - with normal fitting methods, you can figure out why the failure happens and fix it. Not easily, but it can be done at least in theory, and will usually give insight into a variety of other issues that could crop up. With machine learning you never know exactly why it fails, and even if you fix it by adding more items to the training set, you don't know how that actually helped or if it would be any use against other problems.
As a programmer and engineer, I really do wonder why this is difficult to code - and why anyone would try to use a neural net anyway. Not being a doctor, I'm not sure of the exact significance in the graphs, but I do see that the AF plot shows a slight drop below average immediately after each peak, while the normal plot has a longer, positive hump a little while after each peak. Are these the criteria that a doctor is trained to look for, and if so, why not hardcode it specifically for that pattern rather than using a neural net?
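As a sketch of what such a hardcoded rule might look like: atrial fibrillation is classically associated with irregular R-R intervals (the gaps between successive beats), so one crude non-neural check is the spread of those intervals. The threshold and beat times below are invented for illustration; a real device would need far more than this.

```python
# Hypothetical rule-based check, invented for illustration only.

def rr_irregularity(r_peak_times_s):
    """Coefficient of variation of the R-R intervals (beat-to-beat gaps)."""
    rr = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    mean = sum(rr) / len(rr)
    var = sum((x - mean) ** 2 for x in rr) / len(rr)
    return (var ** 0.5) / mean

def looks_like_afib(r_peak_times_s, threshold=0.10):
    """Flag the recording if the beats are too irregularly spaced."""
    return rr_irregularity(r_peak_times_s) > threshold

steady    = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]   # evenly spaced beats
irregular = [0.0, 0.6, 1.7, 2.1, 3.4, 3.9]   # erratic gaps
```

The catch is everything upstream of this function: picking the threshold, and detecting the R peaks reliably in the first place across millions of noisy real-world traces, which is where the hardcoded approach tends to break down.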
If software and computer companies, the people we expect to know the most about computers, disclaim any warranty that their software performs correctly and limit any liability for damages, why should we trust our lives to computers?
Word does not guarantee that it works correctly and as intended in its role as word-processing software. But I should trust some gizmo to reliably detect a medical condition? I am not convinced.
Except at no point did the paper actually test any of the consumer products on the market. They took a three-year-old CNN from a conference challenge, fed it noisy data it had never been trained on, and presented that as proof that consumer products probably can't handle noisy data either...
The ECG and heart monitoring functionality on the Apple watch is overloaded (to the point of annoyance) with disclaimers saying “This product does not detect heart attacks.” Of course, actual facts like that never protect against cheap shots from El Reg, which is incapable of detecting facts when it comes to Apple.
It is easy to skew blood pressure results for many people: put them in a medical environment such as a hospital to do the test and their blood pressure will increase. Having such measurements taken outside of that environment is really useful, provided that the instructions for operating the measuring devices are strictly adhered to.
When I look at the two traces in the figure illustrating the adversarial attack, I immediately see the irregularity of the first trace and the structure in the second. I am not a medical professional, but it makes perfect sense to me that the first looks problematic and the second seems healthy.
I am not a professional programmer either, but I could pull together a function to identify the deviation of that "healthy" pattern from a genuinely healthy one in a few hours. Why would anybody use AI to address this issue?
Looks like a case of: "if all you have is a hammer, everything looks like a nail".
That you can "immediately see" the difference is evidence that a neural network is a good fit for this sort of pattern detection. You might struggle to quantify it, but you know it when you see it.
And no, you couldn't pull a function together in a few hours. You could maybe (maybe) do so for that one particular trace, but it wouldn't work on 99% of the other traces. Whereas your eyes would.
They simply aren't as accurate as what you get in a doctor's office or a hospital because those measure at many more points. All they can do is tell you "this looks odd, you may want to get it checked by a professional". The fact that a neural network is interpreting the results instead of a doctor is not the reason why you shouldn't rely on its diagnosis.
Before long I'm sure some of the full on 12 lead hospital ECGs will have some neural network as well - to filter out all the false alarms those generate. The machines support filtering out certain types of alarms but nurses don't do it, because they're afraid of getting blamed if something real is missed. But if the software is doing it, no "person" is to blame. Sure, a few people might die as a result of something being missed, but overworked nurses constantly responding to false alarms costs lives as well, so you gotta pick your poison.
Biting the hand that feeds IT © 1998–2020