Applying convolutional nets to one-dimensional time-series data is fairly common in medicine, and the techniques here can certainly transfer to other medical domains. Beyond the plethora of opportunities in one-dimensional signals (ECG, PCG, PPG, EEG, etc.), many medical problems pose similar challenges to machine learning techniques. For example:
* Massive class imbalance -- we have much more data for healthy patients than for sick patients, especially at the scales required for deep learning.
* Heavy amounts of noise -- acquiring clean medical signals and images is difficult, and noise is a fact of life. On top of that, data is hard to come by, and some noise in the labels is likely.
* Long-term outcomes -- predicting diseases or other long-term results from signals is very difficult, especially when the outcome is not immediately obvious from looking at the input.
TLDR: Imagine that you have to compete in ImageNet, but the images are full of noise, half of the ones labeled "dog" are actually some sort of bird, and instead of guessing what's in the image, you're given an image and then asked to guess if that image, when painted in watercolor, will make a baby smile.
The ML community has developed techniques to address low signal-to-noise ratios, class imbalance, noisy labels, and small sample sizes. I definitely agree that all of these are issues in medical datasets. Part of the exciting challenge at the intersection of medicine and machine learning is scaling data collection while respecting patient privacy.
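As an aside, one common remedy for the class-imbalance issue mentioned above is to weight each class inversely to its frequency, so the rare (sick) examples count more in the loss. A minimal sketch, where the label counts are made up for illustration:

```python
import numpy as np

# Hypothetical labels: 95% healthy (0), 5% abnormal (1).
labels = np.array([0] * 950 + [1] * 50)

counts = np.bincount(labels)                     # [950, 50]
# "Balanced" weighting heuristic: n_samples / (n_classes * count_per_class)
weights = len(labels) / (len(counts) * counts)   # healthy ~0.53, abnormal 10.0
```

These weights can then be passed to most loss functions (e.g. a weighted cross-entropy) so the learner doesn't simply predict "healthy" everywhere.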
I definitely agree! The challenge is collecting a large number of heart recordings from mobile phones, along with professionally diagnosed abnormalities. That's one reason this PhysioNet dataset is so valuable.
ICYMI: Cardiogram is doing this with the Apple/Android watch (possibly other devices). To my knowledge, though, they're not using phone sensor data yet...
(1) The NN uses two convolutional units and a fully connected softmax layer. Relative to Inception V3 or highway networks, this is _not_ a very deep architecture. I was looking for a balance between accuracy and training time (trained on my MBP).
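To make the shape of that architecture concrete, here is a minimal numpy sketch of a forward pass: two convolutional units (convolution, ReLU, max-pooling) followed by a fully connected softmax layer. Kernel sizes, pooling widths, and the single-channel simplification are all assumptions for illustration, not the original model's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    # Valid 1-D convolution (really cross-correlation, as in most DL libraries).
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])

def relu(x):
    return np.maximum(x, 0)

def pool(x, s):
    # Non-overlapping max-pooling with stride s.
    return x[: len(x) // s * s].reshape(-1, s).max(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Fake 2-second-ish mono recording; weights are random stand-ins.
x = rng.standard_normal(2000)
h = pool(relu(conv1d(x, rng.standard_normal(9))), 4)   # conv unit 1
h = pool(relu(conv1d(h, rng.standard_normal(9))), 4)   # conv unit 2
W = rng.standard_normal((2, len(h)))                   # fully connected layer
probs = softmax(W @ h)                                 # normal vs. abnormal
```

The point is just how shallow this is compared to Inception V3: two learned filter stages and one linear readout, which keeps training tractable on a laptop.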
(2) I looked into other neural architectures (LSTM and several fully connected ones) without much difference in performance. If you take a look at the PhysioNet Google group, a number of other methods have been evaluated (logistic regression, SVMs, etc.).
(3) I did vary hyperparameters and found that a dropout of ~0.45 and a frequency cut-off of 4 Hz performed best for this specific architecture. That said, I imagine the best-performing features would be a concatenation of the outputs from several filters across a range of cut-offs, shifting the burden of deciding feature importance onto the learner.
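A sketch of that multi-cut-off idea, using a standard zero-phase Butterworth low-pass filter from scipy. The sampling rate, filter order, and cut-off values are illustrative assumptions, not the original preprocessing pipeline:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(x, cutoff_hz, fs):
    # 4th-order Butterworth low-pass, applied forward and backward
    # (filtfilt) so there's no phase shift. Order is an assumption.
    b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, x)

fs = 1000                              # assumed sampling rate, Hz
signal = np.random.randn(fs * 5)       # 5 seconds of fake heart audio

# Concatenate the signal filtered at several cut-offs, rather than
# committing to a single 4 Hz threshold; the learner then decides
# which band matters.
features = np.concatenate([lowpass(signal, c, fs) for c in (2, 4, 8, 16)])
```

Because filtfilt preserves length, the feature vector here is just four filtered copies of the recording stacked end to end; in practice one would likely downsample each band as well.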
I hate to be a jerk, but... what is the point of this entire competition? PhysioNet audio recordings of heart murmurs? Stethoscopes are on their way out.
Cool model, I guess. Does it get us anywhere new? I don't know.
tl;dr ECG is much more accurate but requires a careful, controlled environment. If it's possible to accurately predict heart abnormalities from sound, cheaper, easier-to-run screening tools could be developed for at-home use.