DeepHeart: A Neural Network for Predicting Cardiac Health (github.com/jisaacso)
149 points by jisaacso on July 25, 2016 | 21 comments



Applying convolutional nets to one-dimensional time series data is a fairly common task in medicine, and the techniques here can certainly transfer to other medical domains. Besides the plethora of opportunities in one-dimensional signals (ECG, PCG, PPG, EEG, etc.), many medical problems pose similar challenges to machine learning techniques. For example:

* Massive class imbalance -- we have much more data for healthy patients than for sick patients, especially at the scales required for deep learning (a weighted-loss sketch follows this comment).

* Heavy amounts of noise -- Medical imaging is a difficult feat and noise is a fact of life. Not only that, but data is hard to come by and some noise in the labels is likely.

* Long term outcomes -- trying to predict diseases or other long term results from signals is very difficult, especially when the outcome is not immediately obvious by looking at the input.

TLDR: Imagine that you have to compete in ImageNet, but the images are full of noise, half of the ones labeled "dog" are actually some sort of bird, and instead of guessing what's in the image, you're given an image and then asked to guess if that image, when painted in watercolor, will make a baby smile.
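
A common way to address the class-imbalance point above is to weight the loss inversely to class frequency, so errors on the rare class cost more. A minimal sketch in PyTorch -- not from the linked repo; the two-class setup and label counts are made up for illustration:

    import torch
    import torch.nn as nn

    # Hypothetical label counts: far more "normal" recordings than "abnormal" ones.
    class_counts = torch.tensor([9000.0, 1000.0])  # [normal, abnormal]

    # Weight each class inversely to its frequency.
    class_weights = class_counts.sum() / (len(class_counts) * class_counts)

    criterion = nn.CrossEntropyLoss(weight=class_weights)

    # Dummy batch: 32 two-class predictions and labels, standing in for real data.
    logits = torch.randn(32, 2)
    labels = torch.randint(0, 2, (32,))
    print(criterion(logits, labels).item())

Oversampling the minority class or augmenting its recordings are common alternatives when reweighting alone isn't enough.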


The ML community has techniques to address low signal-to-noise ratios, class imbalance, noisy labels, and small sample sizes. I definitely agree that all of these are issues in medical datasets. Part of the exciting challenge at the intersection of medicine and machine learning is scaling data collection while respecting patient privacy.


Agreed. The dearth of solutions for 1D time series modelling in deep learning is surprising.


It would have been interesting if it were able to make predictions based on a mobile phone sensor (like the heart rate apps do).


I definitely agree! The challenge is collecting a large number of heart recordings from mobile phones, along with professionally diagnosed abnormalities. That's one reason this physionet dataset is so valuable.


ICYMI: Cardiogram is doing this with the Apple/Android watch (possibly other devices). They're not using phone sensor data yet, to my knowledge, though...


Thanks for the shout-out! We posted a few example normal and abnormal heart rhythms from Apple Watch here: https://blog.cardiogr.am/what-do-normal-and-abnormal-heart-r...

Joe: this is really cool work! Happy to chat if you'd like: brandon@cardiogr.am.


I've seen you post before and I think you guys are doing really interesting work.

[edit - noted attribution to wrong author, after a long day!]


I know the feeling. :) Thanks for the comment and the earlier warning.


People may also be interested in the Framingham Heart Study[1], which, among other things, generated the following calculator:

http://cvdrisk.nhlbi.nih.gov/calculator.asp

[1]: http://www.framinghamheartstudy.org/


I know I may be asking to go down a rabbit hole, but:

(1) Does the NN use a deep architecture?

(2) How does the CNN performance compare with other algorithms?

(3) Have you looked at how performance varies with different features, frequency cut-offs, etc.?


Hey thanks for the great questions!

(1) The NN uses two convolutional units and a fully connected softmax layer. Relative to Inception V3 or highway networks, this is _not_ a very deep architecture. I was looking for a balance between accuracy and training time (trained on my MBP). A rough sketch of this kind of setup follows at the end of this comment.

(2) I looked into other neural architectures (LSTM and several fully connected) without much difference in performance. If you take a look at the PhysioNet Google group, there are a number of other methods evaluated (logistic regression, SVMs, etc.).

(3) I did vary hyperparameters and found that a dropout of ~0.45 and a frequency cut-off of 4 Hz performed best for this specific architecture. That said, I imagine the best-performing features would be a concatenation of the outputs from several filters across a range of thresholds. Then the burden of deciding feature importance falls onto the learner.

http://arxiv.org/pdf/1507.06228v2.pdf
http://physionet.org/challenge/2016/#forum
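
For anyone curious, here is roughly what (1) plus a 4 Hz low-pass step could look like. This is a hedged sketch, not the repo's actual implementation: the framework (PyTorch/SciPy), layer sizes, assumed 2 kHz sample rate, and filter order are all illustrative choices of mine.

    import numpy as np
    import torch
    import torch.nn as nn
    from scipy.signal import butter, filtfilt

    SAMPLE_RATE = 2000  # Hz; assumed sample rate for illustration

    def lowpass(signal, cutoff_hz=4.0, fs=SAMPLE_RATE, order=4):
        """Zero-phase Butterworth low-pass filter (illustrative preprocessing)."""
        b, a = butter(order, cutoff_hz / (0.5 * fs), btype="low")
        return filtfilt(b, a, signal)

    class SmallHeartNet(nn.Module):
        """Two conv 'units' (conv + ReLU + pool) feeding a softmax classifier."""
        def __init__(self, n_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
                nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            )
            self.classifier = nn.Linear(32, n_classes)  # softmax applied via the loss

        def forward(self, x):       # x: (batch, 1, samples)
            h = self.features(x)
            h = h.mean(dim=-1)      # global average pool over time
            return self.classifier(h)

    # Example: filter and classify a 10-second dummy recording.
    raw = np.random.randn(10 * SAMPLE_RATE)
    x = torch.tensor(lowpass(raw).copy(), dtype=torch.float32).view(1, 1, -1)
    print(torch.softmax(SmallHeartNet()(x), dim=-1))

Dropout (e.g. the ~0.45 mentioned above) would slot in between the conv units and the classifier.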


Check ST elevation/ST depression in https://en.wikipedia.org/wiki/ST_segment


This is a great idea. Well done!

Can't wait until we figure out a way to collect these kinds of signals in a scalable way.

Technology is great, but data is king.


Thanks! I definitely agree that collecting these signals is difficult to scale.


How did you find this data, or this competition? I wouldn't mind trying my hand at this, either.


All of the data can be found on the physionet site http://physionet.org/physiobank/database/challenge/2016/


This reminds me of http://rindexmedical.com. They have some device like this, or some patent...


I hate to be a jerk, but... what is the point of this entire competition? Physionet audio recordings of heart murmurs? Stethoscopes are on their way out.

Cool model, I guess. Does it get us anywhere new? I don't know.


Take a look at the challenge homepage, http://physionet.org/challenge/2016/

tl;dr: ECG is much more accurate but requires a careful, controlled environment. If it's possible to accurately predict heart abnormalities from sound, cheaper, easier-to-run screening tools can be developed for at-home use.


Why not just use the ECG samples? That would remove most of the sources of environmental noise.



