Here's a recap of my week four post, because this week's work builds directly on it. In week four I started contributing to the cough project. I collected data from a video of Hillary Clinton coughing, converted the audio from stereo to mono, and labeled every cough in the sample. That gave me a text file with the start and stop time of every cough, and the wav file (audio) of the sample.
Coming into this week, I wanted to run that data through an algorithm Prad provided to me in MATLAB and get some results. But I ran into some problems. First of all, my version of MATLAB did not have a signal processing add-on, which is kind of important considering the subject I am working on. So my first few attempts failed until I downloaded that add-on.
Next I had to actually apply my data to the algorithm. I decided to test the algorithm on my data, meaning that once it had learned what a cough is from other data, it would listen to the audio of my data and compare its results to my labels. Along the way I ran into more issues with inputting my data. I had to make sure the program could access the data, so I defined the path to find it (basically telling it to go to a certain folder). Another issue was that I hadn't converted my files to the right format. The program read data in csv (comma-separated values) format instead of text format, so I needed to reformat. It also needed the audio files... silly me didn't realize that the program needed the audio, but it obviously does if it is testing on it... duh.
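For anyone curious what that reformatting step looks like, here is a rough sketch in Python (not the MATLAB code the lab uses). It assumes each line of the label text file holds a start time and a stop time in seconds, separated by tabs or spaces, possibly followed by a label name. The function name, the column names, and the file names in the comments are all made up for illustration, and whether the real program wants a header row is also a guess, so the header is optional.

```python
import csv


def labels_to_csv(text_lines, csv_file, header=None):
    """Write cough labels as CSV rows and return how many rows were written.

    text_lines: iterable of strings such as "12.40\t13.05\tcough"
                (assumed layout: start, stop, optional label name).
    csv_file:   any writable text file object.
    header:     optional list of column names; omitted when None.
    """
    writer = csv.writer(csv_file)
    if header is not None:
        writer.writerow(header)

    count = 0
    for line in text_lines:
        fields = line.split()  # splits on tabs or spaces
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        start, stop = float(fields[0]), float(fields[1])
        if stop < start:
            raise ValueError(f"stop time before start time: {line!r}")
        writer.writerow([start, stop])
        count += 1
    return count


# Hypothetical usage with made-up file names:
#
# with open("clinton_labels.txt") as src, \
#         open("clinton_labels.csv", "w", newline="") as dst:
#     labels_to_csv(src, dst, header=["start_s", "stop_s"])
```

The check that each stop time comes after its start time is only a sanity check on the hand-made labels; it is not part of the cough algorithm itself.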
It was a rough week for me, with a lot of fine tuning. I realized just how much work has gone into this project already and how hard it is for me to just jump in and try to help. But on the bright side the week is over and I am contributing somewhat. Oh... and just so you know, I got my test back from before the break and I didn't fail, so I guess I am qualified to explain Fourier to you.
Fascinating post! Currently, are you just looking for videos to help make the algorithm more accurate? Are there times when you hear a cough but someone else in the lab does not? How do you account for these differences, or is that necessary given that you have an algorithm that will eventually be able to recognize the difference?
Yeah, the more videos we use the more accurate the algorithm will be. In most cases coughs are accepted by everyone, but there are times when it is unclear when the cough ends.
One example is when someone clears their throat immediately after the cough, and it is unclear if that should be included or not. In that case it was decided, before I got here, to not include the throat clearing. If something else weird comes up we will meet and discuss what to do with it.
Glad you solved the technical issues in the end! If you are given false negatives, what do you do to fix such problems, such as not detecting coughs like you stated? Also, glad you had a great spring break! :)
There are a couple things that we might do. One is change the algorithm to account for the different type of cough. This is really complicated and I haven't seen an example of it. Another option is to train the algorithm on that different cough. In theory the algorithm should be able to change itself to account for the difference if it is trained with it. This second way is more likely.
Hello Luke. It seems like your work this week has been quite tedious, but I'm glad you got through it. Approximately how much data do you need to improve this, or any other, machine learning algorithm to a sufficient level? (Unfortunately, I do not have or know of any cough recordings.) Also, congratulations on not failing your test!
Unfortunately I am not sure of the exact amount of data needed to make it sufficiently accurate. All I can say is that as long as the algorithm comes up with false negatives or positives, more data is probably needed.
Hey Luke! I'm glad you got through that process, as it seemed a rather meticulous and difficult process. What other algorithms and data are useful to your project? You have no obligation to go into detail, but names and categories would help. Hope you enjoyed your break!
As far as more algorithms that are useful, there shouldn't be any, because the machine learning algorithm hopefully adjusts itself for all types of coughs. For more data it is useful to have all types of data, meaning different types of coughs. This could be coughs linked to different diseases or something similar.
Hey! This is all really cool and I was wondering whether or not you actually will be modifying the algorithm to accommodate for the mishaps such as false positives and such? Keep up the good work!
Hopefully we won't have to; the goal is that we can train the machine learning to accommodate for the mishaps itself. If that doesn't happen then we will have to modify it by hand. As far as me being a part of that process, I may be able to observe or give suggestions, but the code is far too complicated for me (with limited experience) to do it myself, sadly.
Hi Luke! Congratulations on passing your test! So for the cough project, how would the final algorithm aid people who are coughing?
The final algorithm will have one more feature: the ability to classify coughs based on their characteristics. This will help people who are coughing by telling them what kind of illness they have or the severity of their cough.
Hey Luke! Last week sounds a lot more difficult for you than the rest of the weeks did, but I'm glad that you got results in the end! Also, congrats on not failing that college test, I'm looking forward to the next post!
Thanks Urmi. Yeah, this past week was difficult because I ran into a lot of things I didn't know at all, but because I had a rough week last week, week 7 is going pretty smoothly.
Hi Luke! Catching up on a project that's been in progress for so long is extremely hard to do because of all the material there is. Would you say measuring voice patterns is a reliable way of detecting diseases, knowing as much as you do now?
Actually yeah. I have seen some of the lab's other work that is pretty much complete, and the results are pretty cool. One example (that I'm not sure I can talk about) is a study on Muhammad Ali. They looked at his speech early and late in his career and saw a change that was a sign of Parkinson's disease, which he is diagnosed with now.
Hi Luke! I guess no one can get back into things after spring break. What do you intend to do about the false negatives? Were all of the instances false negatives, or were some correct?
Hopefully we can train the algorithm to correct the false negatives. There were also still a good number of coughs that were detected correctly.
Hey Luke! Are there any factors that could affect the results of the algorithm, such as gender? Can't wait for next week's post.
Yeah, there are different factors to consider with the algorithm. The most common one is noise: having a lot of sound in the background will make it more difficult to be accurate. Gender could also be a factor, because the coughs of a man may sound different from the coughs of a woman.