Just to make sure you guys know what's going on, I'll explain it again. This program is an example of unsupervised learning. I took data that was already labeled (so it's supervised in a sense) and grouped the coughs based on their characteristics. What characteristics, you ask? The easiest way to explain it is to just say frequency, but I'll explain a little more. One characteristic that is pretty self-explanatory is the length of the cough. For the rest, I took a Fourier transform of each cough, which gives me the frequencies at multiple points in the cough, 256 of them to be exact.
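For anyone who wants to see the idea in code, here is a minimal sketch in Python (not the MATLAB code from the project). It assumes each cough is a list of audio samples, takes a plain discrete Fourier transform of the first 256 samples, and tacks the cough's length on the end, which gives 257 numbers per cough. The names `cough_features` and `dft_magnitudes`, the 8000 Hz sample rate, and the choice to window the first 256 samples are all assumptions made for illustration.

```python
import cmath
import math

N_BINS = 256  # assumed number of frequency values kept per cough


def dft_magnitudes(samples):
    """Magnitude of every DFT bin for a list of real samples (naive O(n^2) DFT)."""
    n = len(samples)
    mags = []
    for k in range(n):
        total = 0j
        for t, x in enumerate(samples):
            total += x * cmath.exp(-2j * cmath.pi * k * t / n)
        mags.append(abs(total))
    return mags


def cough_features(samples, sample_rate):
    """Return 257 numbers: 256 DFT magnitudes plus the cough length in seconds."""
    # Assumption: use the first 256 samples, zero-padding shorter recordings.
    window = list(samples[:N_BINS]) + [0.0] * max(0, N_BINS - len(samples))
    duration = len(samples) / sample_rate
    return dft_magnitudes(window) + [duration]


# Synthetic stand-in for a cough: a tone with exactly 8 cycles per 256 samples.
sample_rate = 8000  # hypothetical sample rate in Hz
tone = [math.sin(2 * math.pi * 8 * t / N_BINS) for t in range(512)]
features = cough_features(tone, sample_rate)
```

With the synthetic tone, the largest magnitude in the lower half of the spectrum lands in bin 8, which is the sort of "which frequencies are present" information the Fourier transform provides.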
After getting all that information on the 500 coughs in my sample, I had to do something called principal component analysis (PCA), which basically condenses all the information. I ended up with a 257x257 matrix to perform k-means on. MATLAB actually has a built-in function for k-means, so most of the coding I did was to make the graph all pretty for you guys.
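To give a feel for what the k-means step does, here is a small hand-rolled sketch in Python. It is not MATLAB's built-in function and not the project's code; it is just the standard assign-then-average loop (Lloyd's algorithm). The eight 2D points are made-up stand-ins for coughs after the principal component step, and the function name `kmeans` and its parameters are chosen for this illustration.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: returns (centroids, labels) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # nothing changed, so the clusters are stable
            break
        labels = new_labels
    return centroids, labels


# Made-up 2D points standing in for coughs after dimensionality reduction.
group_a = [(0.1, 0.2), (0.0, -0.1), (-0.2, 0.1), (0.3, 0.0)]
group_b = [(10.1, 9.8), (9.9, 10.2), (10.0, 10.0), (10.2, 9.9)]
points = group_a + group_b
centroids, labels = kmeans(points, k=2)
```

The first four points end up sharing one label and the last four share the other. Which group gets label 0 depends on the random starting centroids, so only the grouping matters, not the label numbers.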
A lot of you asked last week what the importance of doing k-means is. I answered briefly, but I thought this would be a good chance to explain better. With the groups I created, we can make a guess as to which coughs are similar to each other. This may not seem too important right now, but when we merge samples from all types of people, possibly with diagnosed illnesses, we can start relating the coughs to an illness.
Well that's it for this week. I've made some good progress making my final presentation and I still have some more stuff to do in this last week, so look forward to that. I know a lot of you have finals, mocks and APs coming up, so good luck, and ... Buh Bye.
Hey Luke! Sounds like you've been figuring out a lot! What did you mean when you said that the randomization of the graph gave you uglier results, and why did you randomize it?
The clusters/groups weren't as clear as in the graph above. I randomized it because I wanted to see the results from different principal components, just to get full coverage of the data.
Heya Luke. It's amazing to see how far your project has developed and been fine-tuned over the past 9 weeks. In the last week, are you planning on relating the coughs to specific diseases, to be able to determine whether someone has them or not?
I wish I could. That task is pretty difficult to tackle, so I doubt I'll be able to, even if I keep working after the end of the project.
Hey Luke! This is all really cool, and I am wondering if you are going to be following the research well after your own research is done? Keep up the good work.
I'd like to, but I am not sure how involved I will be able to be.
Hi Luke. Nice graphs and thank you for explaining k-means and how identifying the coughs with a certain disease could maybe help diagnosis. Keep working hard for your presentation!
Hi Luke. It's really awesome to see you implementing the things you discussed last week. Did you encounter any big issues? Good luck on your final week and preparation for your presentation!
Not too many issues actually. MATLAB was pretty good to work with on this type of data analysis, because it had all these functions built in.
Hey Luke. Could you elaborate on Fourier transforms? Additionally, by saying that one graph looks the "best" or "uglier," what quality of the graph are you referring to? (r-squared, correlation, etc.) Good luck on your presentation, and I hope I'm able to attend!
The graphs were uglier because they had no clear groups of points like this one does.
As for Fourier transforms, it is a little difficult to elaborate on them (I'm having a little trouble with that for my presentation), but I can try to explain them next time I see you.
Hey Luke! Nice graphs! What types of illnesses are you planning on being able to relate to the coughs? Best of luck with the presentation.
I won't be able to work on that part of the project, but I imagine it will be most effective with many lung diseases.
Hi Luke! It looks like you've got a lot done this past week. I really love your clear graphs above with the clustering (I've been trying to tinker with R a bit to get pretty graphs too).
Thanks Nichole, I've noticed your graphs too. They're looking pretty good.
Hey Luke! It seems like you've been doing a lot of data and analysis in the past week. It's exciting to see your project all come together after reading about it in the beginning. Did anything happen during the past couple months that you didn't expect?
The most unexpected thing was actually starting work on the cough project. I wasn't even aware of this project when I went to the lab for the first time.
Hi Luke! It's wonderful to see you finally get some data to analyze, meaning you're getting results! Do you think that the next week will give you enough time to sift through everything?
It did not. I spent most of my time preparing for my presentation, so I wasn't able to work as much with the data.