Saturday, April 8, 2017

Week 9

Hey everyone, I'm done with the second-to-last week and am just counting down the days until the presentation. As promised, I have some results from my latest undertaking. They actually turned out really well, but I'm not sure to what extent they will be used.

Just to make sure you guys know what's going on, I'll explain it again. This program is an example of unsupervised learning. I took data that was already labeled (so it's supervised in a sense) and grouped the coughs based on their characteristics. What characteristics, you ask? The easiest way to explain it is to just say frequency, but I'll explain a little more. One characteristic that is pretty self-explanatory is the length of the cough. For the rest, I took a Fourier transform of each cough, which gives me the frequencies at multiple points in the cough, 256 of them to be exact.
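
To make the feature step a bit more concrete, here is a rough sketch in plain Python (not the MATLAB code from the project). It assumes each cough is a list of audio samples and builds one row of 257 numbers: the length in seconds plus the magnitudes of the first 256 frequency bins of a discrete Fourier transform. The function names, the sample rate, and the choice of "first 256 bins" are all assumptions for illustration; the post does not say exactly how the 256 values were picked.

```python
import cmath
import math


def dft_magnitudes(signal, n_bins):
    """Magnitudes of the first n_bins discrete Fourier transform coefficients."""
    n = len(signal)
    magnitudes = []
    for k in range(n_bins):
        total = 0j
        for t, x in enumerate(signal):
            # Correlate the signal with a complex sinusoid of frequency bin k.
            total += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(total))
    return magnitudes


def cough_features(signal, sample_rate, n_bins=256):
    """One feature row: duration in seconds, then n_bins frequency magnitudes."""
    duration = len(signal) / sample_rate
    return [duration] + dft_magnitudes(signal, n_bins)


# Toy stand-in for a cough (hypothetical data): a pure 50 Hz tone,
# sampled at 1000 Hz for 0.6 seconds.
sample_rate = 1000
toy_cough = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(600)]
features = cough_features(toy_cough, sample_rate)

# Index of the strongest frequency bin (index 0 is the duration, so skip it).
strongest = max(range(1, len(features)), key=lambda i: features[i])
```

Each bin here is 1000 / 600 Hz wide, so a 50 Hz tone lands in bin 30, which is index 31 of the feature row once the duration is counted. A real cough would spread its energy over many bins instead of one.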

After getting all that information on the 500 coughs of my sample, I had to do something called principal component analysis (PCA), which basically condenses all the information. I ended up with a 257x257 matrix to perform k-means on. MATLAB actually has a built-in function for k-means, so most of the coding I did was to make the graph all pretty for you guys.
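
For anyone wondering what "condensing the information" looks like, below is a small plain-Python sketch of principal component analysis. It is not the project's MATLAB code: it finds the top directions of variation with power iteration, and the function names and the tiny 3-column data set are made up for illustration.

```python
import random


def center(rows):
    """Subtract the column means so every feature has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500, seed=0):
    """Power iteration: dominant eigenvector and eigenvalue of cov.

    Assumes cov is not the all-zero matrix.
    """
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def pca(rows, n_components):
    """Return the top components and each row's coordinates along them."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(n_components):
        v, lam = top_component(cov)
        components.append(v)
        # Deflation: remove the direction just found so the next pass
        # converges to the next-largest direction of variation.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return components, scores


# Toy data (hypothetical): the second column is roughly twice the first,
# and the third column carries no information at all.
rows = [[t, 2 * t + (0.5 if t % 2 else -0.5), 0.0] for t in range(10)]
components, scores = pca(rows, n_components=2)
```

In this toy set almost all of the variation lies along a single direction, so the first column of `scores` carries nearly everything and the second carries only the small wobble.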


This is the graph that looks the best. I took two of the columns from the 257x257 matrix and graphed them against each other. These are just the first two columns, but I also picked columns at random later and got slightly uglier results. Because k-means starts from randomly chosen centers, the groups can come out different every time it runs, but this graph showed the groups pretty clearly.
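
Here is a bare-bones k-means in plain Python showing why the groups depend on where the algorithm starts. This is only a sketch, not MATLAB's built-in function; the function names and the two toy blobs of points are invented for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, init_centers, iters=100):
    """Lloyd's algorithm: assign points to the nearest center, then re-average."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # nothing moved, so the grouping has settled
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a group ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def inertia(points, labels, centers):
    """Total squared distance from each point to its group's center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def random_start(points, k, seed):
    """Pick k data points as starting centers (the usual random start)."""
    return random.Random(seed).sample(points, k)


# Two obvious 3x3 blobs, one near (1, 1) and one near (11, 11).
blob_a = [(x, y) for x in range(3) for y in range(3)]
blob_b = [(x + 10, y + 10) for x, y in blob_a]
points = blob_a + blob_b

# A start with one center in each blob recovers the two blobs.
good_labels, good_centers = kmeans(points, [(0, 0), (12, 12)])

# A start with both centers inside blob_a gets stuck cutting both blobs
# along the diagonal, which is a much worse grouping.
bad_labels, bad_centers = kmeans(points, [(0, 2), (2, 0)])

# In practice the start is random, so several runs are compared
# and the one with the lowest inertia is kept.
runs = [kmeans(points, random_start(points, 2, seed)) for seed in range(5)]
best_labels, best_centers = min(runs, key=lambda r: inertia(points, *r))
```

The two hand-picked starts give different groupings of the same data, which is the effect described above. Comparing inertia across several random starts is a common way to pick the cleanest run.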

A lot of you asked last week what the importance of doing k-means is. I answered briefly, but I thought this would be a good chance to explain better. With the groups I created, we can make a guess as to which coughs are similar to each other. This may not seem too important right now, but when we merge samples from all types of people, possibly with diagnosed illnesses, we can start relating the coughs to an illness.

Well, that's it for this week. I've made some good progress on my final presentation and I still have some more stuff to do in this last week, so look forward to that. I know a lot of you have finals, mocks and APs coming up, so good luck, and ... Buh Bye.

20 comments:

  1. Hey Luke! Sounds like you've been figuring out a lot! What did you mean when you said that the randomization of the graph gave you uglier results, and why did you randomize it?

    1. The clusters/groups weren't as clear as in the graph above. I randomized it because I wanted to see the results from different principal components, just to get full coverage of the data.

  2. Heya Luke. It's amazing to see how far your project has developed and been fine-tuned over the past 9 weeks. In the last week, are you planning on relating the coughs to specific diseases, to be able to determine whether someone has them or not?

    1. I wish I could. That task is pretty difficult to tackle, so I doubt I'll be able to, even if I keep working after the end of the project.

  3. Hey Luke! This is all really cool, and I am wondering if you are going to be following the research well after your own research is done? Keep up the good work.

    1. I'd like to, but I am not sure how involved I will be able to be.

  4. Hi Luke. Nice graphs and thank you for explaining k-means and how identifying the coughs with a certain disease could maybe help diagnosis. Keep working hard for your presentation!

  5. Hi Luke. It's really awesome to see you implementing the things you discussed last week. Did you encounter any big issues? Good luck on your final week and preparation for your presentation!

    1. Not too many issues actually. MATLAB was pretty good to work with on this type of data analysis, because it had all these functions built in.

  6. Hey Luke. Could you elaborate on Fourier transforms? Additionally, by saying that one graph looks the "best" or "uglier," what quality of the graph are you referring to? (r-squared, correlation, etc.) Good luck on your presentation, and I hope I'm able to attend!

    1. The graphs were uglier because there were no clear groups of points like this one.

      As for Fourier transforms, it is a little difficult to elaborate on them (I'm having a little trouble with that for my presentation), but I can try to explain them next time I see you.

  7. Hey Luke! Nice graphs! What types of illnesses are you planning on being able to relate to the coughs? Best of luck with the presentation.

    1. I won't be able to work on that part of the project, but I imagine it will be most effective with many lung diseases.

  8. Hi Luke! It looks like you've got a lot done this past week. I really love your clear graphs above with the clustering (I've been trying to tinker with R a bit to get pretty graphs too).

    1. Thanks Nichole, I've noticed your graphs too, they're looking pretty good.

  9. Hey Luke! It seems like you've been doing a lot of data and analysis in the past week. It's exciting to see your project all come together after reading about it in the beginning. Did anything happen during the past couple months that you didn't expect?

    1. The most unexpected thing was actually starting work on the cough project. I wasn't even aware of this project when I went to the lab for the first time.

  10. Hi Luke! It's wonderful to see you finally get some data to analyze, meaning you're getting results! Do you think that the next week will give you enough time to sift through everything?

    1. It did not, I spent most of my time preparing for my presentation, so I wasn't able to work as much with the data.
