Friday, March 31, 2017

Week 8

And we're back. It's week 8; that means only 2 weeks left. As you heard last week, I started a new task this week, and hopefully I can finish it in the remaining time. So without wasting any time, let's talk about the beginning of the new project.

I had a lot to learn about something called unsupervised learning, and I spent the first half of this week on it. To explain it, I'll relate it to my past work. Previously, I was using supervised learning: I labeled the coughs, and the algorithm could then reference those labels. You could say the labels were supervising the program. Now I am trying to write an algorithm called K-means, which uses no labels. It treats each cough as a data point on a graph, then groups the coughs together based on their characteristics. I'll put the equations of the algorithm and a graphical animation of what it does. It's a little confusing and hard to explain, so ask any questions you have.
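For anyone who prefers code to equations, here is a minimal sketch of plain K-means. It is written in Python rather than the MATLAB used in the project, and it is not the project's actual code. The function name `kmeans`, the two made-up features (cough length in seconds and pitch in kHz), and the six sample points are all assumptions for illustration.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Group points into k clusters with plain K-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    # Guess the starting centers by picking k distinct data points at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the group of its closest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of the points in its group.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                # An empty group keeps its previous center.
                new_centers.append(centers[c])
        if new_centers == centers:
            break  # No center moved, so the grouping has settled.
        centers = new_centers
    return centers, labels


# Hypothetical coughs described by (length in seconds, pitch in kHz).
points = [
    (0.20, 0.30), (0.25, 0.35), (0.22, 0.28),
    (0.80, 1.50), (0.85, 1.45), (0.78, 1.55),
]
centers, labels = kmeans(points, k=2)
print(labels)
print(centers)
```

Notice that no labels are passed in: the only inputs are the points and the number of groups. In real data the features would probably need to be scaled to comparable ranges first, since K-means relies on straight-line distance.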




After learning about K-means, I went to get data. Dr. Berisha provided me with a link to a database of cough videos, which I'll give you guys here. I am still working on a way to use all of this data efficiently, so I'll let you guys know when I figure it out. Finally, I wrote a little bit of my own code, with Prad's help. I added on to the code I was using before to get information on the length and frequency of coughs. This is a small step towards the overall program, but it was really exciting (and a little discouraging because I needed to learn more MATLAB syntax).
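To give a rough idea of what "length and frequency" can mean in code, here is a small Python sketch (again, not the MATLAB code from the project). It assumes "frequency" means the strongest pitch in the sound, and every name in it (`fft`, `dominant_frequency`, `duration_seconds`, the 0.1 loudness threshold) is hypothetical. The "cough" is a synthetic tone burst, not real data.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 fast Fourier transform; length must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest non-constant component."""
    spectrum = fft(samples)
    half = len(samples) // 2
    # Skip bin 0 (the constant offset) and search the positive frequencies only.
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


def duration_seconds(samples, sample_rate, threshold=0.1):
    """Time between the first and last sample louder than the threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return 0.0
    return (loud[-1] - loud[0] + 1) / sample_rate


# Synthetic stand-in for a cough: a 64 Hz tone lasting half of a 1-second clip.
sample_rate = 1024
signal = [
    math.sin(2 * math.pi * 64 * i / sample_rate) if 256 <= i < 768 else 0.0
    for i in range(1024)
]
print(duration_seconds(signal, sample_rate))
print(dominant_frequency(signal, sample_rate))
```

On this synthetic burst the sketch reports a length of about 0.5 seconds and a dominant frequency of 64 Hz. A real cough is much noisier, so a simple threshold like this would likely need tuning.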


Overall, this was a good week. I got a pretty solid start on my new project and learned a lot of new things. There are still a few questions that I am not sure how to answer, so I will tell you all about them next week. Until then ... Buh Bye.


21 comments:

  1. Hey Luke! This week definitely sounded like a productive one! What else do you need to learn about MATLAB syntax to help your project?

    1. There are a few very useful MATLAB commands that I had never used before. Commands like FFT (fast Fourier transform) and PCA (principal component analysis) let me analyze the coughs with pretty much just one line of code, but I needed to learn what inputs to give these commands and what they output.

  2. Hello Luke. I'm glad you had a good week! Is unsupervised learning basically guessing values until they somewhat match?

    1. Kind of. This example of unsupervised learning guesses values (called centers) and groups the coughs based on how close they are to those values. Each cough is assigned to its closest center.

  3. Happy week 8, Luke! Almost week 10. It's very interesting how you have finally placed all the coughs into a usable graph. I just don't really understand how you will use it for research purposes. Can you give me a general explanation? Also, thanks for the link to all the coughs you are using! I liked the first one the best :3
    Can't wait to see more!

    1. The graph in this post is actually just an example of K-means, and has no relation to my research (I'll have one for you next week). But the idea of using this type of graph is to see if we can group coughs together based on their characteristics. This means that eventually we will be able to relate a group of coughs to a certain illness (hopefully).

  4. What are the advantages of unsupervised versus supervised learning? As you said, unsupervised removes all the labels and plots each data point as its own among the whole data set, but what advantage does that bring? Is it not more advantageous to have a more accurate data table regarding each individual category? Sorry for the huge influx of questions, but I'm very intrigued by this topic. Can't believe it's week 8 already! I hope your leg feels better, by the way.

    1. We need to eventually move away from supervised learning, because once the device is created, we can't label the coughs for it. This unsupervised learning may not ultimately be used; it is basically a test to see if we can get results without labels. The goal is a pseudo-unsupervised learning, where the algorithm I was working with before accurately makes the labels itself.

  5. Hi Luke. How will this coughing data and program relate to your main topic of possibly detecting diseases?

    1. If I can group the coughs together accurately, we might be able to relate a group of coughs, based on their characteristics, to a type of illness or disease.

  6. Hey Luke! Nice work. What are the pros of utilizing unsupervised learning?

    1. It takes much less manual work than supervised learning. Instead of me labeling every single cough by hand, the program can analyze the coughs without the labels.

  7. Hey Luke! I'm curious. What are the other problems you can't figure out? This looks pretty tough and I can tell you took the time to understand it. Keep up the good work!

    1. The biggest problem is actually using the audio set I showed you guys. Each sample is 10 seconds long, which means it is very possible that there is more than one cough in it. To analyze the samples, I need to know where the coughs are, because the unsupervised learning I have isn't completely unsupervised yet. I also don't know how to download all of the samples, especially since each one is only 10 seconds of a video that could be 20 minutes long.

  8. Hi Luke! It's nice to see that you've achieved so much during your SRP! I hope to see what happens with the K-means algorithm.

    1. I got it to work pretty well (I think). So I'll definitely talk about it this week.

  9. Hey Luke. Could you please elaborate more on K-means? Why is using no labels more useful than using labels? Having no labels seems a little disorganized.

    1. K-means is used to group the coughs together. In theory this can be done with no labels, but I am actually using data that is labeled to make it go smoother. The goal of K-means is to find groups of coughs based on their characteristics. In this sense the data is not labeled, because I don't tell the program which cough has what characteristics or what group it is likely to be in.

  10. Hey Luke! It's great that this week you've switched things up a bit with the unsupervised learning! Also, that database of the cough videos was interesting, I didn't know that such a thing existed. Awesome post!

    1. Thanks Urmi, I'll have some more data and results of the unsupervised learning next week.

  11. Hi Luke! From looking at the picture with the algorithm, I can see how a lot of this material is very difficult to understand. It's great to see that you are able to get results!
