I had a lot to learn about something called unsupervised learning, and I spent the first half of this week on it. To explain it, I'll relate it to my past work. Previously, I was using supervised learning: I labeled the coughs, and the algorithm could then reference those labels. You could say the labels were supervising the program. Now I am trying to write an algorithm called K-means, which uses no labels. It treats each cough as a data point on a graph, then groups the coughs together based on their characteristics. I'll put up the equations of the algorithm and a graphical animation of what it does. It's a little confusing and hard to explain, so ask any questions you have.
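To make the grouping idea more concrete, here is a minimal sketch of K-means in plain Python. It is not the MATLAB code from this project. The function names and the six (duration, frequency) pairs are made up for illustration, and a real run would probably need the features scaled so that one of them does not dominate the distance.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_nearest(points, centers):
    """For each point, return the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def update_centers(points, labels, centers):
    """Move each center to the mean of the points currently assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # An empty group keeps its previous center.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centers


def k_means(points, k, max_iters=100, seed=0):
    """Alternate between assigning points and moving centers until nothing changes."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random (the initial "guesses").
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = assign_to_nearest(points, centers)
    for _ in range(max_iters):
        new_centers = update_centers(points, labels, centers)
        new_labels = assign_to_nearest(points, new_centers)
        converged = new_labels == labels and new_centers == centers
        centers, labels = new_centers, new_labels
        if converged:
            break
    return centers, labels


# Made-up (duration in seconds, dominant frequency in Hz) pairs, not real cough data.
coughs = [
    (0.30, 400.0), (0.32, 420.0), (0.28, 390.0),
    (0.60, 900.0), (0.62, 880.0), (0.58, 910.0),
]
centers, labels = k_means(coughs, k=2)
print(labels)
```

No labels go in: the program is only told how many groups to look for (k), and the group numbers it returns have no meaning beyond "these points ended up together".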
After learning about K-means I went to get data. Dr. Berisha provided me with a link to a database of cough videos, which I'll give you guys here. I am still working on a way to use all of this data efficiently, so I'll let you guys know when I figure it out. Finally, I wrote a little bit of my own code, with Prad's help. I added on to the code I was using before, to get information on the length and frequency of coughs. This is a small step towards the overall program, but it was really exciting (and a little discouraging because I needed to learn more MATLAB syntax).
Overall this was a good week. I got a pretty solid start to my new project, and learned a lot of new things. There are still a few questions that I am not sure how to solve, so I will tell you all about them next week. Until then ... Buh Bye.
Hey Luke! This week definitely sounded like a productive one! What else do you need to learn about MATLAB syntax to help your project?
There are a few very useful MATLAB commands that I had never used before. Commands like FFT (fast Fourier transform) and PCA let me analyze the coughs with pretty much just one line of code, but I needed to learn what inputs to give these commands and what they output.
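As a rough picture of the kind of input and output involved, here is a small Python sketch that finds the strongest frequency in a signal using a plain (slow) discrete Fourier transform. It is only an illustration: a built-in FFT computes the same transform much faster, the function name `dominant_frequency` is hypothetical, and the test signal is a synthetic sine wave, not a cough recording.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC component below Nyquist."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):
        # k-th DFT coefficient: how strongly the signal matches a wave at bin k.
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    # Convert the bin index to a frequency in Hz.
    return best_bin * sample_rate / n


# Synthetic 100 Hz sine wave sampled at 800 Hz (an assumption, not real audio).
sample_rate = 800
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(200)]
peak_hz = dominant_frequency(tone, sample_rate)
print(peak_hz)
```

The input is a list of audio samples plus the sampling rate, and the output is a single number that could serve as one characteristic of a cough.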
Hello Luke. I'm glad you had a good week! Is unsupervised learning basically guessing values until they somewhat match?
Kind of. This example of unsupervised learning guesses values and groups the coughs based on how close they are to those values. Each cough is assigned to the closest value (center).
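Written out, the two steps that standard K-means repeats look roughly like this. The symbols are assumptions chosen for illustration: x_i is the i-th cough as a point, mu_j is the j-th center, c_i is the group assigned to cough i, and S_j is the set of coughs currently in group j.

```latex
% Assignment step: each point joins its closest center
c_i = \arg\min_{j \in \{1, \dots, K\}} \lVert x_i - \mu_j \rVert^2

% Update step: each center moves to the mean of its group
\mu_j = \frac{1}{\lvert S_j \rvert} \sum_{i \in S_j} x_i
```

The "guessing" is only the starting position of the centers; after that, these two steps repeat until the groups stop changing.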
Happy week 8 Luke! Almost week 10. It's very interesting how you have finally placed all the coughs into a usable graph. I just don't really seem to understand how you will use it for research purposes. Can you give me a general explanation? Also, thanks for the link to all the coughs you are using! I liked the first one the best :3
Can't wait to see more!
The graph in this post is actually just an example of K-means, and has no relation to my research (I'll have one for you next week). But the idea of using this type of graph is to see if we can group coughs together based on their characteristics. This means that eventually we will be able to relate a group of coughs to a certain illness (hopefully).
What are the advantages of unsupervised versus supervised learning? As you said, unsupervised removes all the labels and plots each data point as its own among the whole data set, but what advantage does that bring? Is it not more advantageous to have a more accurate data table regarding each individual category? Sorry for the huge influx of questions, but I'm very intrigued by this topic. Can't believe it's week 8 already! I hope your leg feels better, by the way.
We need to eventually move away from supervised learning, because once the device is created, we can't label the coughs for it. This unsupervised learning may not ultimately be used; it is basically a test to see if we can get results without labels. The goal is to have a pseudo-unsupervised learning, where the algorithm I was working with before accurately makes the labels itself.
Hi Luke. How will this coughing data and program relate to your main topic of possibly detecting diseases?
If I can group the coughs together accurately, we might be able to relate a group of coughs, based on their characteristics, to a type of illness or disease.
Hey Luke! Nice work. What are the pros of utilizing unsupervised learning?
It is much more efficient than supervised learning. Instead of me labeling every single cough by hand, the program can analyze the coughs without the labels.
Hey Luke! I'm curious. What are the other problems you can't figure out? This looks pretty tough and I can tell you took the time to understand it. Keep up the good work!
The biggest problem is actually using the audio set I showed you guys. Each sample is 10 seconds long, which means it is very possible that it contains more than one cough. If I am to analyze the samples, I need to know where the coughs are, because the unsupervised learning I have isn't completely unsupervised yet. I also don't know how to download all of the samples, especially since each one is only 10 seconds of a video that could be 20 minutes long.
Hi Luke! It's nice to see that you've achieved so much during your srp! I hope to see what happens with the K-Means algorithm.
I got it to work pretty well (I think). So I'll definitely talk about it this week.
Hey Luke. Could you please elaborate more on K-means? Why is using no labels more useful than using labels? Having no labels seems a little disorganized.
K-means groups the coughs together. In theory this can be done with no labels, but I am actually using data that is labeled to make it go smoother. The goal of K-means is to find groups of coughs based on their characteristics. In this sense the data is not labeled, because I don't tell the program which cough has what characteristics or what group it is likely to be in.
Hey Luke! It's great that this week you've switched things up a bit with the unsupervised learning! Also, that database of the cough videos was interesting, I didn't know that such a thing existed. Awesome post!
Thanks Urmi, I'll have some more data and results of the unsupervised learning next week.
Hi Luke! From looking at the picture with the algorithm, I can understand how a lot of this material is very difficult to understand. It's great to see that you are able to get results!