Do You Hear That?

Wednesday, August 30, 2017

Luke Forsman

That's right, you did hear that. Welcome to my Senior Research Project blog.

My name is Luke Forsman; I am a senior at BASIS Scottsdale. My project will be starting in about a week and I am very excited to share my research and interests with you.

Outside of school, I am involved in both the arts and sports. I am on the track team, running the 800m and 4x800 and doing long and triple jump. I also played soccer on the school team. My interests that relate to this project, however, stem from the arts. I love both looking at and creating the fine arts, especially sculpture. Branching off that is my interest in music, particularly alternative, indie, and electronic music.

In school, my interests lie in physics and mathematics. Along with those, I have been growing my knowledge of computer science and electrical engineering. For college I will be attending Cornell University, majoring in physics, and I hope to be involved in my other interests as well. But I guess I'll have to get used to the snow first.

My research project will be at ASU's Department of Speech and Hearing, working with Dr. Visar Berisha in his lab, where he is researching a link between speech patterns and diseases. My role in the lab is still undecided, but I am eager to be a part of this research. Outside of the lab I will be auditing Dr. Berisha’s class on signal processing to further understand his research. This research excites me because of how interdisciplinary it is. Using sound patterns and frequencies to detect diseases plays into my interests in physics, music, and computer science, while also broadening my views to a medical field.

Thank you for tuning in to this first blog post and hopefully you will continue to follow my research with this blog. On the right you will find links to my official proposal, the Department of Speech and Hearing at ASU, and more information on the senior projects. You will also find other great blogs in my blog squad.

I'll keep you guys posted as I begin research and discover what I'm going to be doing in the lab. Until then,

Friday, April 14, 2017

Week 10

Hey everyone, it's the last week. This past week has been a lot of preparing for my presentation on May 6th, so I don't have much to show you. So I'll take this time to say: come to my presentation at 3:05-3:20.

So what have I been doing for my presentation? For the most part I have been trying to find ways to explain my research in a straightforward, interesting way. That means finding a lot of pictures (in some cases making my own) and planning exactly what I am going to say. This doesn't lend itself to a very interesting blog post, but I do have one interesting thing that I have to do for the presentation.

I am trying to make the coughs/errors found with the earlier algorithm playable. That means going through all of the code and adding a few lines to take numbers from the error matrix and apply them to the sound sample. So far this has proved pretty difficult, and I haven't actually finished it yet, but I thought it would be good to tell you about my coding.

Well that's all for this week, hopefully I'll see you at the presentation, so until then...Buh Bye.

Saturday, April 8, 2017

Week 9

Hey everyone, done with the second to last week. Just counting down days until the presentation. As promised I have some results for my latest undertaking. The results actually turned out really well, but I am not sure to what extent they will be used.

Just to make sure you guys know what's going on, I'll explain it again. This program is an example of unsupervised learning. I took data that was already labeled (so it's supervised in a sense) and grouped the coughs based on their characteristics. What characteristics, you ask? The easiest way to explain it is by just saying frequency, but I'll explain a little more. One thing that is pretty self-explanatory is the length of the cough, but otherwise I took a Fourier transform of each cough, which gives me the frequencies at multiple points in the cough, 256 of them to be exact.
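To make the "length plus 256 frequency values" idea a little more concrete, here is a rough sketch in Python. It is only one possible reading of the description above, not the lab's MATLAB code: the function name `cough_features`, the 8000 Hz sample rate, and the made-up test tone are all assumptions for illustration, and a real program would use a fast Fourier transform instead of this slow, spelled-out version.

```python
import cmath
import math

def cough_features(samples, sample_rate, n_bins=256):
    """Return [length in seconds] followed by n_bins Fourier magnitudes."""
    n = len(samples)
    length_seconds = n / sample_rate
    magnitudes = []
    for k in range(n_bins):
        # Naive discrete Fourier transform for frequency bin k.
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(total))
    return [length_seconds] + magnitudes

# Made-up "cough": a pure tone that completes 10 cycles over 512 samples.
n = 512
signal = [math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
features = cough_features(signal, sample_rate=8000)
print(len(features))  # 257 numbers: 1 length + 256 magnitudes
```

Each cough then becomes one row of 257 numbers, which is the kind of input a grouping algorithm can work with.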

After getting all that information on the 500 coughs of my sample, I had to do something called principal component analysis, which basically condenses all the information. I ended up with a 257x257 matrix to perform k-means on. In MATLAB there is actually a built-in function for k-means, so most of the coding I did was to make the graph all pretty for you guys.

This is the graph that looks the best. I took two of the columns from the 257x257 matrix and graphed them. This is just the first two columns, but I also randomized the columns later, and got slightly uglier results. Because of the way K-means works, the groups can come out different on every run, but this graph showed the groups pretty clearly.

A lot of you asked last week what the importance of doing k-means is, and I answered you briefly, but I thought this would be a good chance to explain better. With the groups I created we can make a guess as to which coughs are similar to each other. This may not seem too important right now, but when we merge samples from all types of people, possibly with diagnosed illnesses, we can start relating the coughs to an illness.

Well that's it for this week. I've made some good progress making my final presentation and I still have some more stuff to do in this last week, so look forward to that. I know a lot of you have finals, mocks and APs coming up, so good luck, and ... Buh Bye.

Friday, March 31, 2017

Week 8

And we're back. It's week 8; that means only 2 weeks left. As you heard last week, I started a new task this week, and hopefully I can finish it in the remaining time. So without wasting any time, let's talk about the beginning of the new project.

I had a lot of learning to do on something called unsupervised learning. I spent the first half of this week on that. To explain it I'll relate it to my past work. Previously, I was using supervised learning, because I was labeling the coughs and the algorithm could then reference those labels. You could say the labels were supervising the program. Now I am trying to write an algorithm called K-means where there are no labels. What this will do is take each cough as a data point on a graph then group the coughs together based on their characteristics. I'll put the equations of the algorithm and a graphical animation of what it does. It's a little confusing and hard to explain, so ask any questions you have.
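Here is a rough sketch of the K-means idea in Python, in case seeing it as code helps. This is a bare-bones illustration and not the program from the lab: the function names, the six toy points, and the choice of two groups are all made up for the example.

```python
import random

def sq_dist(a, b):
    """Squared straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Group points (tuples of numbers) into k clusters.

    Returns (centers, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: sq_dist(p, centers[c]))
        # Update step: each center moves to the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels

# Toy data: three points near (0, 0) and three near (5, 5).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(points, k=2)
print(labels)  # the first three share one label, the last three the other
```

Notice that no labels go in. The algorithm only sees where the points are, and the groups come out of repeating the two steps (assign, then move the centers) until things settle.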

After learning about K-means I went to get data. Dr. Berisha provided me with a link to a database of cough videos, which I'll give you guys here. I am still working on a way to use all of this data efficiently, so I'll let you guys know when I figure it out. Finally, I wrote a little bit of my own code, with Prad's help. I added on to the code I was using before, to get information on the length and frequency of coughs. This is a small step towards the overall program, but it was really exciting (and a little discouraging because I needed to learn more MATLAB syntax).

Overall this was a good week. I got a pretty solid start to my new project, and learned a lot of new things. There are still a few questions that I am not sure how to solve, so I will tell you all about them next week. Until then ... Buh Bye.

Friday, March 24, 2017

Week 7

Hey guys, we're already on week seven. This week, unlike last week, went pretty smoothly, since I had already run into and solved most of the issues last week. The theme of week 7 for me is data; I spent the majority of the time searching for and using new data in the algorithm.

Over the weekend I found a few short videos of people coughing: one from a news blooper, one a montage of coughs from a video game streamer, and one of someone coughing at a piano recital. I then spent the week doing the same thing I did with the Hillary Clinton sample. I converted each one from stereo to mono, labeled all the coughs, converted the text to a csv file, and ran them in the algorithm. After doing all of this I got some pretty good data to compare to the Clinton sample.

Here is a summary. When I get results, the main thing I look at is the errors. They come in a long list of numbers representing every place that the algorithm made a mistake. If the number is greater than the number of coughs in the sample it is a false positive, and if it is less than or equal to the number of coughs it is a false negative.
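That rule is simple enough to write down in a few lines. Here is a small Python sketch of it; the function name and the example numbers are made up for illustration and are not real results.

```python
def split_errors(error_numbers, n_coughs):
    """Sort error numbers into false negatives and false positives.

    Numbers up to and including n_coughs are false negatives (a labeled
    cough was missed); larger numbers are false positives.
    """
    false_negatives = [e for e in error_numbers if e <= n_coughs]
    false_positives = [e for e in error_numbers if e > n_coughs]
    return false_negatives, false_positives

# Hypothetical sample with 8 labeled coughs and four reported errors.
missed, extra = split_errors([2, 5, 9, 14], n_coughs=8)
print(missed)  # [2, 5]
print(extra)   # [9, 14]
```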

In this table, you can see that the Clinton and Piano samples have a lot of false negatives. All three of these examples had false positives, but the Clinton and Mailtis (the name of the news reporter) samples had more. Why did I get these results, though? Looking at the Clinton and Piano samples, I can relate them in that they had background noise; Clinton's had cheering fans and the Piano had a piano. This could be the reason for the false negatives. I am not really sure about the false positives though, and as of now I am not able to look at the audio that the algorithm thought was a cough.

You might be wondering where the video game sample went. There was a slight problem with this one, because there was a cough at the very end of the audio. The algorithm looks at the times before and after the cough, so when there was nothing after it gave an error. I fixed this by adding extra audio to the end, and will have those results soon.
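The fix looks roughly like this in Python. This sketch assumes the extra audio is plain silence and that the sound is stored as a list of mono samples; both are guesses for illustration, and the function name is made up.

```python
def pad_end(samples, sample_rate, seconds=1.0):
    """Return a copy of the samples with `seconds` of silence added at the end."""
    extra = int(round(seconds * sample_rate))
    return samples + [0] * extra

# Hypothetical clip: 3 samples at 8000 samples per second, padded by half a second.
padded = pad_end([1, 2, 3], sample_rate=8000, seconds=0.5)
print(len(padded))  # 4003
```

With the padding in place there is always some audio after the last cough for the algorithm to look at.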

After going through all of this, I got a new task/project. I will be trying to make a new program to tell the difference between types of coughs. This is particularly exciting because no one has tried to solve this, so I will be able to do my own programming from scratch, hopefully.

Well that's all for this week. It was a good week, but not a super interesting one for you guys, because I was just looking at results. Hopefully next week will bring a bunch of new things to talk about, so see you then... Buh Bye.

Friday, March 17, 2017

Week 6

Welcome back everybody, I hope you had a good break and week back from break. For me I was able to relax a lot during the break, but, when I came back, I had a lot of work to do.

Here's a recap of my week four post, because my work this week works directly with it. In week four I made a start on contributing to the cough project. I collected data from a video of Hillary Clinton coughing, converted the audio from stereo to mono, and labeled every cough in the sample. With this I had a text file of the start and stop time of every cough, and the wav file (audio) of the sample.
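If you are curious what going from stereo to mono means, one common way is to average the left and right channels. Here is a tiny Python sketch of that idea; it is an assumption about how the conversion works in general, not a description of the tool used for this project, and the sample values are made up.

```python
def stereo_to_mono(frames):
    """Average each (left, right) pair into a single mono sample."""
    return [(left + right) / 2 for left, right in frames]

# Two made-up stereo frames.
mono = stereo_to_mono([(100, 300), (-50, 50)])
print(mono)  # [200.0, 0.0]
```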

When I came into this week I wanted to have used that data in an algorithm Prad provided to me in MATLAB and have some results. But I ran into some problems. First of all, my version of MATLAB did not have a signal processing add-on, which is kind of important considering the subject I am working on. So my first few attempts failed until I downloaded that add-on.

Next I had to actually apply my data to the algorithm. I decided to test the algorithm on my data, meaning once it had learned what a cough is from other data, it would listen to the audio of my data and compare its results to my labels. While doing this, I ran into some more issues with inputting my data. I had to make sure the program could access the data, so I defined the path to find it (basically telling it to go to a certain folder). Another issue was that I hadn't converted my files to the right format. The program read data in csv (comma-separated values) format instead of text format, so I needed to reformat. It also needed the audio files... silly me didn't realize that the program needed the audio, but it obviously does if it is testing on it...duh.
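The text-to-csv step can be sketched in Python like this. The layout of the label file is an assumption here (one cough per line, with start time, stop time, and an optional label separated by tabs), and the function name and example times are made up.

```python
import csv
import io

def labels_to_csv(label_text):
    """Turn tab-separated 'start, stop, optional label' lines into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        start, stop = float(fields[0]), float(fields[1])
        writer.writerow([start, stop])  # keep only the two times
    return out.getvalue()

# Hypothetical label file with two coughs.
text = "1.25\t1.80\tcough\n3.10\t3.55\tcough\n"
print(labels_to_csv(text))
```

The result is the same start and stop times, just with commas between them so a program expecting csv can read them.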

Well, after all of my silly problems, I finally got results, and they were actually useful to the cough project as a whole. There were a lot of false negatives, or instances where the program didn't think there was a cough, but there actually was. This is helpful, because we can see what these coughs looked like to trip up the algorithm and then adjust so it doesn't mess up again. Now I am in search of more data to learn even more. So if you know of any videos with a lot of coughing or have a sick friend/family member you can record that'd be awesome. :)

It was a rough week for me, with a lot of fine tuning. I realized just how much work has gone into this project already and how hard it is for me to just jump in and try to help. But on the bright side the week is over and I am contributing somewhat. Oh... and just so you know, I got my test back from before the break and I didn't fail, so I guess I am qualified to explain Fourier to you.

That's it for this week, going forward I hope to have less problematic weeks and posts, so until next time... Buh Bye.

Friday, March 10, 2017

Week 5

Hey guys, this week was my spring break, so I don't have anything to update you on. I will have stuff for you next week though, so see you then.
