|
Before applying the Hidden Markov Model or SVM to our movements, we need to extract features out of our 12 screaming signals. If we don’t, we get a mess (dododoudi lolalololadou doulu) because, from the SVM’s point of view, the signal looks totally different from one moment to the next, even though we can see that it is just shifting in time. To take this “change in phase” property into account, we will try to use the information in the frequency domain.

Movement features (Kinemes)

What makes a movement? How can we extract something meaningful out of the signals on the right?

Frame

We should first divide our temporal data into frames, each made of a group of samples (the frames overlap to avoid losing information). In each frame, relevant features could be:
These four elements (
Looking at the values, we can see that they repeat and should not be too hard to learn. The literature on gesture recognition pays great attention to the signal preprocessing part. The authors use Fourier transforms and vector quantization. The first element (Fourier transforms) looks hard for me to implement. The second is

Short-time fast Fourier transforms

picture © Alessio Damato

The Internet is a great place to dig for such information. Here is an article explaining how this works. Wikipedia has an article too: short-time Fourier transform. A nice introduction to Fourier theory as applied to audio processing can be found here.

On the picture you can see an example of a short-time Fourier transform applied to a sinusoidal signal whose frequency changes every 5 seconds, from 10 Hz to 25 Hz, 50 Hz and 100 Hz. The frequency changes are easily identified. The window used by the STFT was 1 second long.

For those interested, here is the formula for the Fourier transform:
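Written in the usual notation, with x(t) the signal and X(f) its transform (these symbol names are assumed here), the standard continuous form and its real part for a real-valued signal are:

```latex
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi i f t}\, dt
\qquad\Longrightarrow\qquad
\operatorname{Re} X(f) = \int_{-\infty}^{\infty} x(t)\, \cos(2\pi f t)\, dt
```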
What this formula says basically is “for a given frequency f, for each time t, sum all the values of the signal when it is like the cosine of frequency f (angular frequency 2πf)”. If the signal is not like this cosine, its positive and negative values will sum up to zero. If the signal is constant, it will also sum up to zero (for any non-zero f). If we get a big value, the frequency plays an important role in the signal. Note: we remove the imaginary part to show this period relationship, not because it is useless…

FFT object

We talked about it and I just made it! It uses the code from Laurent de Soras and was not too hard to implement (thanks!). It’s also fast enough for our 12 signals with a window of 256 samples. The picture on the left shows the raw output from the FFT (without expressing the result in polar coordinates to show amplitude and phase separately, which makes more sense for us).

As you can see, we have nice edges at the frequency at which I was shaking my right leg. We will investigate further to see if we can feed the SVM with this data instead of the raw vector (maybe using the VQ object in between).

Other stuff

Wavelets

After some research and a basic understanding of the FFT (see the relisoft article), I finally understood that the FFT (and particularly the short-time FFT) can be seen as a particular case of wavelet transforms. I found a good tutorial here (in French here). Hmmm, this is too much work for me right now. STOP. |
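To make the short-time Fourier transform discussed above more concrete, here is a minimal sketch in Python with NumPy. It is not the FFT object described earlier (which relies on Laurent de Soras' code); it only rebuilds the test signal from the STFT picture (10 Hz, 25 Hz, 50 Hz and 100 Hz, switching every 5 seconds) and analyses it with a 1 second window. The 400 Hz sample rate, the Hann window, the 50% overlap and the function name `stft_magnitude` are all assumptions made for this illustration.

```python
import numpy as np


def stft_magnitude(signal, window_size, hop):
    """Return the magnitude spectrum of each overlapping frame.

    Rows are frames (time), columns are frequency bins.
    """
    window = np.hanning(window_size)  # taper each frame to limit leakage
    starts = range(0, len(signal) - window_size + 1, hop)
    frames = np.array([signal[s:s + window_size] * window for s in starts])
    # rfft keeps only the non-negative frequencies of a real signal
    return np.abs(np.fft.rfft(frames, axis=1))


# Test signal: a sine whose frequency changes every 5 seconds.
sample_rate = 400          # Hz (assumed, high enough for a 100 Hz sine)
segment_seconds = 5
frequencies = [10, 25, 50, 100]
t = np.arange(segment_seconds * sample_rate) / sample_rate
signal = np.concatenate([np.sin(2 * np.pi * f * t) for f in frequencies])

window_size = sample_rate  # 1 second window
hop = window_size // 2     # 50% overlap between consecutive frames (assumed)

spectrum = stft_magnitude(signal, window_size, hop)
bin_hz = np.fft.rfftfreq(window_size, d=1 / sample_rate)

# Strongest frequency in each frame: a very crude per-frame feature.
dominant = bin_hz[np.argmax(spectrum, axis=1)]

for frame_index in (0, 12, 22, 32):
    start_s = frame_index * hop / sample_rate
    print(f"frame starting at {start_s:4.1f} s: {dominant[frame_index]:.0f} Hz")
```

For frames that lie entirely inside one 5 second segment, the dominant frequency steps through 10, 25, 50 and 100 Hz, which is the staircase visible in the picture. Frames that straddle a change contain both frequencies, which is why a shorter hop gives a finer view of the transitions. Each row of `spectrum` could then serve as a feature vector in place of the raw samples.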