<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Asymptotic Labs (Posts about wavelets)</title><link>http://asymptoticlabs.com/</link><description></description><atom:link href="http://asymptoticlabs.com/categories/wavelets.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2022 &lt;a href="mailto:quidditymaster@gmail.com"&gt;Tim Anderton&lt;/a&gt; </copyright><lastBuildDate>Wed, 31 Aug 2022 21:28:19 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Wavelet Spectrograms for Speech Recognition</title><link>http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div tabindex="-1" id="notebook" class="border-box-sizing"&gt;
    &lt;div class="container" id="notebook-container"&gt;

&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;
&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h2 id="Wavelet-Features-For-Speech-Recognition."&gt;Wavelet Features For Speech Recognition.&lt;a class="anchor-link" href="http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html#Wavelet-Features-For-Speech-Recognition."&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I've been participating in the TensorFlow speech recognition challenge.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.kaggle.com/c/tensorflow-speech-recognition-challenge"&gt;https://www.kaggle.com/c/tensorflow-speech-recognition-challenge&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The most common approach seems to be turning the audio into a spectrogram and then feeding that into a 2D CNN. One trouble with spectrograms is that you have to trade off resolution in frequency against resolution in time. In principle you can get higher resolution in time for higher frequencies than you can for lower frequencies, but once you pick a window length for your short-time Fourier transform, you cannot resolve events much shorter than that window, at any frequency.&lt;/p&gt;
&lt;p&gt;Wavelets are one possible way around this limitation.&lt;/p&gt;&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html"&gt;Read more…&lt;/a&gt; (17 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;
</description><category>audio</category><category>kaggle</category><category>machine learning</category><category>neural networks</category><category>speech recognition</category><category>wavelets</category><guid>http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html</guid><pubDate>Mon, 08 Jan 2018 07:00:00 GMT</pubDate></item></channel></rss>