<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Asymptotic Labs (Posts about machine learning)</title><link>http://asymptoticlabs.com/</link><description></description><atom:link href="http://asymptoticlabs.com/categories/machine-learning.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2022 &lt;a href="mailto:quidditymaster@gmail.com"&gt;Tim Anderton&lt;/a&gt; </copyright><lastBuildDate>Wed, 31 Aug 2022 21:28:49 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>SKM Embedding of MNIST</title><link>http://asymptoticlabs.com/posts/skm-embedding-of-mnist.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;p&gt;I recently thought up a machine learning algorithm called &lt;a href="http://asymptoticlabs.com/asymptoticlabs.com/blog/posts/smooth-kernel-machines.html"&gt;"Smooth Kernel Machines" (SKM)&lt;/a&gt;. In this post I will try out SKM on the ubiquitous MNIST dataset. The goal is not so much to achieve state-of-the-art performance on MNIST (though that would be nice) as simply to try out SKM on a familiar and well-understood dataset.&lt;/p&gt;
&lt;p&gt;tl;dr: I achieve a respectable 0.006 error rate using an SKM-type layer on top of a convolutional neural net feature extractor. An SKM output layer works a little better than a K-way softmax (at least for MNIST). SKM trains faster and comes with an accurate built-in measure of prediction confidence.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/skm-embedding-of-mnist.html"&gt;Read more…&lt;/a&gt; (26 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;
</description><category>embeddings</category><category>machine learning</category><category>MNIST</category><category>SKM</category><guid>http://asymptoticlabs.com/posts/skm-embedding-of-mnist.html</guid><pubDate>Fri, 06 Apr 2018 06:00:00 GMT</pubDate></item><item><title>Wavelet Spectrograms for Speech Recognition</title><link>http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div tabindex="-1" id="notebook" class="border-box-sizing"&gt;
    &lt;div class="container" id="notebook-container"&gt;

&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;
&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h2 id="Wavelet-Features-For-Speech-Recognition."&gt;Wavelet Features For Speech Recognition.&lt;a class="anchor-link" href="http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html#Wavelet-Features-For-Speech-Recognition."&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I've been participating in the TensorFlow speech recognition challenge.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.kaggle.com/c/tensorflow-speech-recognition-challenge"&gt;https://www.kaggle.com/c/tensorflow-speech-recognition-challenge&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;It seems like the most common approach is to begin by turning the audio into a spectrogram and then feeding that into a 2D CNN. One trouble with spectrograms is that you have to trade off resolution in frequency against resolution in time. In principle you can get higher resolution in time for higher frequencies than you can for lower frequencies, but once you pick a window length for your short-time Fourier transform you lose temporal resolution much below that window length.&lt;/p&gt;
&lt;p&gt;Wavelets are one possible way around this limitation.&lt;/p&gt;&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html"&gt;Read more…&lt;/a&gt; (17 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;
</description><category>audio</category><category>kaggle</category><category>machine learning</category><category>neural networks</category><category>speech recognition</category><category>wavelets</category><guid>http://asymptoticlabs.com/posts/waveletSpectrogramsTFSR.html</guid><pubDate>Mon, 08 Jan 2018 07:00:00 GMT</pubDate></item><item><title>Low Rank Approximation On Sparsely Observed Data</title><link>http://asymptoticlabs.com/posts/slra_sparse_obs.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div tabindex="-1" id="notebook" class="border-box-sizing"&gt;
    &lt;div class="container" id="notebook-container"&gt;

&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;
&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h2 id="Intermezzo:-Sparsely-Observed-Data"&gt;Intermezzo: Sparsely Observed Data&lt;a class="anchor-link" href="http://asymptoticlabs.com/posts/slra_sparse_obs.html#Intermezzo:-Sparsely-Observed-Data"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;In the post on &lt;a href="http://asymptoticlabs.com/posts/other_use_for_PCA_part2.html"&gt;using PCA for data imputation&lt;/a&gt; we used a weight for each of our data points. By assigning a weight of 0 to missing data and a weight of 1 to the rest of our data, we got a reasonably good approximation to what we would find using PCA on the dataset without any data missing.&lt;/p&gt;
&lt;p&gt;This is fine when evaluating a dense model for our data matrix is not too much computational overhead. However, when our input data are sparsely observed, that is to say when most of our data consists of missing values, evaluating the model densely is a tremendous waste of computational resources.&lt;/p&gt;&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/slra_sparse_obs.html"&gt;Read more…&lt;/a&gt; (23 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;
</description><category>machine learning</category><category>mathjax</category><category>PCA</category><category>reccommender systems</category><category>sparsity</category><guid>http://asymptoticlabs.com/posts/slra_sparse_obs.html</guid><pubDate>Thu, 26 Oct 2017 06:00:00 GMT</pubDate></item><item><title>Learning TensorFlow via a 3D printing Project</title><link>http://asymptoticlabs.com/posts/TensorFlowSLCPY.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;p&gt;This is a slightly cleaned up version of the slides for a presentation I gave at the SLCPy meetup a while ago. I intended to write something with nice prose that could stand on its own without the verbal commentary that went along with it, but that isn't going to happen, so here are the raw slides.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/TensorFlowSLCPY.html"&gt;Read more…&lt;/a&gt; (17 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;
</description><category>3D printing</category><category>awesomeness</category><category>gradient descent</category><category>machine learning</category><category>mathjax</category><category>TensorFlow</category><guid>http://asymptoticlabs.com/posts/TensorFlowSLCPY.html</guid><pubDate>Thu, 07 Sep 2017 06:00:00 GMT</pubDate></item><item><title>Lattice SVM</title><link>http://asymptoticlabs.com/posts/lattice_svm.html</link><dc:creator>Tim Anderton</dc:creator><description>&lt;div tabindex="-1" id="notebook" class="border-box-sizing"&gt;
    &lt;div class="container" id="notebook-container"&gt;

&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;
&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h2 id="Lattice-SVM"&gt;Lattice SVM&lt;a class="anchor-link" href="http://asymptoticlabs.com/posts/lattice_svm.html#Lattice-SVM"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;A support vector machine (SVM) is a classifier that attempts to find a maximum-margin linear separator for different classes in a very high-dimensional implicit feature space. The feature space is usually not explicitly calculated; instead it is accessed via a kernel function that provides the effective dot product in the feature space, which lets us work with very large implicit feature spaces. In fact, the dimensionality of the implicit feature space of the most commonly used SVM variants is usually quoted as being infinite; the Gaussian kernel is one example. But the high effective feature dimensionality still comes with a high computational cost: we must somehow deal with an N by N matrix of similarities relating all of our training points to each other (the matrix of kernelized "feature dot products").&lt;/p&gt;
&lt;p&gt;&lt;a href="http://asymptoticlabs.com/posts/lattice_svm.html"&gt;Read more…&lt;/a&gt; (18 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/body&gt;&lt;/html&gt;
</description><category>lattices</category><category>machine learning</category><category>mathjax</category><category>SVM</category><guid>http://asymptoticlabs.com/posts/lattice_svm.html</guid><pubDate>Fri, 13 Jan 2017 07:00:00 GMT</pubDate></item></channel></rss>