Visualising Underfitting and Overfitting in High Dimensional Data

Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos.

In the previous workshop we plotted the decision boundary for underfitting and overfitting classifiers. This is great, but very often it is impossible to visualise the data, usually because there are too many dimensions in the dataset.

In this case we need to visualise performance in another way. One way to do this is to produce a validation curve. This is a brute-force approach that repeatedly scores the performance of a model on holdout data for each parameter value that you specify.
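As a sketch of the idea (assuming scikit-learn is available; the classifier, dataset and parameter range here are invented for illustration):

```python
# Hedged sketch of a validation curve: score a k-NN classifier on
# train and holdout folds for each candidate value of n_neighbors.
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.neighbors import KNeighborsClassifier

# A synthetic high-dimensional dataset we cannot easily plot directly.
X, y = make_classification(n_samples=200, n_features=20, random_state=42)
param_range = [1, 3, 5, 9, 15]

# For each parameter value, fit and score across 5 cross-validation folds.
train_scores, test_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors", param_range=param_range, cv=5)

# One row per parameter value, one column per fold.
print(train_scores.shape)  # (5, 5)
```

Plotting the mean of each row of `train_scores` against `test_scores` reveals where the model starts to overfit (train score high, holdout score falling).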

Nearest Neighbour Algorithms

Nearest neighbour algorithms are a class of algorithms that use some measure of similarity. They rely on the premise that observations which are close to each other (when comparing all of the features) are similar to each other.

Making this assumption, we can do some interesting things like:

  • Make recommendations
  • Find similar items

But more crucially, they provide an insight into the character of the data.
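The "closeness" above is just a distance computed across all of the features. A minimal sketch with plain NumPy (the points are made up for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([1.0, 3.0])
c = np.array([8.0, 9.0])

def euclidean(p, q):
    """Straight-line distance, comparing all of the features at once."""
    return np.sqrt(np.sum((p - q) ** 2))

# b is far closer to a than c is, so we treat a and b as more similar.
print(euclidean(a, b))  # 1.0
print(euclidean(a, c))  # ~9.9
```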

K-NN For Classification

In a previous workshop we investigated how the nearest neighbour algorithm uses the concept of distance as a similarity measure.

We can also use this concept of similarity for classification, i.e. a new observation will be classified in the same way as its neighbours.

This is accomplished by finding the most similar observations and setting the predicted class to some combination of the classes of the k nearest neighbours (e.g. the most common).
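A minimal sketch, assuming scikit-learn's `KNeighborsClassifier` on a tiny invented dataset:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated clusters, labelled 0 and 1.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

# Predict using the most common class among the 3 nearest neighbours.
model = KNeighborsClassifier(n_neighbors=3).fit(X, y)
preds = model.predict([[0.5, 0.5], [5.5, 5.5]])
print(preds)  # [0 1]
```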

Overfitting and Underfitting

Imagine you had developed a model that predicts some output. The goal of any model is to generate a correct prediction and avoid incorrect predictions. But how can we be sure that predictions are as good as they can possibly be?

Now constrain your imagining to a classification task (other tasks have similar properties, but I find classification the easiest to reason about). We use some data to train the model. The result of the training process is a decision boundary, i.e. class A on one side, class B on the other.
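The classic symptom of overfitting is a model that scores perfectly on its training data but noticeably worse on holdout data. A hedged sketch (the dataset, classifier and noise level are all assumptions, chosen to make the effect visible):

```python
# An unrestricted decision tree memorises noisy training labels.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# flip_y=0.2 deliberately mislabels 20% of observations (noise).
X, y = make_classification(n_samples=300, n_features=10, flip_y=0.2,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(max_depth=None, random_state=0)
deep.fit(X_train, y_train)

# Perfect on the data it memorised...
print(deep.score(X_train, y_train))  # 1.0
# ...but worse on data it has never seen: the signature of overfitting.
print(deep.score(X_test, y_test))
```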

Support Vector Machines

If you remember from the video training, SVMs are classifiers that attempt to maximise the separation between classes, no matter what the distribution of the data is. This means that they can sometimes fit the noise more than they fit the data.

But because they are aiming to separate classes, they do a really good job at optimising for accuracy. Let’s investigate this below.
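As a quick sketch of the margin-maximising idea (assuming scikit-learn's `SVC`; the linearly separable toy dataset is invented):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel places the boundary to maximise the gap between classes.
model = SVC(kernel="linear", C=1.0).fit(X, y)
preds = model.predict([[0.5, 0.5], [4.5, 4.5]])
print(preds)  # [0 1]

# Only the observations nearest the boundary become support vectors;
# the rest of the data's distribution is ignored.
print(len(model.support_vectors_))
```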

Logistic Regression

I find the name logistic regression annoying. We don’t normally use logistic regression for anything other than classification, but statisticians coined the name long ago.

Despite the name, logistic regression is incredibly useful. Instead of minimising a distance-based error like we did in standard linear regression, we can frame the problem probabilistically: logistic regression attempts to separate classes based upon the probability that an observation belongs to a class.
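A minimal sketch of the probabilistic output, assuming scikit-learn's `LogisticRegression` on an invented one-feature dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Class 0 lives near x=0..2, class 1 near x=8..10.
X = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# The model outputs the probability that a new observation
# belongs to each class, rather than a raw distance.
proba = model.predict_proba([[1.0], [9.0]])
print(proba.round(2))
```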

Linear Classification

We learnt that we can use a linear model (and possibly gradient descent) to fit a straight line to some data. To do this we minimised the mean squared error (the optimisation function, often known as the loss or cost function) between our prediction and the data.

It’s also possible to slightly change the optimisation function to fit the line to separate classes. This is called linear classification.
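As a minimal sketch of that change (assuming scikit-learn; the hinge loss and the toy dataset are illustrative choices, not the only options):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]])
y = np.array([0, 0, 0, 1, 1, 1])

# Same linear model and gradient descent machinery as regression, but
# the loss now penalises misclassification instead of squared distance.
model = SGDClassifier(loss="hinge", random_state=0).fit(X, y)
preds = model.predict([[0.5, 0.5], [4.5, 4.5]])
print(preds)  # [0 1]
```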

Regression: Dealing With Outliers

Outliers are observations that are spurious. You can usually spot outliers visually; they are often far away from the rest of the observations.

Sometimes they are caused by a measurement error, sometimes noise and occasionally they can be observations of interest (e.g. fraud detection).

But outliers skew estimates of the mean and standard deviation, and therefore affect linear models that use error measures which assume normality (e.g. the mean squared error).
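A minimal numeric sketch (the values are invented): a single outlier drags the mean and standard deviation a long way, while a robust statistic like the median barely moves.

```python
import numpy as np

clean = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
with_outlier = np.append(clean, 100.0)

# The mean is pulled far from the bulk of the data by one observation.
print(clean.mean(), with_outlier.mean())          # 3.0 vs ~19.2
# The median is barely affected.
print(np.median(clean), np.median(with_outlier))  # 3.0 vs 3.5
# The standard deviation explodes, distorting any normality assumption.
print(clean.std(), with_outlier.std())
```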

Linear Regression

Regression is a traditional task from statistics that attempts to fit a model to some input data to predict the numerical value of an output. The data is assumed to be continuous.

The goal is to be able to take a new observation and predict the output with minimal error. Some examples might be “what will next quarter’s profits be?” and “how many widgets do we need to stock in order to fulfil demand?”.
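A minimal sketch, assuming scikit-learn's `LinearRegression` and made-up data that follows y = 2x + 1 exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression().fit(X, y)

# Predict the output for a new observation.
print(model.predict([[5.0]]))         # ≈ [11.]
# The fitted slope and intercept recover the generating line.
print(model.coef_, model.intercept_)  # ≈ [2.] 1.0
```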

Introduction to Gradient Descent

For only a few algorithms does an analytical solution exist. For example, we can use the Normal Equation to solve a linear regression problem directly.

However, for most algorithms we cannot solve the problem analytically, usually because no closed-form solution to the equation exists. So instead we have to try something else.
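A sketch contrasting the two routes with NumPy only, on made-up data following y = 2x + 1 (the learning rate and iteration count are assumptions chosen to converge here, not universal settings):

```python
import numpy as np

# Design matrix: a bias column of ones plus one feature.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Analytical route: the Normal Equation, theta = (X^T X)^-1 X^T y.
theta_exact = np.linalg.inv(X.T @ X) @ X.T @ y

# Iterative route: gradient descent on the mean squared error.
theta = np.zeros(2)
for _ in range(5000):
    gradient = 2 / len(y) * X.T @ (X @ theta - y)
    theta -= 0.05 * gradient  # step downhill along the gradient

print(theta_exact.round(3))  # [1. 2.]  (intercept, slope)
print(theta.round(3))        # converges to the same answer
```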
