AI, Machine Learning, Reinforcement Learning, and MLOps Articles

Visualising Underfitting and Overfitting in High Dimensional Data

Published: Dec 20, 2017
Author

Visualising Underfitting and Overfitting in High Dimensional Data Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. In the previous workshop we plotted the decision boundary for underfitting and overfitting classifiers. This is great, but very often it is impossible to visualise the data, usually because there are too many dimensions in the dataset. In this case we need to visualise performance in another way.

Published: Dec 20, 2017
Author

Nearest Neighbour Algorithms Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. Nearest neighbour algorithms are a class of algorithms that use some measure of similarity. They rely on the premise that observations which are close to each other (when comparing all of the features) are similar to each other. Making this assumption, we can do some interesting things, like making recommendations and finding similar items. But more crucially, they provide an insight into the character of the data.
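As a rough illustration of the premise above, the sketch below ranks observations by their distance to a query point. Euclidean distance is only one possible similarity measure, and the function names and sample data are made up for this example rather than taken from the workshop.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two observations, compared feature by feature."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query, observations, k=1):
    """Return the k observations closest to the query (hypothetical helper)."""
    ranked = sorted(observations, key=lambda obs: euclidean(query, obs))
    return ranked[:k]


# Two loose groups of made-up two-feature observations.
observations = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (4.8, 5.3)]
closest = nearest_neighbours((1.1, 1.0), observations, k=2)
print(closest)
```

The two returned observations are the ones a "find similar items" feature would suggest for the query.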

Published: Dec 20, 2017
Author

K-NN For Classification Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. In a previous workshop we investigated how the nearest neighbour algorithm uses the concept of distance as a similarity measure. We can also use this concept of similarity as a classification metric. That is, a new observation is classified the same as its neighbours. This is accomplished by finding the most similar observations and setting the predicted classification as some combination of the k-nearest neighbours.
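A minimal sketch of that idea follows, assuming Euclidean distance and a simple majority vote as the "combination" of the k nearest neighbours. Other combinations, such as distance-weighted votes, are equally valid. The function name and data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, points, labels, k=3):
    """Predict a label for query by majority vote among its k nearest points."""
    # Indices of the training points, ordered from closest to farthest.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated classes.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.1), (4.8, 5.3), (5.2, 4.9)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify((1.1, 1.0), points, labels))
print(knn_classify((5.0, 5.0), points, labels))
```

Using an odd k for a two-class problem avoids tied votes.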

Published: Dec 4, 2017
Author: Dr. Phil Winder
CEO

https://prometheus.io is an open source time series database that focuses on capturing measurements and exposing them via an API. I love Prometheus because it is so simple; its minimalism is its greatest feature. It achieves this by pulling metrics from instrumented applications, not having them pushed like many of its competitors. In other words, Prometheus “scrapes” the metrics from the application.

This means that it works very well in a distributed, cloud-native environment. None of the services are burdened by load on the monitoring system. This has knock-on effects: HA is supported through simple duplication and scaling is supported through segmentation.
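To make the pull model concrete, here is a minimal sketch of an instrumented application that exposes a counter for Prometheus to scrape. It uses only the Python standard library and writes the text exposition format by hand. The metric name, port, and helper names are assumptions for illustration; a real service would more likely use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter that the application increments as it works.
REQUESTS_HANDLED = {"count": 0}


def render_metrics(count):
    """Render one counter in the Prometheus text exposition format."""
    return (
        "# HELP app_requests_total Total requests handled by the app.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {count}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS_HANDLED["count"]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on /metrics (the port is an assumption)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()

# Call serve() to start the endpoint; Prometheus would then be configured
# to scrape http://<host>:8000/metrics on its own schedule.
```

The application only answers requests. It never needs to know where the monitoring system lives, which is why duplicating Prometheus servers for HA requires no change to the services.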

Published: Nov 30, 2017
Author: Dr. Phil Winder
CEO

What do you mean by monitoring? Why do you need it? What are the real needs and are you monitoring them? Ask yourself these questions. Can you answer them? If not, you’re probably doing monitoring wrong.

This post asks the basic question: what is monitoring? How does it compare to logging and tracing? Let’s find out.

Published: Nov 26, 2017
Author

Underfitting and Overfitting Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. Imagine you had developed a model that predicts some output. The goal of any model is to generate a correct prediction and avoid incorrect predictions. But how can we be sure that predictions are as good as they can possibly be? Now constrain your imagining to a classification task (other tasks have similar properties but I find classification easiest to reason about).

Published: Nov 24, 2017
Author

Support Vector Machines Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. If you remember from the video training, SVMs are classifiers that attempt to maximise the separation between classes, no matter what the distribution of the data. This means that they can sometimes fit noise more than they fit the data. But because they are aiming to separate classes, they do a really good job at optimising for accuracy.

Published: Nov 22, 2017
Author

Logistic Regression Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. I find the name logistic regression annoying. We don’t normally use logistic regression for anything other than classification; but statistics coined the name long ago. Despite the name, logistic regression is incredibly useful. Instead of optimising the error of the distance like we did in standard linear regression, we can frame the problem probabilistically.
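One way to read "framing the problem probabilistically" is sketched below: a linear score is squashed into a probability with the logistic (sigmoid) function, and predictions are judged by log loss rather than squared distance. The weights and data are made up, and this is an illustration under those assumptions, not the workshop's own code.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability that observation x belongs to class 1 under a linear score."""
    return sigmoid(w * x + b)


def log_loss(ys, ps):
    """Average negative log-likelihood of the true labels given the probabilities."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(ys, ps)
    ) / len(ys)


# Made-up one-feature data: small x values are class 0, large x values are class 1.
xs = [0.5, 1.0, 3.0, 3.5]
ys = [0, 0, 1, 1]

# Hypothetical parameters: a boundary near x = 2 versus a boundary that is too far left.
good = [predict_proba(x, w=2.0, b=-4.0) for x in xs]
poor = [predict_proba(x, w=2.0, b=0.0) for x in xs]
print(log_loss(ys, good), log_loss(ys, poor))
```

The parameters that place the boundary between the two groups give the lower loss, which is the quantity a fitting procedure would minimise.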

Published: Nov 22, 2017
Author

Linear Classification Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. We learnt that we can use a linear model (and possibly gradient descent) to fit a straight line to some data. To do this we minimised the mean-squared-error (often known as the optimisation/loss/cost function) between our prediction and the data. It’s also possible to slightly change the optimisation function to fit the line to separate classes.
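For reference, the straight-line fit described above can be sketched as plain gradient descent on the mean-squared-error. The learning rate, step count, and data here are arbitrary choices for illustration only.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean-squared-error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Signed error of the current prediction for every observation.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean-squared-error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up noiseless data drawn from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Swapping the squared-error term for a loss that penalises points on the wrong side of the line is what turns the same procedure into a classifier.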

Published: Nov 17, 2017
Author

Regression: Dealing with Outliers Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos. Outliers are observations that are spurious. You can usually spot outliers visually; they are often far away from the rest of the observations. Sometimes they are caused by a measurement error, sometimes noise and occasionally they can be observations of interest (e.g. fraud detection). But outliers skew the estimates of the mean and standard deviation and therefore affect linear models that use error measures that assume normality (e.

Visualising Underfitting and Overfitting in High Dimensional Data

Nearest Neighbour Algorithms

K-NN For Classification

Introduction to Monitoring Microservices with Prometheus

Logging vs Tracing vs Monitoring

Overfitting and Underfitting

Support Vector Machines

Logistic Regression

Linear Classification

Regression: Dealing With Outliers