503: Visualising Overfitting in High Dimensional Problems
Often we can't visualise the decision boundary because there are too many dimensions. This training video shows you how to visualise overfitting when you have a high dimensional problem.
To prevent fooling ourselves into thinking we are doing well, we hold back some data. Learn how in this training video.
Overfitting is one of the most common data science problems. This training video explains why we should generalise and how to prevent overfitting.
Sometimes linear decision boundaries aren't complex enough to perform well. This training video shows you how to transform your data to produce nonlinear boundaries.
This training video explains how the most popular linear classification algorithms work.
This training video explains why we need gradient descent and how it works.
This section introduces linear models for regression. Later we will extend our use of linear models to classification.
In the previous video we discussed what bad data is. In this video we discuss how we can alter the data to improve it. Techniques range from rescaling and transforming data to creating new features from scratch.
In this video we introduce the topic of data engineering. Understanding your data is vitally important: it is the raw material you use to create results. The common phrase 'garbage in, garbage out' (excuse my American!) summarises how good or poor data can make or break data science projects and entire products. We will discuss common pitfalls and introduce steps to overcome them.
This video shows an example of using segmentation on categorical data. Along the way, we'll find that we have derived our first important machine learning algorithm: a decision tree.