AI, Machine Learning, Reinforcement Learning, and MLOps Articles

Learn more about AI, machine learning, reinforcement learning, and MLOps with our insight-packed articles. Our AI blog covers the industrial use of AI, the machine learning blog is more technical, the reinforcement learning blog is industrially renowned, and our MLOps blog discusses operational ML.

102: How to do a Data Science Project

Published
Author

Problems in Data Science

  • Understanding the problem

  • “the five-whys”

  • Different questions dramatically affect the tools and techniques used to solve the problem.


Data Science as a Process

  • More Science than Engineering
Research Problem Model

  • High risk
  • High reward
  • Difficult
  • Unpredictable

CRISP-DM Process

By Kenneth Jensen CC BY-SA 3.0, via Wikimedia Commons

Read more

101: Why Data Science?

What is Data Science?

  • Software Engineering, Maths, Automation, Data

  • A.k.a: Machine Learning, AI, Big Data, etc.

  • Its current rise in popularity is due to more data and more computing power.

For more information: https://winderresearch.com/what-is-data-science/


Examples

US Supermarket Giants

  • Target: Optimising Marketing using customer spending data.

  • Walmart: Predicting demand ahead of a natural disaster.


Discovery

  • Most projects are “Discovery Projects”.

  • Primary business goals: increase revenue, reduce costs, save time.

  • Budgets can come from other parts of the business.

Read more

Testing Model Robustness with Jitter

Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos.

To test whether your models are robust to changes, one simple test is to add some noise to the test data. When we alter the magnitude of the noise, we can infer how well the model will perform with new data and different sources of noise.

In this example we’re going to add some random, normally-distributed noise, but it doesn’t have to be normally distributed! Maybe you could add some bias, or add some other type of trend!
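The jitter test described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the workshop's own code: the nearest-centroid "model", the synthetic one-dimensional data, and the helper names (`fit_centroids`, `jitter_scores`, `make_data`) are all assumptions made for the example.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid 'model': the mean feature value for each class."""
    centroids = {}
    for label in set(y):
        values = [x for x, t in zip(X, y) if t == label]
        centroids[label] = sum(values) / len(values)
    return centroids

def accuracy(centroids, X, y):
    """Fraction of points whose closest centroid matches the true label."""
    hits = 0
    for x, t in zip(X, y):
        predicted = min(centroids, key=lambda label: abs(x - centroids[label]))
        hits += predicted == t
    return hits / len(y)

def jitter_scores(centroids, X_test, y_test, sigmas, seed=0):
    """Score the model on test data with N(0, sigma) noise added."""
    rng = random.Random(seed)
    scores = {}
    for sigma in sigmas:
        noisy = [x + rng.gauss(0.0, sigma) for x in X_test]
        scores[sigma] = accuracy(centroids, noisy, y_test)
    return scores

def make_data(n, seed):
    """Hypothetical data: two classes centred on 0 and 4 (assumption)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n)] + [rng.gauss(4.0, 1.0) for _ in range(n)]
    y = [0] * n + [1] * n
    return X, y

X_train, y_train = make_data(200, seed=1)
X_test, y_test = make_data(200, seed=2)
model = fit_centroids(X_train, y_train)

scores = jitter_scores(model, X_test, y_test, sigmas=[0.0, 0.5, 1.0, 2.0, 4.0])
for sigma, score in scores.items():
    print(f"noise sigma={sigma:>3}: accuracy={score:.3f}")
```

How quickly the accuracy falls as sigma grows gives a rough feel for how sensitive the model is to noisy inputs. Swapping `rng.gauss` for a constant offset would test robustness to bias instead.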

Read more

Quantitative Model Evaluation

We need to be able to compare models for a range of tasks. The most common use case is to decide whether changes to your model improve performance. Typically we want to visualise this, and we will in another workshop, but first we need to establish some quantitative measures of performance.
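As a rough illustration of what such quantitative measures look like, the sketch below computes accuracy, precision, recall and F1 for two sets of predictions. The labels and predictions are made up for the example, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels and predictions from two versions of a model.
y_true        = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
baseline_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
improved_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

baseline = classification_metrics(y_true, baseline_pred)
improved = classification_metrics(y_true, improved_pred)
for name in baseline:
    print(f"{name:>9}: {baseline[name]:.2f} -> {improved[name]:.2f}")
```

Comparing the same metrics before and after a change is the simplest way to decide whether the change helped.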

Read more

Qualitative Model Evaluation - Visualising Performance

Being able to evaluate models numerically is really important for optimisation tasks. However, performing a visual evaluation provides two main benefits:

  • Easier to spot mistakes
  • Easier to explain to other people

It is so easy to miss a gross error when looking at summary statistics alone. Always visualise your data/results!
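A classic illustration of this point is Anscombe's quartet, which is not part of the original workshop but shows the problem well. The sketch below takes the first two of its four datasets: their means and variances are almost identical, yet a crude text plot (the hypothetical `text_plot` helper) shows that one is roughly linear and the other is a curve.

```python
import statistics

# First two datasets of Anscombe's quartet (they share the same x values).
x  = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def text_plot(xs, ys, width=40):
    """Crude scatter plot: one row per x value, '*' placed according to y."""
    lo, hi = min(ys), max(ys)
    for x_val, y_val in sorted(zip(xs, ys)):
        col = round((y_val - lo) / (hi - lo) * (width - 1))
        print(f"x={x_val:>2} |" + " " * col + "*")

for name, ys in [("y1", y1), ("y2", y2)]:
    print(f"{name}: mean={statistics.mean(ys):.2f}, "
          f"variance={statistics.variance(ys):.2f}")
    text_plot(x, ys)
    print()
```

In practice you would use a proper plotting library, but even this text output makes the difference in shape obvious where the summary statistics do not.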

Read more

Hierarchical Clustering - Agglomerative

Hierarchical Clustering - Agglomerative Clustering

Clustering is an unsupervised task. In other words, we don’t have any labels or targets. This is common when you receive questions like “what can we do with this data?” or “can you tell me the characteristics of this data?”.

There are quite a few different ways of performing clustering, but one way is to form clusters hierarchically. You can form a hierarchy in two ways: start from the top and split, or start from the bottom and merge.
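The bottom-up (agglomerative) approach can be sketched as follows. This is a deliberately naive illustration, assuming single linkage and Euclidean distance. The function names and the toy points are hypothetical, and a real project would normally use a library implementation.

```python
import math
from itertools import combinations

def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, n_clusters):
    """Start with one cluster per point and merge until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters that are closest together...
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # ...and merge them (i < j, so deleting j leaves index i valid).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 2-D points: two tight groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0)]
clusters = agglomerate(points, n_clusters=3)
for cluster in clusters:
    print(sorted(cluster))
```

Recording the order and distance of each merge, rather than stopping at a fixed number of clusters, would give the full hierarchy.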

Read more

Evidence, Probabilities and Naive Bayes

Bayes' rule is one of the most useful results in statistics. It allows us to estimate probabilities that would otherwise be impossible to measure directly.

In this worksheet we look at Bayes' rule at a basic level, then try a naive Bayes classifier.

Bayes' Rule

For more intuition about Bayes' rule, make sure you check out the training.
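As a small worked example of the rule, the sketch below computes a posterior probability from a prior and two conditional probabilities. The screening-test numbers are invented for illustration and the function name `posterior` is hypothetical.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(A | B) via Bayes' rule.

    prior               = P(A)
    likelihood          = P(B | A)
    false_positive_rate = P(B | not A)
    """
    # Total probability of observing the evidence B.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% of people have a condition, the test detects it
# 90% of the time, and it wrongly flags 5% of people without the condition.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays low here because the prior is small, which is the kind of result that is hard to guess without the rule.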

Read more

Detrending Seasonal Data

statsmodels is a comprehensive library for time series analysis, and it has a really neat set of functions for detrending data. If any of your features show time-dependent trends, give it a try.

It’s essentially fitting the multiplicative model:

$y(t) = \text{Level} \times \text{Trend} \times \text{Seasonality} \times \text{Noise}$
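One way to read the multiplicative model, assuming every component is strictly positive: taking logarithms turns the product into a sum, and removing the trend amounts to dividing by it. The symbol $y_{\text{detrended}}$ is introduced here for illustration only.

```latex
\log y(t) = \log \text{Level} + \log \text{Trend} + \log \text{Seasonality} + \log \text{Noise}

y_{\text{detrended}}(t) = \frac{y(t)}{\text{Trend}} = \text{Level} \times \text{Seasonality} \times \text{Noise}
```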

Read more

Visualising Underfitting and Overfitting in High Dimensional Data

In the previous workshop we plotted the decision boundary for underfitting and overfitting classifiers. This is great, but very often it is impossible to visualise the data, usually because there are too many dimensions in the dataset.

In this case we need to visualise performance in another way. One way to do this is to produce a validation curve. This is a brute-force approach that repeatedly scores the performance of a model on holdout data for each parameter value that you specify.
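The brute-force idea can be sketched without any libraries. The example below is an assumption-laden illustration, not the workshop's code: it uses a tiny one-dimensional k-nearest-neighbour classifier, synthetic overlapping classes, and hypothetical helper names, and it scores each value of `k` on both the training data and the holdout data.

```python
import random

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda pair: abs(pair[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def score(X_train, y_train, X, y, k):
    """Accuracy of the k-NN classifier on (X, y)."""
    hits = sum(knn_predict(X_train, y_train, x, k) == t for x, t in zip(X, y))
    return hits / len(y)

def holdout_curve(X_train, y_train, X_hold, y_hold, k_values):
    """Return (k, training score, holdout score) for every k."""
    return [
        (k,
         score(X_train, y_train, X_train, y_train, k),
         score(X_train, y_train, X_hold, y_hold, k))
        for k in k_values
    ]

def make_data(n, seed):
    """Hypothetical overlapping classes centred on 0 and 2 (assumption)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n)] + [rng.gauss(2.0, 1.0) for _ in range(n)]
    return X, [0] * n + [1] * n

X_train, y_train = make_data(100, seed=1)
X_hold, y_hold = make_data(100, seed=2)

# Odd values of k avoid tied votes with two classes.
curve = holdout_curve(X_train, y_train, X_hold, y_hold, k_values=[1, 5, 15, 51])
for k, train_score, hold_score in curve:
    print(f"k={k:>2}: train={train_score:.3f}  holdout={hold_score:.3f}")
```

A large gap between the training and holdout scores (as at `k=1`, where every training point is its own nearest neighbour) suggests overfitting, while low scores on both suggest underfitting. Plotting the two columns against the parameter gives the validation curve.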

Read more

Nearest Neighbour Algorithms

Nearest neighbour algorithms are a class of algorithms that use some measure of similarity. They rely on the premise that observations which are close to each other (when comparing all of the features) are similar to each other.

Making this assumption, we can do some interesting things like:

  • Recommendations
  • Finding similar items

But more crucially, they provide an insight into the character of the data.
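As a minimal sketch of the "find similar items" idea, the example below ranks items by Euclidean distance across all of their features. The film names, feature values, and the function name `nearest_neighbours` are hypothetical.

```python
import math

def nearest_neighbours(items, query_name, k=2):
    """Return the names of the k items closest to the query item."""
    query = items[query_name]
    distances = [
        (math.dist(query, features), name)
        for name, features in items.items()
        if name != query_name
    ]
    return [name for _, name in sorted(distances)[:k]]

# Hypothetical features per film: (action, romance, comedy) scores.
films = {
    "Film A": (0.90, 0.10, 0.20),
    "Film B": (0.80, 0.20, 0.30),
    "Film C": (0.10, 0.90, 0.40),
    "Film D": (0.20, 0.80, 0.60),
    "Film E": (0.85, 0.15, 0.10),
}

similar = nearest_neighbours(films, "Film A", k=2)
print(similar)
```

The same lookup is the basis of a simple recommender: suggest the items closest to something the user already liked. Because the distance compares all features at once, features on very different scales usually need to be normalised first.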

Read more