AI, Machine Learning, Reinforcement Learning, and MLOps Articles

Learn more about AI, machine learning, reinforcement learning, and MLOps with our insight-packed articles. Our AI blog delves into the industrial use of AI, the machine learning blog is more technical, the reinforcement learning blog is industrially renowned, and our MLOps blog discusses operational ML.

301: Data Engineering

Your job depends on your data

The goal of this section is to:

  • Talk about what data is and the context provided by your domain
  • Discover how to massage data to produce the best results
  • Find out how and where we can discover new data

If you have inadequate data you will not be able to succeed in any data science task.

More generally, I want you to focus on your data. It is necessary to understand your data before building a model.

Read more

203: Examples and Decision Trees

Example: Segmentation via Information Gain

There’s a fairly famous dataset called the “mushroom dataset”.

It describes whether mushrooms are edible or not, depending on an array of features.

The nice thing about this dataset is that the features are all categorical.

So we can go through and segment the data for each value in a feature.

This is some example data:

poisonous  cap-shape  cap-surface  cap-color  bruises?
p          x          s            n          t
e          x          s            y          t
e          b          s            w          t
p          x          y            w          t
e          x          s            g          f

etc.
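As a rough illustration of segmenting by feature value, the sketch below computes the information gain of each feature using only the five example rows shown above. It assumes the usual entropy-based definition of information gain; the names `entropy`, `information_gain`, `COLUMNS` and `ROWS` are made up for this example and are not taken from the original workshop.

```python
from collections import Counter
from math import log2

# The five example rows from the table above (assumed column order).
COLUMNS = ["poisonous", "cap-shape", "cap-surface", "cap-color", "bruises?"]
ROWS = [
    ("p", "x", "s", "n", "t"),
    ("e", "x", "s", "y", "t"),
    ("e", "b", "s", "w", "t"),
    ("p", "x", "y", "w", "t"),
    ("e", "x", "s", "g", "f"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature_index, target_index=0):
    """Entropy of the target minus the weighted entropy after segmenting."""
    parent = entropy([row[target_index] for row in rows])
    # Segment the target labels by each value of the chosen feature.
    segments = {}
    for row in rows:
        segments.setdefault(row[feature_index], []).append(row[target_index])
    weighted = sum(len(s) / len(rows) * entropy(s) for s in segments.values())
    return parent - weighted


# Information gain of every feature with respect to the "poisonous" column.
gains = {COLUMNS[i]: information_gain(ROWS, i) for i in range(1, len(COLUMNS))}
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {gain:.3f}")
```

With so few rows the ranking says little about real mushrooms. A feature with many distinct values, such as cap-color, tends to score well on tiny samples simply because it splits the data into very small segments.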

Read more

202: Segmentation For Classification

Segmentation

So let’s walk through a very visual, intuitive example to help describe what all data science algorithms are trying to do.

This will seem quite complicated if you’ve never done anything like this before. That’s ok!

I want to do this to show you that all algorithms that you’ve ever heard of have some very basic assumption about what they are trying to do.

At the end of this, we will have completely derived one very important type of classifier.

Read more

201: Basics and Terminology

The ultimate goal

First, let’s discuss the goal. What is the goal?

  • The goal is to make a decision or a prediction

Based upon what?

  • Information

How can we improve the quality of the decision or prediction?

  • The quality of the solution is defined by the certainty represented by the information.

Think about this for a moment. It’s a key insight. Think about your projects. Your research. The decisions you make. They are all based upon some information. And you can make better decisions when you have more good quality information.

Read more

102: How to do a Data Science Project

Problems in Data Science

  • Understanding the problem

  • “the five-whys”

  • Different questions dramatically affect the tools and techniques used to solve the problem.


Data Science as a Process

  • More Science than Engineering
Research Problem Model

  • High risk
  • High reward
  • Difficult
  • Unpredictable

CRISP-DM Process

By Kenneth Jensen CC BY-SA 3.0, via Wikimedia Commons

Read more

101: Why Data Science?

What is Data Science?

  • Software Engineering, Maths, Automation, Data

  • A.k.a: Machine Learning, AI, Big Data, etc.

  • Its current rise in popularity is due to more data and more computing power.

For more information: https://winderresearch.com/what-is-data-science/


Examples

US Supermarket Giants

  • Target: Optimising Marketing using customer spending data.

  • Walmart: Predicting demand ahead of a natural disaster.


Discovery

  • Most projects are “Discovery Projects”.

  • Primary business goals: increase revenue, reduce costs, save time.

  • Budgets can come from other parts of the business.

Read more

Testing Model Robustness with Jitter

Welcome! This workshop is from Winder.ai. Sign up to receive more free workshops, training and videos.

To test whether your models are robust to changes, one simple test is to add some noise to the test data. When we alter the magnitude of the noise, we can infer how well the model will perform with new data and different sources of noise.

In this example we’re going to add some random, normally-distributed noise, but it doesn’t have to be normally distributed! Maybe you could add some bias, or add some other type of trend!
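A minimal sketch of the idea, using only the Python standard library. The `predict` function is a hypothetical stand-in for a trained model, and the synthetic test set, the noise scales and the helper names (`accuracy`, `jitter`) are assumptions made for illustration rather than the workshop's own code.

```python
import random


def predict(x):
    """Hypothetical stand-in for a trained model: a fixed threshold rule."""
    return 1 if x > 0.0 else 0


def accuracy(features, labels):
    """Fraction of observations that the model classifies correctly."""
    correct = sum(predict(x) == y for x, y in zip(features, labels))
    return correct / len(labels)


def jitter(features, scale, rng):
    """Add zero-mean, normally-distributed noise to every feature value."""
    return [x + rng.gauss(0.0, scale) for x in features]


rng = random.Random(42)  # seeded so that the run is repeatable

# Synthetic test set: class 0 is centred on -1 and class 1 on +1.
labels = [i % 2 for i in range(1000)]
features = [(1.0 if y else -1.0) + rng.gauss(0.0, 0.3) for y in labels]

# Score the same model on increasingly noisy copies of the test data.
results = {}
for scale in [0.0, 0.5, 1.0, 2.0]:
    results[scale] = accuracy(jitter(features, scale, rng), labels)
    print(f"noise scale {scale:.1f}: accuracy {results[scale]:.3f}")
```

How quickly the accuracy falls as the noise scale grows gives a rough indication of robustness. Swapping `rng.gauss` for a constant offset or a trend would test sensitivity to bias instead.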

Read more

Quantitative Model Evaluation

We need to be able to compare models for a range of tasks. The most common use case is to decide whether changes to your model improve performance. Typically we want to visualise this, and we will in another workshop, but first we need to establish some quantitative measures of performance.
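As one possible illustration, the sketch below compares two hypothetical classifiers on the same labels using accuracy, precision, recall and F1. The label vectors are invented for the example and the function names (`confusion_counts`, `scores`) are assumptions, not taken from the workshop.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented labels and predictions from two hypothetical models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
model_b = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

scores_a = scores(y_true, model_a)
scores_b = scores(y_true, model_b)
print("model A:", scores_a)
print("model B:", scores_b)
```

Both models reach the same accuracy here, yet model B recalls more of the positive class at the cost of slightly lower precision. This is why a single number is rarely enough to decide whether a change improved a model.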

Read more

Qualitative Model Evaluation - Visualising Performance

Being able to evaluate models numerically is really important for optimisation tasks. However, performing a visual evaluation provides two main benefits:

  • Easier to spot mistakes
  • Easier to explain to other people

It is so easy to miss a gross error when looking at summary statistics alone. Always visualise your data/results!

Read more

Hierarchical Clustering - Agglomerative

Hierarchical Clustering - Agglomerative Clustering

Clustering is an unsupervised task. In other words, we don’t have any labels or targets. This is common when you receive questions like “what can we do with this data?” or “can you tell me the characteristics of this data?”.

There are quite a few different ways of performing clustering, but one way is to form clusters hierarchically. You can form a hierarchy in two ways: start from the top and split, or start from the bottom and merge.
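The bottom-up (agglomerative) variant can be sketched in a few lines of plain Python. This is a naive illustration that assumes single linkage and Euclidean distance; the points and the function names (`single_linkage`, `agglomerate`) are invented for the example, and a real project would more likely use a library implementation.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def single_linkage(a, b):
    """Distance between two clusters: the gap between their closest members."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, n_clusters):
    """Start with one cluster per point and merge the closest pair repeatedly."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters


# Invented 2D points forming three visually obvious groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
clusters = agglomerate(points, 3)
for cluster in clusters:
    print(cluster)
```

Recording the distance at which each merge happens would give the hierarchy itself (a dendrogram), rather than a single flat set of clusters. The top-down alternative works in reverse: it starts with one cluster containing everything and splits it.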

Read more