Winder.AI Blog

Industrial AI insight about machine learning, reinforcement learning, MLOps, and more...

COVID-19 Logistic Bayesian Model

Apr 2020, by Phil Winder, in Data Science

This post builds upon the exponential model created in a previous post. The main issue was that an exponential model does not include a limit, so it grows without bound. A logistic model introduces this limit. I also perform some very basic backtesting and future prediction.
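The difference between the two curve shapes can be sketched in a few lines of plain Python. This is only an illustration, not the Bayesian model from the post: the function names and every parameter value (`a`, `r`, `limit`, `t_mid`) are assumptions chosen for the example.

```python
import math


def exponential(t, a, r):
    """Unbounded growth: a * exp(r * t) keeps increasing forever."""
    return a * math.exp(r * t)


def logistic(t, limit, r, t_mid):
    """Growth that saturates at `limit`; `t_mid` is the inflection point."""
    return limit / (1.0 + math.exp(-r * (t - t_mid)))


# Hypothetical values, picked only to show the two shapes.
days = [0, 20, 40, 80]
exp_curve = [exponential(t, a=1.0, r=0.2) for t in days]
log_curve = [logistic(t, limit=1000.0, r=0.2, t_mid=40.0) for t in days]

for t, e, l in zip(days, exp_curve, log_curve):
    print(f"day {t:>2}: exponential={e:,.1f}  logistic={l:,.1f}")
```

With these values the exponential curve passes the limit and keeps climbing, while the logistic curve reaches half the limit at `t_mid` and then flattens out just below it.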

COVID-19 Exponential Bayesian Model

Apr 2020, by Phil Winder, in Data Science

The purpose of this notebook is to provide initial experience with the pymc3 library for modeling and forecasting COVID-19 virus summary statistics. This model is very simple, and therefore not very accurate, but it serves as a good introduction to the topic.

How to Start a Data Science Project With No or Little Data

Feb 2020, by Hajar Khizou, in Data Science

Data is an essential asset of modern business. It empowers companies by surfacing unique insights about their customers and creates actionable products. The more data you possess, the better you meet and exceed your customers' expectations.

Keep it Clean: Why Bad Data Ruins Projects and How to Fix it - NDC London

Jan 2020, in Data Science, Talk

Abstract: The Internet is full of examples of how to train models. But the reality is that industrial projects spend the majority of the time working with data. The largest improvements in performance can often be found through improving the underlying data. Bad data is costing the US economy an estimated 3.1 trillion dollars, and approximately 27% of data is flawed in the world’s top companies.

Keep it Clean: Why Bad Data Ruins Projects and How to Fix it - GOTO Berlin

Oct 2019, by Phil Winder, in Talk, Data Science

Abstract: The Internet is full of examples of how to train models. But the reality is that industrial projects spend the majority of the time working with data. The largest improvements in performance can often be found through improving the underlying data. Bad data is costing the US economy an estimated 3.1 trillion dollars, and approximately 27% of data is flawed in the world’s top companies. Bad data also contributes to the failure of many Data Science projects.

Fast Time-Series Filters in Python

Oct 2019, by Phil Winder, in Machine Learning

Time-series (TS) filters are often used in digital signal processing for distributed acoustic sensing (DAS). The goal is to remove a subset of frequencies from a digitised TS signal. To filter a signal you must touch all of the data and perform a convolution. This is a slow process when you have a large amount of data. The purpose of this post is to investigate which filters are fastest in Python.
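To make "touch all of the data and perform a convolution" concrete, here is a minimal pure-Python sketch of a moving-average low-pass filter. It is not one of the implementations benchmarked in the post; the helper names `convolve_valid` and `moving_average_kernel` and the toy signal are assumptions made for illustration.

```python
def convolve_valid(signal, kernel):
    """Direct convolution, keeping only positions where the kernel fully overlaps."""
    n, k = len(signal), len(kernel)
    if k > n:
        return []
    flipped = kernel[::-1]  # convolution flips the kernel before sliding it
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def moving_average_kernel(width):
    """Equal weights summing to 1: a very simple low-pass filter."""
    return [1.0 / width] * width


# Toy signal alternating every sample, i.e. the highest representable frequency.
signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = convolve_valid(signal, moving_average_kernel(2))
print(smoothed)
```

Every output sample needs `len(kernel)` multiply-adds, so the cost grows with both the signal length and the kernel length. That is why the choice of implementation matters for large datasets.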

A Comparison of Reinforcement Learning Frameworks: Dopamine, RLLib, Keras-RL, Coach, TRFL, Tensorforce and more

Jul 2019, by Phil Winder, in Reinforcement Learning

Reinforcement Learning (RL) frameworks help engineers by creating higher-level abstractions of the core components of an RL algorithm. This makes code easier to develop and read, and improves efficiency.

But choosing a framework introduces some amount of lock-in. An investment in learning and using a framework can make it hard to break away. This is just like when you decide which pub to visit. It’s very difficult not to buy a beer, no matter how bad the place is.