AI, Machine Learning, Reinforcement Learning, and MLOps Articles

Keep it Clean: Why Bad Data Ruins Projects and How to Fix it

Published: Oct 24, 2019
Author: Dr. Phil Winder
CEO

Slides Abstract The Internet is full of examples of how to train models. But the reality is that industrial projects spend the majority of the time working with data. The largest improvements in performance can often be found through improving the underlying data. Bad data is costing the US economy an estimated 3.1 trillion Dollars and approximately 27% of data is flawed in the world’s top companies. Bad data also contributes to the failure of many Data Science projects.

Published: Oct 4, 2019
Author: Dr. Phil Winder
CEO

Time-series (TS) filters are often used in digital signal processing for distributed acoustic sensing (DAS). The goal is to remove a subset of frequencies from a digitised TS signal. To filter a signal you must touch all of the data and perform a convolution. This is a slow process when you have a large amount of data. The purpose of this post is to investigate which filters are fastest in Python.

Published: Jul 1, 2019
Author: Dr. Phil Winder
CEO

Reinforcement Learning (RL) frameworks help engineers by creating higher level abstractions of the core components of an RL algorithm. This makes code easier to develop, easier to read and improves efficiency.

But choosing a framework introduces some amount of lock in. An investment in learning and using a framework can make it hard to break away. This is just like when you decide which pub to visit. It’s very difficult not to buy a beer, no matter how bad the place is.

Published: Jun 25, 2019
Author: Dr. Phil Winder
CEO

I’m excited to announce that I have agreed with O’Reilly Media to write a new book on Reinforcement Learning. The contracts have just been signed and I’ve started the writing process. It is likely to take around a year to be released so I’m hoping that it will be ready around Summer 2020.

Published: Apr 12, 2019
Author: Dr. Phil Winder
CEO

At their Cloud’s Next 19 conference, Google has announced the launch of an expanded AI platform. For a number of years Google has been expanding it’s portfolio to compete with AI products from Azure and AWS. But this is the first time that the platform can be considered “end-to-end”.

Published: Mar 28, 2019
Author: Dr. Phil Winder
CEO

I’ve seen a few posts recently about the emergence of a new field that is kind of like DevOps, but not quite, because it involves too much data. Verbally, about two years ago and in blog form about a year ago, I used the word DataDevOps, because that’s what I did. I develop and operate Data Science platforms, products and services. But more recently I have read of the emergence of DataOps.

Published: Mar 11, 2019
Author: Dr. Phil Winder
CEO

Developing Jenkinsfile pipelines is hard. I think my world record for the number of attempts to get a working Jenkinsfile is around 20. When you have to continually push and run your pipeline on a managed Jenkins instance, the feedback cycle is long. And the primary bottleneck to developer productivity is the length of the feedback cycle.

Published: Feb 3, 2019
Author: Dr. Phil Winder
CEO

Nearly everyone using Python for Data Science has used or is using the Pandas Data Analysis/Preprocessing library. It is as much of a mainstay as Scikit-Learn. Despite this, one continuing bugbear is the different core data types used by each: pandas.DataFrame and np.array. Wouldn’t it be great if we didn’t have to worry about converting DataFrames to numpy types and back again? Yes, it would. Step forward Scikit Pandas. Sklearn Pandas Sklearn Pandas, part of the Scikit Contrib package, adds some syntactic sugar to use Dataframes in sklearn pipelines and back again.

Published: Jan 14, 2019
Author: Dr. Phil Winder
CEO

Helm is billed as “the package manager for Kubernetes”. The goal was to provide a high-level package management-like experience for Kubernetes. This was a goal for all the major containerisation platforms. For example, Apache Mesos has Mesos Frameworks. And given the standardisation on package management at an OS level (yum, apt-get, brew, choco, etc.) and an application level (npm, pip, gem, etc.), this makes total sense, right?

Published: Oct 28, 2018
Author: Dr. Phil Winder
CEO

Executive Summary Winder.AI was engaged by Bitsensor to research and implement Data Science algorithms that could automate the detection and classification of web attackers. After gathering data, researching a Machine Learning solution and implementing Cloud-Native software, we delivered three new features: Tool classification - detect which automated tools were being used to perform the attack Attacker grouping - provide the capability of detecting distributed attacks by the same attacker Killchain classification - establish the phase of an attack (e.

AI, Machine Learning, Reinforcement Learning, and MLOps Articles

Keep it Clean: Why Bad Data Ruins Projects and How to Fix it

Fast Time-Series Filters in Python

A Comparison of Reinforcement Learning Frameworks

Announcement: New Reinforcement Learning Book with O'Reilly

Google Releases AI Platform with help from Winder.AI

DevOps and Data Science: DataDevOps?

Local Jenkins Development Environment on Minikube on OSX

Scikit Learn to Pandas: Data types shouldn't be this hard

7 Reasons Why You Shouldn't Use Helm in Production

Using Data Science to block hackers