How to Start a Data Science Project With No or Little Data

Published
Author
Hajar Khizou
Associate Data Science Content Editor

Data is an essential asset of modern business. It empowers companies by surfacing unique insights about their customers and by helping them create actionable products. The more data you possess, the better you can meet and exceed your customers’ expectations.

Read more

Keep it Clean: Why Bad Data Ruins Projects and How to Fix it

Published
Author
Dr. Phil Winder
CEO


Slides

Abstract

The Internet is full of examples of how to train models. But the reality is that industrial projects spend the majority of the time working with data. The largest improvements in performance can often be found through improving the underlying data.

Read more

Google Releases AI Platform with help from Winder.AI

Published
Author
Dr. Phil Winder
CEO

At its Cloud Next ’19 conference, Google announced the launch of an expanded AI platform. For a number of years Google has been expanding its portfolio to compete with AI products from Azure and AWS. But this is the first time that the platform can be considered “end-to-end”.

Read more

DevOps and Data Science: DataDevOps?

Published
Author
Dr. Phil Winder
CEO

I’ve seen a few posts recently about the emergence of a new field that is kind of like DevOps, but not quite, because it involves too much data. Verbally, about two years ago and in blog form about a year ago, I used the word DataDevOps, because that’s what I did. I develop and operate Data Science platforms, products and services.

But more recently I have read of the emergence of DataOps. Apparently enterprises have realised that it takes more than a PhD in Data Science to create products and value (not that I begrudge the value of a PhD, I have one, after all!). It also takes engineering. Specifically, software engineering, to perform a series of tasks that support the wafer-thin slice of the product cake that represents the Data Science model.

Read more

Using Data Science to block hackers

Published
Author
Dr. Phil Winder
CEO

Executive Summary

Winder.AI was engaged by Bitsensor to research and implement Data Science algorithms that could automate the detection and classification of web attackers. After gathering data, researching a Machine Learning solution and implementing Cloud-Native software, we delivered three new features:

  • Tool classification - detect which automated tools were being used to perform the attack
  • Attacker grouping - provide the capability of detecting distributed attacks by the same attacker
  • Killchain classification - establish the phase of an attack (e.g. reconnaissance, exploitation, etc.)

Client

Bitsensor is a startup in the Netherlands that specializes in protecting public-facing websites and applications. They distribute their web-application firewall product to a range of customers throughout Europe. The goal is to provide an outstanding out-of-the-box experience that can protect exposed services from hackers, with little setup.

Read more

Cloud Native Data Science: Best Practices

Published
Author
Dr. Phil Winder
CEO

Following the Cloud Native best practices of immutability, automation and provenance will serve you well in a CNDS project. But working with data brings its own subtle challenges around these themes.

Read more

Cloud Native Data Science: Technology

Published
Author
Dr. Phil Winder
CEO

Technology choices in data-driven products are, as you would expect, largely directed by the type and amount of data. The first and most crucial decision to make is whether the data will be processed in a batch or streaming fashion.

Read more

Cloud Native Data Science: Strategy

Published
Author
Dr. Phil Winder
CEO

Data Science has become an important part of any business because it provides a competitive advantage. Very early on, Amazon’s data on book purchases allowed them to deliver personalised recommendations whilst customers were browsing their site. Their main competitor in the US at the time was Borders, who mainly operated in physical stores. This physicality prevented them from seamlessly providing customers with personalised recommendations [1]. This example highlights how strategic business decisions and data science are inextricably linked.

Read more

Life and Death Decisions: Testing Data Science

Abstract

We live in a world where decisions are being made by software. From mortgage applications to driverless vehicles, the results can be life-changing. But the benefits of automation are clear. If businesses use data science to automate decisions, they will become more productive and more profitable.

So the question becomes: how can we be sure that these algorithms make the best decisions? How can we prove that an autonomous vehicle will make the right decision when life depends on it? How can we prove that data science works?

Read more

201: Basics and Terminology

The ultimate goal

First, let’s discuss the goal. What is it?

  • The goal is to make a decision or a prediction

Based upon what?

  • Information

How can we improve the quality of the decision or prediction?

  • The quality of the solution is defined by the certainty represented by the information.

Think about this for a moment. It’s a key insight. Think about your projects. Your research. The decisions you make. They are all based upon some information. And you can make better decisions when you have more good-quality information.
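One way to make this concrete is the following minimal sketch. It assumes each piece of information is an independent measurement with the same noise level; the `sigma` value and the `standard_error` helper are hypothetical and not taken from the original material. Under that assumption, the uncertainty in an average shrinks with the square root of the number of observations, so more good-quality information means more certainty.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent observations,
    each with standard deviation sigma."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return sigma / math.sqrt(n)


# Hypothetical noise level of a single observation (an assumption).
sigma = 2.0

# More observations give a smaller standard error, i.e. more certainty.
for n in (1, 4, 16, 100):
    print(f"n={n:>3}  standard error={standard_error(sigma, n):.2f}")
```

Quadrupling the number of observations only halves the uncertainty, which is one reason the quality of each observation matters as much as the quantity.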

Read more