Thu Oct 24, 2019, by Phil Winder, in Talk, Data Science
Abstract The Internet is full of examples of how to train models. But the reality is that industrial projects spend the majority of the time working with data. The largest improvements in performance can often be found through improving the underlying data. Bad data is costing the US economy an estimated 3.1 trillion Dollars and approximately 27% of data is flawed in the world’s top companies. Bad data also contributes to the failure of many Data Science projects.