BIG DATA CONFERENCE
Mark is the CEO of Stemma and co-creator of Amundsen, an open-source data discovery and metadata engine created at Lyft and now used by 30+ companies, including Instacart, Square, ING, Workday, and Gusto.
Trust Your Data
Over the last few years, organizations have collected, processed, and stored more data than ever before. First they captured more data, and then they made it accessible to more and more people within the company. Everyone has access to data, but few know what exists, what’s trustworthy, and how to use it. Humans solve this problem naturally through the gossip protocols of Slack and shoulder-tapping, which don’t scale and come at a huge productivity loss. But it gets worse: wrong data leads to wrong conclusions.
Data gets delayed, deprecated, or completely shut off, and analysts and data scientists are the last ones to find out. On the other side, data engineers don’t know exactly what and whom a change will impact, so they spray and pray.
So how do you enable an increasing number of data users to trust your data?
The answer lies in automated metadata. You collect and augment your data with automated documentation and make it discoverable in an automated data catalog. This enables data producers and consumers to discover trusted data on a topic and to understand its nuances and the best practices for using it.
In this talk, Mark will discuss why trust in data matters to the data industry and the tactics best-in-class organizations use to address it, including org structure, culture, and technology.