26-28 November, 2019, Vilnius
Coolblue B.V., The Netherlands
Gerard Toonstra is a technical lead for the data systems team at Coolblue in the Netherlands, a Kaggle Master, and an artist.
Data Discovery with Amundsen
Company data is increasingly widespread, and it has always been difficult to understand who is using which tables or columns, how often, and how a dataset was produced. Commercial tools can help with some of these questions, but there is now also an open-source tool called “Amundsen”, which is already in use at several larger companies. Amundsen helps you answer questions such as how a dataset was produced, who else uses it, and who owns it, and its users can update table and column descriptions. Amundsen has not been around for very long, but data lineage and data discovery are already significant problems, or at least challenges, for companies. Amundsen is a strong answer to these problems because it focuses not just on the data itself, but also on the people and communities who use it.
Migrating Our Data Warehouse to the Cloud
We recently finished our migration to the cloud. Our data warehouse had always run on site on physical servers, but there was a strong preference to make everything cloud based. What issues will you face when going to the cloud? What are the technical challenges ahead, and will it be cheaper? Should we rebuild everything from scratch? I’m going to talk about the decisions we had to make, the impact of legacy code on a cloud migration, and whether it ends up being cheaper or not. Because the way things are done in the cloud is slightly different from a datacenter approach, I hope that sharing my experiences helps you make better decisions and establish a better timeline if you’re facing a similar migration.