26-28 November, 2019, Vilnius

Conference is over! See you next year.

Confirmed Talks

Carlos Manuel Duclos-Vergara

Schibsted, Norway


Routing Billions of Events a Day: How we Do Routing in Schibsted

In this talk, Carlos will present how his team at Schibsted has set up a streaming data platform using cloud technologies, Kafka and Kafka Streams. Schibsted is a leader in media and online classifieds in the Nordic market, delivering news to most of the population of Scandinavia. The streaming platform aims to simplify data management by providing a single point for managing data streams, independently of data types. It includes a data quality solution, a data registration solution and a routing solution. Kafka is used as the backbone of the system, and the routing solution is built on top of it. Kafka Streams is used downstream to enrich streams.


John Ortega

New York University, USA


How can Artificial Intelligence use Big Data for Translating Documents?

In this session, John will walk us through the exciting world of machine translation. Specifically, he will show us how collections of documents, known as corpora, filled with information from various sources can be used to build artificial intelligence into a translation system. He will cover the evolution of statistical and neural translation systems and how they are currently used to achieve translation quality superior to that of their predecessors.

Session Keywords

Data Science
Machine Learning

Sonya Liberman

Outbrain, Israel


From Spark to Elasticsearch and Back - Learning Large Scale Models for Content Recommendation

Serving tens of billions of personalized recommendations a day with a latency under 30 milliseconds is a challenge. In this talk I’ll share our algorithmic architecture, including its Spark-based offline layer and its Elasticsearch-based serving layer, which together enable running complex models under difficult scale constraints and shorten the cycle between research and production.

Session Keywords

Machine Learning
Recommender System

Ricardo Ferreira

Confluent, USA


Everything you Wanted to Know about Apache Kafka but you Were too Afraid to Ask! 

Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration, with Apache Kafka at the core, streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what a streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use. 

Session Keywords

Stream Processing

Florian Wilhelm

inovex GmbH, Germany


Are You Sure about That?! Uncertainty Quantification in AI

With the advent of Deep Learning (DL), the field of AI has made a giant leap forward, and AI is now applied in many industrial use cases. Critical systems in particular, such as autonomous driving, require that DL methods not only produce a prediction but also state how certain that prediction is, so that risks and failures can be assessed.
In my talk, I will give an introduction to the different kinds of uncertainty, namely epistemic and aleatoric.
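
As a rough illustration of the two kinds of uncertainty named above (not taken from the talk itself), one common approach assumes an ensemble of models in which each member predicts a mean and a variance for the same input. Under that assumption, the law of total variance splits the predictive variance into an aleatoric part (the average of the predicted variances, i.e. noise in the data) and an epistemic part (the spread of the predicted means, i.e. disagreement between models). The function name and the sample numbers below are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split ensemble predictive variance into aleatoric and epistemic parts.

    means[i] and variances[i] are the mean and variance predicted by
    ensemble member i for one input (an assumed setup, not the talk's method).
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and of equal length")
    aleatoric = fmean(variances)   # average noise the members expect in the data
    epistemic = pvariance(means)   # disagreement between members about the mean
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions of a four-member ensemble for one input.
member_means = [2.0, 2.5, 3.0, 2.5]
member_variances = [0.4, 0.5, 0.6, 0.5]

aleatoric, epistemic, total = decompose_uncertainty(member_means, member_variances)
print(f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f} total={total:.3f}")
```

More training data can shrink the epistemic term, because the members converge, whereas the aleatoric term remains as long as the data itself is noisy.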

Session Keywords

Uncertainty Quantification
Deep Learning
Gaussian Processes
Bayesian Methods

Ander Alcon

Olocip, Spain


Real Time Decision Making in Professional Sports. From Description to Prediction and Prescription.

Thanks to the amount of data collected in the world of football over recent decades, we try to answer the questions that have always surrounded the game, from angles that could never be addressed before, for example: How will a player perform on my team next season? What should I do to avoid conceding a goal? What should I do so that players are not injured?

Session Keywords

Artificial Intelligence in Football

Toby Woolfe



Big Data Information Architecture for AI

IBM is famous for Watson, the artificial intelligence system that won the Jeopardy! general knowledge quiz show in 2011. Since then, IBM has been feeding it big data, and it is being used across industries to help humans work smarter. This presentation covers some of the more interesting case studies from around the world, such as Watson Willow at Woodside, the L’Oreal factory, and the Iplexia demo, which shows a factory line manager talking to Watson…

Session Keywords

Artificial Intelligence

Gerard Toonstra

Coolblue B.V., The Netherlands

#1 Talk

Data Discovery with Amundsen

Company data is increasingly widespread, and it has always been difficult to understand who is using which tables or columns, how often, and how a dataset was produced. There are commercial tools that help with some of these questions, but now there’s also an open-source tool called “Amundsen”, which is already in use at several larger companies. Amundsen helps you answer questions like how a dataset was produced, who else uses it and who owns it, and people using the tool can update the table or column descriptions.

Session Keywords

Data Discovery
Data Lineage

#2 Talk

Migrating Our Data Warehouse to the Cloud

We recently finished our migration to the cloud. We had always run our data warehouse on site on metal boxes, but there was a strong preference to make everything cloud based. What issues will you face when going to the cloud? What are the technical challenges ahead, and will it be cheaper? Should we rebuild everything from scratch? I’m going to talk about the decisions we had to make, the impact of legacy code on a cloud migration, and whether it is going to be cheaper or not. Because the way things are done in the cloud is slightly different from a datacenter approach, I hope that this talk about my experiences helps you make better decisions and establish a better timeline if you’re facing a similar migration.

Session Keywords

SQL Server

Kathrin Melcher

KNIME, Germany


The Creative Side of AI: Product Name Generation

Product naming is not an easy task. It requires creativity and a goal-driven process. Can AI help? We propose a solution that uses a many-to-many LSTM network to generate a very large number of name candidates in a very short time to inspire your marketing team.
We will explore Recurrent Neural Networks in general and LSTM layers in particular, and why they work so well for sequence generation.

Session Keywords

Text Generation
Deep Learning

Pawel Rzeszucinski

Codewise, Poland


Computational Propaganda - How Algorithms Influence our Decisions

“Manipulations in the era of algorithms”. Facebook, with its 2.2 billion monthly active users, together with Google and Twitter forms three interconnected digital republics where the majority of human interactions take place and where a growing number of people spend a great deal of time every day. This constellation has created a fantastic environment for the development and use of targeted advertising. The same algorithms, however, have recently gained momentum among bad actors in the creation of so-called computational propaganda…

Session Keywords