BIG DATA CONFERENCE EUROPE 2020 ONLINE




November 24-26


Confirmed Talks

Cristian Prevedello

Hands-On Chief Architect


Real-Time Stream Processing for Insurance & Health Care With Kafka, Kafka Streams and Multi-Runtime Microservices

As a service provider for insurance companies and pension and healthcare funds, we rolled out a resilient stream processing platform running in Kubernetes. It scales out horizontally and integrates microservices developed in different languages such as Java, Scala, or Python.

Session Keywords
🔑 Microservices
🔑 Streaming
🔑 Kafka
🔑 Kubernetes

Nicolas Fränkel

Developer Advocate

France, Hazelcast

Introduction to Data Streaming

In this talk, Nicolas will describe the context in which the old batch processing model was born, the reasons behind the new stream processing model, how the two compare, and their pros and cons. He will close with a list of existing stream processing technologies and their most prominent characteristics.

Session Keywords
🔑 Data Streaming
🔑 Stream Processing

Carlos Manuel Duclos-Vergara

Software Engineer

Norway, Schibsted

Processing Billions of Events a Day Using Kafka and Kafka Streams

Designing a system to cope with loads of billions of events is harder than it seems. In this talk, the presenter will go through the most common use cases and pitfalls, and provide tips and good practices for designing systems that avoid them.

Session Keywords
🔑 Kafka
🔑 Streaming
🔑 Schibsted

Marko Topolnik

Software Engineer

Croatia, Hazelcast

Real-Time Streaming with Python ML Inference

In this talk, Marko will show one approach that lets you write a low-latency, auto-parallelized, distributed stream processing pipeline in Java that seamlessly integrates a data scientist’s work, taken in almost unchanged form from their Python development environment.

Session Keywords
🔑 Java
🔑 Python
🔑 ML
🔑 Streaming

Timothy J Spann

Principal Field Engineer

USA, Cloudera

Introduction to FLaNK Stack

Introducing the FLaNK stack, which combines Apache Flink, Apache NiFi, Apache Kafka, and Apache Kudu to build fast applications for IoT, AI, and rapid ingest.

Session Keywords
🔑 NiFi
🔑 Streaming
🔑 Kafka
🔑 Kudu

John Mertic

Director of Program Management

USA, Linux Foundation

The New ODPi – Moving from Standards to a Vendor-Neutral Home for Big Data Open Source

In this talk, learn about the new ODPi, how it leverages the expertise of the Linux Foundation in hosting vendor-neutral open source projects, and how you can bring your project to ODPi.

Session Keywords
🔑 ODPi Egeria
🔑 Open Source

Barr Moses

CEO & Co-Founder of Data/Analytics startup backed by top Silicon Valley investors

USA, Monte Carlo

Making Data Downtime a Pillar of Your Data Strategy

Barr will introduce the concept of “data downtime” — periods of time when data is partial, erroneous, missing or otherwise inaccurate. Data downtime is highly costly for organizations, yet is often addressed ad hoc. She’ll discuss why data downtime matters to the data industry and tactics best-in-class organizations use to address it — including org structure, culture, and technology.

Session Keywords
🔑 Data Reliability
🔑 Data Quality
🔑 Data Strategy

Yiannis Kanellopoulos

Technology Startup Founder

Greece, Code4Thought

Trust and Quality in the Era of Software 2.0

In his talk, Yiannis Kanellopoulos will present an approach for evaluating an ML model in terms of its Fairness, Accountability and Transparency. Using case studies (from industrial and publicly available datasets), Yiannis will share insights into the benefits of making an ML model accountable and transparent and of trying to mitigate its biases.

Session Keywords
🔑 AI Accountability
🔑 AI Fairness
🔑 AI Transparency

Einar Or

Technology Startup Founder

Greece, Code4Thought

Data Versioning – What Does it Mean?

In this talk we will go over the difference between these solutions by clustering them according to 4 main use cases:
1. Collaboration over data: enabling teams to collaborate over data over time, while contributing to the data evolution.
2. Managing ML pipelines: allowing pipeline management of ML projects, from model creation to production.

Session Keywords
🔑 Data Lake
🔑 Data Versioning

Kane Wu

Co-Founder and CEO

Hong Kong, ThinkCol

Data Science Case Studies and Formulation of AI Roadmap

From what AI is to practical case studies of AI, Kane will discuss how companies in Hong Kong and worldwide use AI to create business value.

Session Keywords
🔑 Machine Learning
🔑 Data Science

Mats Adamczak

Data Scientist

Finland, paf

The Importance of Good Data Quality and Understanding of Visitor Behavior

In this session, Mats goes through how to measure a person’s journey from belonging to an interesting segment to becoming a customer. He also explains how to get high-quality data on your web visitors so that you can use it for machine learning in Facebook and Google.

Session Keywords
🔑 Data Quality
🔑 Behavior Economics

Dr. Christoph Zimmermann

European Open Source Thought Leader

Germany, Redis Labs

Redis: a Multi-Model DB for IoT and Beyond

An overview of Redis, an open-source multi-model NoSQL database, as a foundation for Big Data projects in the Internet of Things ecosystem and beyond.

Session Keywords
🔑 IoT
🔑 Redis

Mandy Chessell

IBM Distinguished Engineer, Master Inventor

UK, ODPi TSC & ODPi Egeria & IBM

Graph Processing for Open Metadata and Governance

Learn how ODPi Egeria uses its distributed virtual graph to connect metadata about an enterprise’s data and IT services from many different tools and then apply governance across this landscape.

Session Keywords
🔑 Metadata
🔑 Governance
🔑 Open Source

Ricardo Ferreira

Developer Advocate

USA, Elastic

Best Practices for Building Streaming Data Architectures

This talk will introduce the main building blocks for a streaming data architecture and how they can be put together to address business problems.

Session Keywords
🔑 Streaming Analytics
🔑 Kafka
🔑 Kinesis
🔑 Pulsar