BIG DATA CONFERENCE

EUROPE 2021

Online Edition

September 28-30

Online

Confirmed Talks

Albert Lewandowski

GetInData, Poland

Best Practices for ETL with Apache NiFi on Kubernetes

The talk covers the details of migrating pipelines from an old Hadoop platform to Kubernetes, managing everything as code, monitoring NiFi's corner cases, and making it a robust solution that is user-friendly even for non-programmers.

Session Keywords
🔑 ETL
🔑 NiFi
🔑 Kubernetes

Bas Geerdink

Aizonic, The Netherlands

The State of MLOps – Machine Learning in Production at Enterprise Scale

In this session, we’ll explore this relatively new subject. Bas will explain the need for MLOps (and the related AIOps and ModelOps), dive into the tools and techniques, and give some examples of real-world solutions.

Session Keywords
🔑 MLOps
🔑 Big Data
🔑 Machine Learning

Carlos Manuel Duclos-Vergara

Schibsted, Norway

Choosing the Right Abstraction Level for Your Kafka Project

What kinds of operations need to be applied to the events? Do we need to interact with external systems? In this presentation, Carlos will go through several scenarios and cases to highlight the key factors to consider when deciding which API to use for a given project.

Session Keywords
🔑 Streaming Architecture
🔑 Event Processing
🔑 Kafka

Daniel Wrigley

SHI, Germany

Keyword Search is Dead! And so are Solr and Elasticsearch?

How can AI combined with Vector Similarity Search efficiently deliver more relevant search results than conventional methods?
For which cases is there an economic gain from their application?
To answer these and other questions, he will give an overview of the current state of new technologies, an outlook on their future possibilities, and show how search applications can get a boost with the help of AI.

Session Keywords
🔑 Natural Language Processing (NLP)
🔑 Vector Similarity Search
🔑 Elasticsearch
🔑 Solr

Paolo Dello Vicario

Datrix, Italy

NLP & Machine Learning Applied to the Analysis of Advertising Data and User Behaviour on the Website for Marketing Purposes

In this talk, we will see how to use the events collected by behavioural tracking tools on websites to build a high-performing audience for advertising and marketing campaigns.

Session Keywords
🔑 Machine Learning
🔑 Marketing
🔑 NLP

Phil Winder

Winder Research, UK

A Code-Driven Introduction to Reinforcement Learning

This presentation investigates the state of the art in the cyber-security space, specifically focussing on how reinforcement learning is helping beat the hackers.

Session Keywords
🔑 Reinforcement Learning
🔑 Cyber Security

Einat Orr

Treeverse, Israel

Rethinking Ingestion: CI/CD for Data Lakes

This talk proposes a new strategy for data lake ingestion, one where new data can be added in isolation, then tested and validated, before “going live” in a production table. Finally, they will show how git-for-data tools like lakeFS and Nessie enable this ingestion paradigm in a seamless way.

Session Keywords
🔑 Data Lake
🔑 Data Versioning
🔑 Ingestion

Gerard Toonstra

Datafold, The Netherlands

Data Observability

Data Observability is a growing area in data engineering. In this session, he will explain to an audience of data engineers what data observability means in both development and operational processes.

Session Keywords
🔑 Data Observability
🔑 Data Lineage
🔑 Catalog

Josef Habdank

DXC Luxoft, Denmark

Management of a Cloud Data Lake in Practice: How to Manage 1000s of ETLs Using Apache Spark

The talk will outline the business reasoning, the key design principles, and the technical solution. Expect some (but not too many) nerdy details about the Apache Spark implementation.

Session Keywords
🔑 Data Governance
🔑 Azure
🔑 Spark

Karol Przystalski

Codete, Poland

Machine Learning Security

Many companies would like to introduce machine learning models, but fail to see the potential security issues. In the presentation, he will show recent security issues related to machine learning models, such as adversarial attacks.

Session Keywords
🔑 ML
🔑 Security