BIG DATA CONFERENCE

EUROPE 2022

November 23-24

Online

Ricardo Ferreira

Observability Lead, Community

Elastic, US

Biography

Ricardo is a Senior Developer Advocate at AWS, working in the developer relations team for North America. With 20+ years of experience, he may have learned a thing or two about distributed systems, fast data analytics, software architecture, databases, and observability. Before joining AWS, he worked for software vendors like Elastic, Confluent, and Oracle. Ricardo is known for his natural ability to explain complex topics. He craftily breaks them down into bite-sized pieces until anyone can understand them.

When not working, he loves barbecuing in his backyard with his family and friends, where he finally gets the chance to talk about anything unrelated to computers. He currently lives in North Carolina, USA, with his wife and son.

Talk

Building Debugging-Enabled Data Pipelines

As with everything else in the software world, building data pipelines is getting ever more complex. Most data pipelines are a mix of various technologies, architecture styles, layers, and runtimes. This places an enormous burden on the data engineering teams that build and maintain them, as identifying and fixing issues looks more and more like one of those investigative TV shows where a detective searches for a culprit among many possible suspects.

But it doesn’t have to be this way. Observability technologies provide a handy way for data engineers to trace, troubleshoot, and fix data pipeline problems. This session will explain why this practice is important and what benefits it brings. It will also show, in practice, how to apply it to a reasonably complex data pipeline built using Apache Kafka, Debezium, MySQL, and ksqlDB.

Session Keywords

🔑 ETL

🔑 CDC

🔑 Tracing

🔑 Kafka

🔑 Pulsar

🔑 Flink

🔑 Streaming
