BIG DATA CONFERENCE

EUROPE 2020

November 24-26

Online

Daniel Wrigley

Senior Search & Analytics Consultant

Germany, SHI GmbH

Biography

Daniel Wrigley works as a Senior Search & Analytics Consultant. He mostly deals with search and big data applications, with a strong focus on modern open source projects such as NiFi, Solr, Spark, Kafka and Zeppelin. His experience as a Solr trainer enabled him to co-author the first German book on Solr. His weakness for search and natural language processing originates from his computational linguistics studies at Ludwig-Maximilians-University Munich, where he graduated in 2012.

Talk

Adding AI Cloud Services to Your On-Prem Data Workflows for NLP & Content Enrichment

Getting data out of data sources and into your favourite search engine or storage system is often one of the first challenges you face in any data project. Several ETL tools help you tackle that issue, and consequently many data pipelines run on-premise, based either on open source software like Apache NiFi or on commercial products. These tools usually offer a variety of transformations but are very limited when it comes to working with unstructured data and enriching that data.

However, establishing a process for analysing and enriching unstructured data can be extremely tedious: you need the right people (e.g. machine learning engineers, data scientists, natural language processing experts), a lot of annotated training and test data, computational power for training machine learning models, and so on.

Cloud services to the rescue! In this presentation, Daniel will show how on-premise data processing or indexing pipelines can be extended with cloud services to get more out of your unstructured data while bypassing all of the above-mentioned challenges, saving time and money.

Session Keywords

🔑 Cloud
🔑 Streaming
🔑 NLP
🔑 NiFi