BIG DATA CONFERENCE
Vilnius and Online
Senior Search & Analytics Consultant
Germany, SHI GmbH
Daniel Wrigley works as a Senior Search & Analytics Consultant. He mostly works on search and big data applications, with a strong focus on modern open source projects such as NiFi, Solr, Spark, Kafka and Zeppelin. His experience as a Solr trainer enabled him to co-author the first German book on Solr. His fondness for search and natural language processing originates from his computational linguistics studies at Ludwig-Maximilians-University Munich, where he graduated in 2012.
Adding AI Cloud Services to Your On-Prem Data Workflows for NLP & Content Enrichment
Getting data out of data sources and into your favourite search engine or storage system is often one of the first challenges you face in any data project. Several ETL tools help with this, and consequently many data pipelines run on-premise, based either on open source software like Apache NiFi or on commercial products. These tools usually offer a variety of transformations but are very limited when it comes to working with unstructured data and enriching it.
However, establishing a process for analysing and enriching unstructured data can be extremely tedious: you need the right people (e.g. machine learning engineers, data scientists, natural language processing experts), a lot of annotated training and test data, computational power for training machine learning models, and so on.
Cloud services to the rescue! In this presentation, Daniel will show how on-premise data processing or indexing pipelines can be extended with cloud services to get more out of your unstructured data while bypassing the challenges mentioned above, saving time and money.