ERNESTAS SYSOJEVAS

DATA MINER, Lithuania

Biography

Ernestas is a Senior Trainer and Director at DATA MINER. His training career started 15 years ago in Lithuania as a Microsoft Certified Trainer (MCT) specializing in relational databases. During these years, he delivered more than 300 Microsoft SQL Server courses. Five years ago, however, he decided to step from the relational database world toward Big Data solutions.

He now delivers official Cloudera Hadoop training not only locally in Lithuania but across Europe, from London to Moscow. As a Big Data technology enthusiast, he often speaks or delivers workshops at various IT conferences and events. In 2016, based on course attendees’ evaluations, Ernestas was named best certified Cloudera Hadoop trainer in the EMEA region (Europe, Middle East and Africa). He is one of the initiators and main organizers of the devdays.lt, devopspro.lt, testcon.lt and bigdataconference.lt conferences.

Workshop

Essentials for Apache Hadoop and Hadoop Ecosystem

Learn how Apache Hadoop addresses the limitations of traditional computing, helps businesses overcome real challenges, and powers new types of big data analytics. This workshop introduces the Apache Hadoop ecosystem and outlines how to prepare the data center and manage Hadoop in production.

Agenda

Why Hadoop is needed
Hadoop’s architecture
What types of problems can be solved with Hadoop
What components are included in the Hadoop ecosystem
Core Hadoop: HDFS, MapReduce, and YARN
Data Integration: Flume, Kafka, and Sqoop
Data Processing: Spark
Data Analysis: Hive and Impala
Data Exploration: Cloudera Search
User Interface: Hue
Data Storage: HBase
Data Security: Sentry
Managing Hadoop

Course objectives

Learn the theory behind the creation of Hadoop and the anatomy of Hadoop itself, and get an overview of the complementary projects that make up the Hadoop ecosystem.

There are many components working together in the Apache Hadoop stack. By understanding how each functions, you gain more insight into Hadoop’s functionality in your own IT environment. We will go beyond the motivation for Apache Hadoop and will dissect the Hadoop Distributed File System (HDFS), MapReduce, and the general topology of a Hadoop cluster.
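To give a feel for the MapReduce model mentioned above, here is a minimal sketch in plain Python. It is not Hadoop code and is not taken from the workshop materials: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and everything runs in a single process, whereas a real cluster distributes these phases across many nodes.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key before reducing."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

The point of the split is that the map and reduce steps only ever see one record or one key at a time, which is what lets a framework run them in parallel on separate machines.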

We will share several Hadoop use cases from industries including financial services, insurance, telecommunications, intelligence, and healthcare to show how Apache Hadoop is used in the real world. We will also explore ways to use Apache Hadoop to harness Big Data and solve business problems in ways that were not possible before.

Various projects make up the Apache Hadoop ecosystem, and each improves data storage, management, interaction, and analysis in its own unique way. We will review Hive, Pig, Impala, Spark, Kafka, Flume, Sqoop, HBase and Oozie, how they function within the stack and how they help integrate Hadoop within the production environment.

It is critical to understand how Apache Hadoop will affect the current setup of the data center and to plan ahead. We will discuss what resources are required to deploy Hadoop, how to plan for cluster capacity, and how to staff for your Big Data strategy.
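As a rough illustration of the capacity planning arithmetic, the sketch below estimates raw disk and node count from the amount of data to be stored. The replication factor of 3 is the HDFS default; the 25% headroom for temporary and intermediate data and the per-node disk size in the example are assumptions chosen for illustration, not figures from the workshop. The function names are hypothetical.

```python
import math


def raw_storage_needed_tb(
    data_tb: float,
    replication: int = 3,          # HDFS default replication factor
    temp_overhead: float = 0.25,   # assumed share of disk kept free for temp data
) -> float:
    """Raw disk needed to hold data_tb once replicated, leaving headroom."""
    return data_tb * replication / (1 - temp_overhead)


def nodes_needed(
    data_tb: float,
    node_capacity_tb: float,
    replication: int = 3,
    temp_overhead: float = 0.25,
) -> int:
    """Number of worker nodes required, rounded up to a whole node."""
    raw_tb = raw_storage_needed_tb(data_tb, replication, temp_overhead)
    return math.ceil(raw_tb / node_capacity_tb)


if __name__ == "__main__":
    # Example: 100 TB of data on nodes with an assumed 48 TB of disk each.
    print(raw_storage_needed_tb(100))   # raw terabytes required
    print(nodes_needed(100, 48))        # whole nodes required
```

A real sizing exercise would also account for data growth, compression, and compute and memory needs, none of which this sketch models.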

Once you have Hadoop implemented in your environment, what’s next? How do you get the most out of the technology while managing it on a daily basis? We will learn how to manage Hadoop effectively.

Target audience

This introductory workshop is intended for all IT professionals who have little knowledge of Hadoop and want to learn the theory behind the creation of Hadoop and the anatomy of Hadoop itself, and to get an overview of the complementary projects that make up the Hadoop ecosystem.

Course prerequisites

A personal computer with 16 GB of RAM (8 GB minimum) to run virtual machines. Please download the virtual machine here and make sure it works.

DATE:
26 November, 2018

TIME:
10:00-17:30

VENUE:
Crowne Plaza Vilnius – M. K. Čiurlionio str. 84, Vilnius, Lithuania

LANGUAGE:
Lithuanian