Data Engineering
Using Hive and Impala with Hadoop
22.03.2021 - 24.03.2021
MAMPU, Cyberjaya
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:
- The features that Hive and Impala offer for data acquisition, storage, and analysis
- The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
- How Hive and Impala improve productivity for typical analysis tasks
- Joining diverse datasets to gain valuable business insight
- Performing real-time, complex queries on datasets
Course Objective
- Understand the features that Hive and Impala offer for data acquisition, storage, and analysis;
- Understand the fundamentals of Apache Hadoop and data ETL, ingestion, and processing with Hadoop tools;
- Understand how Hive and Impala can improve productivity in analysis tasks;
- Join diverse datasets to gain valuable insight; and
- Perform real-time, complex queries on datasets.
Course Outline
- Hadoop Fundamentals
- Introduction to Hive and Impala
- Querying with Hive and Impala
- Data Management
- Data Storage and Performance
- Relational Data Analysis with Hive and Impala
- Working with Impala
- Analyzing Text and Complex Data with Hive
- Hive Optimization
- Extending Hive