Module 1: Introduction to Hadoop and HBase
- Introducing Hadoop
- Core Hadoop Components
- What Is HBase?
- Why Use HBase?
- Strengths of HBase
- HBase in Production
- Weaknesses of HBase
Module 2: HBase Tables
- HBase Concepts
- HBase Table Fundamentals
- Thinking About Table Design
Module 3: HBase Shell
- Creating Tables with the HBase Shell
- Working with Tables
- Working with Table Data
Module 4: HBase Architecture Fundamentals
- HBase Regions
- HBase Cluster Architecture
- HBase and HDFS Data Locality
Module 5: HBase Schema Design
- General Design Considerations
- Application-Centric Design
- Designing HBase Row Keys
- Other HBase Table Features
Module 6: Basic Data Access with the HBase API
- Options to Access HBase Data
- Creating and Deleting HBase Tables
- Retrieving Data with Get
- Retrieving Data with Scan
- Inserting and Updating Data
- Deleting Data
Module 7: More Advanced HBase API Features
- Filtering Scans
- Best Practices
- HBase Coprocessors
Module 8: HBase Write Path
- HBase Write Path
- Compaction
- Splits
Module 9: HBase Read Path
- How HBase Reads Data
- Block Caches for Reading
Module 10: HBase Performance Tuning
- Column Family Considerations
- Schema Design Considerations
- Configuring for Caching
- Memory Considerations
- Dealing with Time Series and Sequential Data
- Pre-Splitting Regions
Module 11: HBase Administration and Cluster Management
- HBase Daemons
- ZooKeeper Considerations
- HBase High Availability
- Using the HBase Balancer
- Fixing Tables with hbck
- HBase Security
Module 12: HBase Replication and Backup
- HBase Replication
- HBase Backup
- MapReduce and HBase Clusters
Module 13: Using Hive and Impala with HBase
- How to Use Hive and Impala to Access HBase