- Introduction and Summarization Patterns
- Filtering Patterns
- Data Organization Patterns
- Join Patterns
- Meta Patterns & Graph Patterns
- Input Output Pattern & Project Review
1. Introduction and Summarization Patterns
- Review of MapReduce
- Why Design Patterns are required for MapReduce
- Discussion of different classes of Design Patterns
- Discussion of project work and problem
- About Summarization Patterns
- Types of Summarization Patterns – Numerical Summarization Patterns
- Inverted Index Pattern and Counting with Counters Pattern
- Description, Applicability
- Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis
- Example code walk-through & data flow.
2. Filtering Patterns
- About Filtering Patterns
- Explain and distinguish 4 different types of Filtering Patterns: Filtering Pattern, Bloom Filter Pattern
- Top Ten Pattern and Distinct Pattern
- Description
- Applicability
- Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL, Performance Analysis
3. Data Organization Patterns
- About Organization patterns
- Explain 5 different types of Organization Patterns – Structured to Hierarchical Pattern, Partitioning Pattern
- Binning Pattern
- Total Order Sorting Pattern and Shuffling Pattern
- Description
- Applicability
- Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL
4. Join Patterns
- About Join Patterns
- Explain 4 different types of Join Patterns: Reduce Side Join Pattern
- Replicated Join Pattern
- Composite Join Pattern
- Cartesian Product Join Pattern
- Description
- Applicability
- Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL
5. Meta Patterns & Graph Patterns
- About Meta Patterns
- Types of Meta Patterns: Job Chaining – Description, use cases, chaining with driver, basic & parallel job chaining
- Chaining with shell scripts
- Chaining with job control
- Example code walk-through
- Chain Folding – Description
- What to fold?
- Chain mapper
- Chain Reducer
- Example code walk-through
- Job Merging – Description
- Steps for merging two jobs
- Example code walk-through
- Introduction to Graph Design Patterns
- Types of Graph Design Patterns: In-mapper Combining Pattern, Schimmy Pattern and Range Partitioning Pattern
- Pseudo-code for each pattern applied to the PageRank algorithm
6. Input Output Pattern & Project Review
- About Input Output Patterns
- Types of Input Output Patterns – Customizing Input & Output
- Generating Data
- External Source output
- External Source Input, Partition Pruning: Description
- Applicability
- Structure (how mappers, combiners & reducers are used in this pattern), use cases, analogies to Pig & SQL