- Module 1: Microsoft R Server and R Client
- Module 2: Exploring Big Data
- Module 3: Visualizing Big Data
- Module 4: Processing Big Data
- Module 5: Parallelizing Analysis Operations
- Module 6: Creating and Evaluating Regression Models
- Module 7: Creating and Evaluating Partitioning Models
- Module 8: Processing Big Data in SQL Server and Hadoop
Module 1: Microsoft R Server and R Client – An overview of how Microsoft R Server and Microsoft R Client work.
- What is Microsoft R Server?
- Using Microsoft R Client
- The ScaleR functions
Module 2: Exploring Big Data – How to use R Client with R Server to explore big data held in different data stores.
- Understanding ScaleR data sources
- Reading data into an XDF object
- Summarizing data in an XDF object
Module 3: Visualizing Big Data – An introduction to visualizing data by using graphs and plots.
- Visualizing in-memory data
- Visualizing big data
Module 4: Processing Big Data – An explanation of how to transform and clean big data sets.
- Transforming big data
- Managing datasets
Module 5: Parallelizing Analysis Operations – An explanation of how to implement options for splitting analysis jobs into parallel tasks.
- Using the RxLocalParallel compute context with rxExec
- Using the RevoPemaR package
Module 6: Creating and Evaluating Regression Models – A brief introduction to how to build and evaluate regression models generated from big data.
- Clustering big data
- Generating regression models and making predictions
Module 7: Creating and Evaluating Partitioning Models – An explanation of how to create and score partitioning models generated from big data.
- Creating partitioning models based on decision trees
- Testing partitioning models by making and comparing predictions
Module 8: Processing Big Data in SQL Server and Hadoop – An overview of how to process big data by using R in SQL Server and Hadoop.
- Using R in SQL Server
- Using Hadoop MapReduce
- Using Hadoop Spark