Course Contents of Spark

•  What is Apache Spark?

•  Apache Spark Architecture

•  Apache Spark Installation & Its Basics

•  Installing Apache Spark on a local machine

•  SparkConf and SparkContext

•  RDD creation

•  Operations on RDD

•  RDD transformation functions

•  Difference between map and flatMap

•  Actions on RDD

•  Persistence

•  Introduction to Scala:

  Why Scala

  Values

  Variables

  Basic OOP concepts

  Scala Flow Control

  Functions

  Anonymous Functions

  Curried functions

  Classes

  Constructors

  Abstract Classes

  Traits

  Collections

•  Advanced Spark Operations:

 Working with key-value pair RDDs

 RDD joins

 Shared Variables.

 Broadcast variables.

 Accumulators.

•  Apache Spark Internal Architecture and Execution Flow:

 Apache Spark Cluster Details.

 Apache Spark Standalone Mode cluster.

 Running Spark application in Standalone mode cluster.

 Summary of RDD sizes and memory usage.

 Spark web UI.

 Apache Spark internals.

 Apache Spark execution flow.

 DAG: Logical graph of RDD operations

 RDD Physical Plan.

 Tasks.

 Stages.

 Scheduler.

 Types of RDD transformations

•  Spark SQL:

 SQLContext.

 DataFrame creation.

 Creating temporary tables

 Parquet tables.

 Loading and processing CSV files.

 Loading and processing JSON files.

 Writing data to local file system.

•  Spark Streaming:

 What is streaming?

 Why streaming?

 Discretized Streams.

 Transformations on DStreams

 Streaming Example

•  Real-Time Machine Learning Use Cases:

 Real-time text classification using a Naive Bayes classifier.

 Building a real-time recommendation engine.

Phone: 9900282636
Phone: 9900284626
No. 3, Ground Floor, V R K H Building
Vivekananda Layout
Opposite to Home town
Beside Biriyani Zone
Bangalore - 560 037, Karnataka, India.