Apache Spark 101 Tutorial | First Apache Spark Application using IntelliJ IDEA in Windows 7/10 | Part 1

Prerequisite

  • Windows 7/10 Operating System 
  • Java 8 (JDK 1.8 or later)
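
To verify the Java installation, you can run the following in a Command Prompt; it should report version 1.8 or later:

java -version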

Walk-through

In this article, I am going to walk through the steps to create your first Apache Spark application using IntelliJ IDEA on Windows 7/10.

Step 1: Download IntelliJ IDEA Community Edition

Search for "intellij idea download" from www.google.com



Click on "Intellij IDEA Download" link from search results



Click on "Download" button which is Community Edition installer(.exe) in the Windows tab







Click on "Show in folder"



Step 2: Install IntelliJ IDEA Community Edition

Double-click on "ideaIC-2019.2.4.exe" to start the installation



Click on "Yes" button





Click on "Next" button to continue



Change the "Destination Folder" if you like to change else click on "Next" button to continue



Select any of the check boxes if you wish to enable those options, or click on the "Next" button to continue



Click on "Install" button to continue









Click on "Run IntelliJ IDEA Community Edition" check box and click on "Finish" button to complete the installation.

Step 3: Configure IntelliJ IDEA Community Edition



Choose "Do not import settings" and click on "OK" button to configure IntelliJ IDEA Community Edition





Choose "Set UI theme" either "Darcula" or "Light"



Click on "Next: Default plugins" button to continue



Click on "Next: Featured plugins"



Click on "Install" button under the "Scala" Featured plugins





Click on "Start using IntelliJ IDEA" button to continue



Step 4: Create an sbt-based Scala project for the Apache Spark Application

Click on "Create New Project" button



Choose "Scala" from left section/menu



Choose "sbt" from right section and click on "Next" button to continue



Provide the project name as "apachespark101", the JDK as 1.8 (Java 8), the sbt version as 1.3.2, and the Scala version as 2.12.8



Click on "Finish" button to continue



Click on "Close" button to continue



Click on "Project" explorer and open the "build.sbt" file



Add the Apache Spark dependency to the "build.sbt" file:

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"
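
After adding this line, the complete "build.sbt" should look roughly like the following (the name, version, and scalaVersion entries are generated from the project settings chosen earlier; only the libraryDependencies line is added by hand):

name := "apachespark101"

version := "0.1"

scalaVersion := "2.12.8"

// Spark SQL brings in Spark Core as a transitive dependency
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"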





Click on "sbt" tab from the right section





Click on "Refresh all sbt projects" button in the "sbt" tab from the right section





Expand the "src", "main" and "Scala"



Right-click on the "scala" folder and choose "New" > "Scala Class"



Provide the name "create_first_app_apachespark101_part_1" and choose "Object" as the kind

Note: A Scala object (rather than a class) is created here, so there is no need to instantiate anything; the main method defined inside the object can be executed directly.

Place the code below in the create_first_app_apachespark101_part_1.scala file:

package com.datamaking.apachespark101

import org.apache.spark.sql.SparkSession

object create_first_app_apachespark101_part_1 {
  def main(args: Array[String]): Unit = {
    println("Started ...")
    println("First Apache Spark 2.4.4 Application using IntelliJ IDEA in Windows 7/10 | Apache Spark 101 Tutorial | Scala API | Part 1")

    // Create a SparkSession that runs locally, using all available CPU cores
    val spark = SparkSession
      .builder
      .appName("Apache Spark 101 Tutorial | Part 1")
      .master("local[*]")
      .getOrCreate()

    // Show only ERROR messages to keep the console output readable
    spark.sparkContext.setLogLevel("ERROR")

    // Distribute a local list into an RDD with 3 partitions
    val tech_names_list = List("spark1", "spark2", "spark3", "hadoop1", "hadoop2", "spark4")
    val names_rdd = spark.sparkContext.parallelize(tech_names_list, 3)

    // Transform each element to upper case, collect the results to the driver and print them
    val names_upper_case_rdd = names_rdd.map(ele => ele.toUpperCase())
    names_upper_case_rdd.collect().foreach(println)

    spark.stop()
    println("Completed.")
  }
}










Run the Scala object (for example, right-click inside the editor and choose the Run option)
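
When the run completes, the console should show output similar to the following (exact Spark startup messages may vary, and logging below ERROR level is suppressed):

Started ...
First Apache Spark 2.4.4 Application using IntelliJ IDEA in Windows 7/10 | Apache Spark 101 Tutorial | Scala API | Part 1
SPARK1
SPARK2
SPARK3
HADOOP1
HADOOP2
SPARK4
Completed.

The application can also be run outside IntelliJ IDEA from a Command Prompt in the project folder, for example with sbt run.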

Summary

In this article, we have installed IntelliJ IDEA Community Edition and run our first Apache Spark application. Please post your feedback and queries if you have any. Thank you.

Happy Learning !!!
