Spark – How to Run Spark Applications on Windows
A step-by-step, developer-friendly guide to running Apache Spark applications on Windows, including configuration, environment setup, and troubleshooting tips.
Introduction
Whether you want to unit test your Spark Scala application or run Spark jobs locally on Windows, you need to perform a few basic configurations. This guide will help you set up your environment so you can run Spark applications seamlessly on your Windows machine.
Why You Don’t Need Hadoop on Windows
You do not need a full Hadoop installation to run Spark on Windows. Spark relies on a small set of POSIX-like file operations, which are implemented on Windows by `winutils.exe` and the Windows APIs.
Step 1: Download winutils.exe
- Download the `winutils.exe` binary from https://github.com/steveloughran/winutils
- Place it in a folder, e.g., `C:\hadoop\bin`
- Make sure you download the version of `winutils.exe` that matches the Hadoop version your Spark distribution was compiled against
- You can check the Hadoop version in the Spark binary's POM file, e.g.: https://search.maven.org/artifact/org.apache.spark/spark-parent_2.11/2.4.4/pom
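To confirm the binary actually runs on your machine, you can invoke it directly from a Command Prompt (a quick check, assuming the `C:\hadoop\bin` location used above):

```cmd
REM Running winutils.exe with no arguments should print its usage text.
REM If Windows instead reports a missing MSVCR/VCRUNTIME DLL, install the
REM matching Microsoft Visual C++ Redistributable and try again.
C:\hadoop\bin\winutils.exe
```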
Step 2: Set HADOOP_HOME and PATH
Set the following environment variables, either via the Windows Control Panel (recommended, so they apply to all applications) or in your Command Prompt (for the current session only):

```cmd
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%PATH%
```
- `HADOOP_HOME` should point to the directory containing the `bin` folder with `winutils.exe`
- Add `%HADOOP_HOME%\bin` to your `PATH`
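A quick way to verify both variables from the same Command Prompt (a sketch, assuming the `C:\hadoop` layout above):

```cmd
REM Should print C:\hadoop
echo %HADOOP_HOME%

REM Should print C:\hadoop\bin\winutils.exe if PATH is set correctly
where winutils.exe
```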
Running Spark Applications
Now you can run any Spark application on your local Windows machine using IntelliJ, Eclipse, or the `spark-shell`.
- No Hadoop installation required
- Works for both development and unit testing
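As a sanity check, a minimal Spark Scala application like the following sketch should now run from your IDE (the object and app names are illustrative; you still need the Spark dependencies on your classpath):

```scala
import org.apache.spark.sql.SparkSession

// Minimal local-mode app to verify the Windows setup.
object WindowsSmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("WindowsSmokeTest")
      .master("local[*]") // run on all local cores, no cluster needed
      .getOrCreate()

    // A tiny job that exercises the local file-system layer winutils provides.
    val evens = spark.range(100).filter(_ % 2 == 0).count()
    println(s"Even numbers: $evens")

    spark.stop()
  }
}
```

If this runs without a `winutils.exe`-related exception, your environment is configured correctly.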
Troubleshooting
- **winutils.exe not found**: Double-check that `winutils.exe` is in `C:\hadoop\bin` and that `HADOOP_HOME` and `PATH` are set correctly
- **Version mismatch**: Ensure the version of `winutils.exe` matches the Hadoop version your Spark build expects
- **Permissions errors**: Run your IDE or terminal as Administrator if you encounter file permission issues
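When `HADOOP_HOME` is not set at all, Spark typically fails at startup with an error along these lines (the exact wording varies by Hadoop version):

```
java.io.IOException: Could not locate executable null\bin\winutils.exe
    in the Hadoop binaries.
```

The `null` in the path is the missing `HADOOP_HOME` value, which points you straight back to Step 2.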