What Does a Skipped Stage Mean in the Spark Web UI?


Skipped Stages in Spark UI

You have probably come across a DAG like the one below, where a few stages are greyed out with the text (skipped) after the stage name. What does this mean? Did Spark ignore one of your stages because of an error, or is something else going on?

Well, it’s actually a good thing. It means that particular stage in the lineage DAG doesn’t need to be re-evaluated because its output is already available: either the data was computed earlier and cached, or the shuffle files written by an earlier run of that stage can be reused. This saves the computation time for that stage. If you want to see which DataFrame or RDD was stored in the cache, check the Storage tab in the Spark UI.

Skipped Stages in Spark Web UI

 

Reference

https://stackoverflow.com/questions/34580662/what-does-stage-skipped-mean-in-apache-spark-web-ui

https://github.com/apache/spark/pull/3009

