YARN and MapReduce State Machine Diagrams and how to generate them

Last modified : 2 August, 2017

A lot of people don’t know this, but it is very easy to generate state machine diagrams for any version of Hadoop, thanks to the work done by Binglin Chang in MAPREDUCE-2930. The diagrams are immensely helpful for understanding all the state changes, and since they are derived from the code, they don’t get outdated. I love them and use them very often.

To generate the graphs, run this in the Hadoop source root directory:

$ mvn -Pvisualize compile

This generates a bunch of .gv files.

$ find . -name '*.gv'
./MapReduce.gv
./NodeManager.gv
./ResourceManager.gv

The .gv files are easily converted to PNG using dot, which is part of Graphviz (you might have to install it):

$ dot -Tpng MapReduce.gv > MapReduce.png
$ dot -Tpng NodeManager.gv > NodeManager.png
$ dot -Tpng ResourceManager.gv > ResourceManager.png
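If your build produces more .gv files than these three, a small loop saves retyping the command. Here is a minimal sketch: render_gv is a name I made up for illustration, and it assumes dot is on your PATH. It uses dot’s -o option to write the output file instead of shell redirection.

```shell
# render_gv is a hypothetical helper, not part of Hadoop or Graphviz.
# $1: directory to search for .gv files (defaults to the current directory)
render_gv() {
    dir="${1:-.}"
    find "$dir" -name '*.gv' | while IFS= read -r gv; do
        # Write the PNG next to its source file: Foo.gv -> Foo.png
        png="${gv%.gv}.png"
        # Report a failed conversion on stderr and keep going with the rest
        dot -Tpng "$gv" -o "$png" || echo "failed: $gv" >&2
    done
}

# Usage, from the Hadoop source root: render_gv .
```

Running it from the source root should leave a .png next to each .gv file.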

I’m going to leave these here for you to admire. They were generated from the source at git SHA 8ce8672b6b551dacb9467924fc70f88790f5891f. Please right-click and choose “View Image”, or save the link, to view them in your favorite image viewer.
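If you want to regenerate the diagrams from that exact revision, something like the following sketch should work. build_diagrams_at is a hypothetical helper, and the path in the usage line is only an example of where a local Hadoop clone might live. Note that checking out a raw SHA leaves the clone in detached HEAD state.

```shell
# build_diagrams_at is a hypothetical helper, not part of Hadoop.
# $1: path to a local Hadoop git clone (assumed to exist already)
# $2: commit SHA to generate the diagrams from
build_diagrams_at() {
    repo="$1"
    sha="$2"
    # Check out the requested commit; this leaves the clone in detached HEAD state
    git -C "$repo" checkout "$sha" || return 1
    # Run the Maven visualize profile from the source root.
    # The subshell keeps the caller's working directory unchanged.
    (cd "$repo" && mvn -Pvisualize compile)
}

# Usage (example path): build_diagrams_at ~/src/hadoop 8ce8672b6b551dacb9467924fc70f88790f5891f
```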

ResourceManager State Diagram

ResourceManager State Diagram.

NodeManager State Diagram

NodeManager State Diagram.

MapReduce State Diagram

MapReduce State Diagram.

All content on this website is licensed as Creative Commons-Attribution-ShareAlike 4.0 License. Opinions expressed are solely my own.