spark.deploy.recoveryMode=ZOOKEEPER
spark.deploy.zookeeper.url=<zookeeper_host>:2181
spark.deploy.zookeeper.dir=/spark
Spark Standalone - Using ZooKeeper for High-Availability of Master
Tip: Read Recovery Mode to know the theory.
You’re going to start two standalone Masters.
You’ll need 4 terminals (adjust addresses as needed):
Start ZooKeeper.
Create a configuration file ha.conf containing the three spark.deploy.* properties listed at the top of this page.
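One way to create the file from the shell is a heredoc. This is a minimal sketch: it assumes ZooKeeper listens on localhost:2181 and that you run it from the Spark home directory. ZOOKEEPER_HOST is a hypothetical variable introduced here for illustration; it stands in for the <zookeeper_host> placeholder.

```shell
# Write ha.conf in the current directory (assumed to be the Spark home).
# ZOOKEEPER_HOST is a hypothetical variable; defaulting to localhost is an assumption.
ZOOKEEPER_HOST="${ZOOKEEPER_HOST:-localhost}"

cat > ha.conf <<EOF
spark.deploy.recoveryMode=ZOOKEEPER
spark.deploy.zookeeper.url=${ZOOKEEPER_HOST}:2181
spark.deploy.zookeeper.dir=/spark
EOF
```

Both Masters must be started with the same file so that they use the same ZooKeeper directory.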
Start the first standalone Master.
$ ./sbin/start-master.sh -h localhost -p 7077 --webui-port 8080 --properties-file ha.conf
Start the second standalone Master.
Note: It is not possible to start another instance of standalone Master on the same machine using ./sbin/start-master.sh as shipped. The script assumes one instance per machine: it passes a fixed instance number of 1 to spark-daemon.sh. We're going to copy the script and change that number to 2.
$ cp ./sbin/start-master{,-2}.sh
$ grep "CLASS 1" ./sbin/start-master-2.sh
"${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS 1 \
$ sed -i -e 's/CLASS 1/CLASS 2/' sbin/start-master-2.sh
$ grep "CLASS 1" ./sbin/start-master-2.sh
$ grep "CLASS 2" ./sbin/start-master-2.sh
"${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS 2 \
$ ./sbin/start-master-2.sh -h localhost -p 17077 --webui-port 18080 --properties-file ha.conf
You can check how many instances are currently running using the jps command:
$ jps -lm
5024 sun.tools.jps.Jps -lm
4994 org.apache.spark.deploy.master.Master --ip japila.local --port 7077 --webui-port 8080 -h localhost -p 17077 --webui-port 18080 --properties-file ha.conf
4808 org.apache.spark.deploy.master.Master --ip japila.local --port 7077 --webui-port 8080 -h localhost -p 7077 --webui-port 8080 --properties-file ha.conf
4778 org.apache.zookeeper.server.quorum.QuorumPeerMain config/zookeeper.properties
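To turn that check into a single number, the jps output can be filtered for the Master main class. The helper below is a small sketch; count_masters is a hypothetical name, not part of Spark.

```shell
# Count standalone Master JVMs in `jps -lm` output read from stdin.
# count_masters is a hypothetical helper name used for illustration.
# Usage: jps -lm | count_masters
count_masters() {
  # -F matches the class name literally; -c prints the number of matching lines.
  grep -c -F 'org.apache.spark.deploy.master.Master'
}
```

With both Masters up, the count should be 2.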
Start a standalone Worker.
$ ./sbin/start-slave.sh spark://localhost:7077,localhost:17077
Start Spark shell.
$ ./bin/spark-shell --master spark://localhost:7077,localhost:17077
Wait till the Spark shell connects to an active standalone Master.
Find out which standalone Master is active (there can only be one). Kill it. Observe how the other standalone Master takes over and lets the Spark shell register with itself. Check out the master’s UI.
Optionally, kill the worker and make sure it goes away instantly in the active master's logs.
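Besides looking at each master's UI in a browser, the active Master can be found from the command line. The sketch below assumes the Master web UI serves its state as JSON under /json/ with a "status" field (ALIVE or STANDBY); verify this against your Spark version. master_status is a hypothetical helper name.

```shell
# Extract the Master status from the web UI's JSON state read from stdin.
# Assumption: the JSON contains a field like  "status" : "ALIVE".
# master_status is a hypothetical helper name used for illustration.
# Usage: curl -s http://localhost:8080/json/  | master_status
#        curl -s http://localhost:18080/json/ | master_status
master_status() {
  grep -o '"status" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}
```

Run it against both web UI ports before and after killing the active Master to watch the standby take over.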