public List<String> buildCommand(Map<String, String> env)
SparkSubmitCommandBuilder Command Builder
SparkSubmitCommandBuilder is used to build a command that spark-submit and SparkLauncher use to launch a Spark application.
SparkSubmitCommandBuilder uses the first argument to distinguish between shells:
- pyspark-shell-main
- sparkr-shell-main
- run-example
Caution: FIXME Describe run-example
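The first-argument check can be pictured with a minimal sketch. The class, enum, and method names below are hypothetical and are not taken from Spark's launcher code; only the three argument values come from the list above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: FirstArgDispatchSketch, LaunchTarget and classify are
// illustrative names only, not part of Spark's launcher API.
final class FirstArgDispatchSketch {
    enum LaunchTarget { PYSPARK_SHELL, SPARKR_SHELL, RUN_EXAMPLE, REGULAR }

    // Inspects only the first argument, as described above.
    static LaunchTarget classify(List<String> args) {
        if (args.isEmpty()) {
            return LaunchTarget.REGULAR;
        }
        switch (args.get(0)) {
            case "pyspark-shell-main": return LaunchTarget.PYSPARK_SHELL;
            case "sparkr-shell-main":  return LaunchTarget.SPARKR_SHELL;
            case "run-example":        return LaunchTarget.RUN_EXAMPLE;
            default:                   return LaunchTarget.REGULAR;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList("run-example", "SparkPi")));
        System.out.println(classify(Arrays.asList("--master", "local[*]")));
    }
}
```

Any other first argument is treated here as a regular spark-submit invocation, which is an assumption of the sketch.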
SparkSubmitCommandBuilder parses command-line arguments using OptionParser (which is a SparkSubmitOptionParser). OptionParser comes with the following methods:
- handle to handle the known options (see the table below). It sets up the master, deployMode, propertiesFile, conf, mainClass, and sparkArgs internal properties.
- handleUnknown to handle unrecognized options, which usually lead to an Unrecognized option error message.
- handleExtraArgs to handle extra arguments, which are considered a Spark application’s arguments.
Note: For spark-shell it assumes that the application arguments come after spark-submit's arguments.
SparkSubmitCommandBuilder.buildCommand / buildSparkSubmitCommand
Note: buildCommand is a part of the AbstractCommandBuilder public API.
SparkSubmitCommandBuilder.buildCommand simply delegates to the buildSparkSubmitCommand private method (unless it was executed for the pyspark or sparkr scripts, which this document does not cover).
buildSparkSubmitCommand Internal Method
private List<String> buildSparkSubmitCommand(Map<String, String> env)
buildSparkSubmitCommand starts by building the so-called effective config (see getEffectiveConfig). When in client mode, buildSparkSubmitCommand adds spark.driver.extraClassPath to the resulting Spark command.
Note: Use spark-submit to have spark.driver.extraClassPath in effect.
buildSparkSubmitCommand builds the first part of the Java command passing in the extra classpath (only for client deploy mode).
Caution: FIXME Add isThriftServer case.
buildSparkSubmitCommand appends SPARK_SUBMIT_OPTS and SPARK_JAVA_OPTS environment variables.
(only for client deploy mode) …
Caution: FIXME Elaborate on the client deploy mode case.
addPermGenSizeOpt case…elaborate
Caution: FIXME Elaborate on addPermGenSizeOpt
buildSparkSubmitCommand appends org.apache.spark.deploy.SparkSubmit and the command-line arguments (using buildSparkSubmitArgs).
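The overall shape of the resulting command can be sketched as follows. This is a simplified illustration under several assumptions, not Spark's implementation: the class and method names are hypothetical, the environment is passed in as a plain map, the Java executable and the -cp handling are reduced to their bare minimum, and option strings are split on whitespace without any quote handling. The isThriftServer and addPermGenSizeOpt cases are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of how the command line could be assembled.
final class SubmitCommandSketch {
    static List<String> build(Map<String, String> effectiveConfig,
                              Map<String, String> environment,
                              boolean clientMode,
                              List<String> submitArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");

        // The extra classpath is only passed in client deploy mode.
        String extraClassPath =
            clientMode ? effectiveConfig.get("spark.driver.extraClassPath") : null;
        if (extraClassPath != null && !extraClassPath.isEmpty()) {
            cmd.add("-cp");
            cmd.add(extraClassPath);
        }

        // Options taken from the two environment variables named above.
        addOptions(cmd, environment.get("SPARK_SUBMIT_OPTS"));
        addOptions(cmd, environment.get("SPARK_JAVA_OPTS"));

        // Finally the main class and the spark-submit arguments.
        cmd.add("org.apache.spark.deploy.SparkSubmit");
        cmd.addAll(submitArgs);
        return cmd;
    }

    // Assumption: naive whitespace splitting, no support for quoted values.
    private static void addOptions(List<String> cmd, String opts) {
        if (opts != null && !opts.trim().isEmpty()) {
            cmd.addAll(Arrays.asList(opts.trim().split("\\s+")));
        }
    }
}
```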
buildSparkSubmitArgs method
List<String> buildSparkSubmitArgs()
buildSparkSubmitArgs builds a list of command-line arguments for spark-submit.
buildSparkSubmitArgs uses a SparkSubmitOptionParser to add the command-line arguments that spark-submit recognizes (when it is executed later on and uses the very same SparkSubmitOptionParser parser to parse command-line arguments).
| SparkSubmitCommandBuilder Property | SparkSubmitOptionParser Attribute |
|---|---|
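As a rough illustration of how internal properties turn into spark-submit arguments, the sketch below handles only a small subset of properties. The class and method names are hypothetical, the selection and ordering of options is an assumption, and the real method goes through SparkSubmitOptionParser attributes rather than hard-coded strings. The flags used (--master, --deploy-mode, --conf, --class) are standard spark-submit options.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily reduced sketch of building spark-submit arguments.
final class SubmitArgsSketch {
    static List<String> buildArgs(String master,
                                  String deployMode,
                                  Map<String, String> conf,
                                  String mainClass,
                                  String appResource,
                                  List<String> appArgs) {
        List<String> args = new ArrayList<>();
        if (master != null) {
            args.add("--master");
            args.add(master);
        }
        if (deployMode != null) {
            args.add("--deploy-mode");
            args.add(deployMode);
        }
        // Every configuration entry becomes its own --conf key=value pair.
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        if (mainClass != null) {
            args.add("--class");
            args.add(mainClass);
        }
        // The application resource and its arguments come last.
        if (appResource != null) {
            args.add(appResource);
        }
        args.addAll(appArgs);
        return args;
    }
}
```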
getEffectiveConfig Internal Method
Map<String, String> getEffectiveConfig()
getEffectiveConfig builds effectiveConfig, which is conf with the entries of the Spark properties file added (loaded using the loadPropertiesFile internal method). Keys that are already in conf (set when the command-line options were parsed in the handle method) are skipped.
Note: Command-line options (e.g. --driver-class-path) have higher precedence than their corresponding Spark settings in a Spark properties file (e.g. spark.driver.extraClassPath). You can therefore control the final settings by overriding Spark settings on the command line using the command-line options.
… charset and trims white spaces around values.
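The precedence rule amounts to a merge in which existing keys win. A minimal sketch, assuming the properties file has already been loaded into a java.util.Properties object (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of the merge; not Spark's actual implementation.
final class EffectiveConfigSketch {
    static Map<String, String> effectiveConfig(Map<String, String> conf,
                                               Properties fileProps) {
        // Start from the settings gathered from the command line.
        Map<String, String> effective = new HashMap<>(conf);
        // Add file settings only for keys that are not set yet.
        for (String key : fileProps.stringPropertyNames()) {
            effective.putIfAbsent(key, fileProps.getProperty(key));
        }
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.driver.extraClassPath", "/from/cli.jar");

        Properties fileProps = new Properties();
        fileProps.setProperty("spark.driver.extraClassPath", "/from/file.jar");
        fileProps.setProperty("spark.master", "local[*]");

        // The command-line value survives; spark.master comes from the file.
        System.out.println(effectiveConfig(conf, fileProps));
    }
}
```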
isClientMode Internal Method
private boolean isClientMode(Map<String, String> userProps)
isClientMode checks master first (from the command-line options) and then the spark.master Spark property. The same order applies to deployMode and the spark.submit.deployMode Spark property.
Caution: FIXME Review master and deployMode. How are they set?
isClientMode returns true when no master is set explicitly, or when the deploy mode is explicitly set to client.
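The lookup order and the two conditions named above can be sketched as follows. The class name, the helper, and the method signature are hypothetical, and the sketch encodes only the two conditions stated in this section. The actual method may treat further combinations (for example a master with no deploy mode) as client mode.

```java
import java.util.Map;

// Hypothetical sketch; covers only the conditions described in the text.
final class ClientModeSketch {
    // Returns the first candidate that is neither null nor empty.
    static String firstNonEmpty(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.isEmpty()) {
                return candidate;
            }
        }
        return null;
    }

    static boolean isClientMode(String master,
                                String deployMode,
                                Map<String, String> userProps) {
        // Command-line values are consulted before the Spark properties.
        String userMaster = firstNonEmpty(master, userProps.get("spark.master"));
        String userDeployMode =
            firstNonEmpty(deployMode, userProps.get("spark.submit.deployMode"));
        // Assumption: only these two conditions are modelled.
        return userMaster == null || "client".equals(userDeployMode);
    }
}
```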
OptionParser
OptionParser is a custom SparkSubmitOptionParser that SparkSubmitCommandBuilder uses to parse command-line arguments. It defines all the SparkSubmitOptionParser callbacks, i.e. handle, handleUnknown, and handleExtraArgs, for command-line argument handling.
OptionParser’s handle Callback
boolean handle(String opt, String value)
OptionParser comes with a custom handle callback (from the SparkSubmitOptionParser callbacks).
| Command-Line Option | Property / Behaviour |
|---|---|
| … | … |
| … | … |
| … | … |
| … | Sets … |
| … | Sets … |
| … | Sets … |
| … | Sets … |
| … | Expects a … |
| … | Sets … It may also set … |
| … | Disables … |
| … | Disables … |
| … | Disables … |
| anything else | Adds an element to … |
OptionParser’s handleUnknown Method
boolean handleUnknown(String opt)
If allowsMixedArguments is enabled, handleUnknown simply adds the input opt to appArgs and allows for further parsing of the argument list.
Caution: FIXME Where’s allowsMixedArguments enabled?
If isExample is enabled, handleUnknown sets mainClass to org.apache.spark.examples.[opt] (unless the input opt already has the package prefix) and stops further parsing of the argument list.
Caution: FIXME Where’s isExample enabled?
Otherwise, handleUnknown sets appResource and stops further parsing of the argument list.
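The three branches of handleUnknown can be summarized in a short sketch. The class below is hypothetical and keeps the state as plain fields. It assumes the boolean return value means "continue parsing" (true) or "stop parsing" (false), which matches the description above but is stated here as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the branching described above.
final class HandleUnknownSketch {
    static final String EXAMPLE_CLASS_PREFIX = "org.apache.spark.examples.";

    boolean allowsMixedArguments;
    boolean isExample;
    String mainClass;
    String appResource;
    final List<String> appArgs = new ArrayList<>();

    // Assumption: true continues parsing, false stops it.
    boolean handleUnknown(String opt) {
        if (allowsMixedArguments) {
            // Treat the option as an application argument and keep parsing.
            appArgs.add(opt);
            return true;
        }
        if (isExample) {
            // Prepend the examples package unless it is already present.
            mainClass = opt.startsWith(EXAMPLE_CLASS_PREFIX)
                ? opt
                : EXAMPLE_CLASS_PREFIX + opt;
            return false;
        }
        // Otherwise the option is taken to be the application resource.
        appResource = opt;
        return false;
    }
}
```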