public List<String> buildCommand(Map<String, String> env)
SparkSubmitCommandBuilder Command Builder
SparkSubmitCommandBuilder is used to build a command that spark-submit and SparkLauncher use to launch a Spark application.
SparkSubmitCommandBuilder uses the first argument to distinguish between shells:
- pyspark-shell-main
- sparkr-shell-main
- run-example
Caution: FIXME Describe run-example
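The dispatch on the first argument can be pictured roughly as follows. This is a minimal sketch rather than the launcher's own code: the ShellDispatchSketch class, the Kind enum, and the classify method are hypothetical names introduced for illustration, and only the three marker strings come from the list above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these names exist in Spark.
final class ShellDispatchSketch {

  enum Kind { PYSPARK_SHELL, SPARKR_SHELL, RUN_EXAMPLE, SPARK_SUBMIT }

  // Looks only at the first argument, as the text above describes.
  static Kind classify(List<String> args) {
    if (args.isEmpty()) {
      return Kind.SPARK_SUBMIT;
    }
    switch (args.get(0)) {
      case "pyspark-shell-main": return Kind.PYSPARK_SHELL;
      case "sparkr-shell-main":  return Kind.SPARKR_SHELL;
      case "run-example":        return Kind.RUN_EXAMPLE;
      default:                   return Kind.SPARK_SUBMIT; // assumed fallback
    }
  }

  public static void main(String[] argv) {
    System.out.println(classify(Arrays.asList("run-example", "SparkPi")));
    System.out.println(classify(Arrays.asList("--master", "local[*]")));
  }
}
```

Anything that does not start with one of the three markers is treated here as a plain spark-submit invocation, which is an assumption of the sketch.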
SparkSubmitCommandBuilder parses command-line arguments using OptionParser (which is a SparkSubmitOptionParser). OptionParser comes with the following methods:
- handle to handle the known options (see the table below). It sets up the master, deployMode, propertiesFile, conf, mainClass, and sparkArgs internal properties.
- handleUnknown to handle unrecognized options, which usually lead to an Unrecognized option error message.
- handleExtraArgs to handle extra arguments that are considered a Spark application’s arguments.
Note: For spark-shell it assumes that the application arguments are after spark-submit's arguments.
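The three callbacks can be illustrated with a small stand-in parser. This is a sketch under stated assumptions, not SparkSubmitOptionParser itself: the CallbackParserSketch and RecordingParser classes, the parse loop, and the short list of value-taking options are all made up for the example. Only the callback names (handle, handleUnknown, handleExtraArgs) come from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the callback contract described above.
abstract class CallbackParserSketch {

  // Assumed subset of options that take a value; chosen only for the example.
  private static final List<String> KNOWN =
      Arrays.asList("--master", "--deploy-mode", "--class");

  abstract boolean handle(String opt, String value);
  abstract boolean handleUnknown(String opt);
  abstract void handleExtraArgs(List<String> extra);

  final void parse(List<String> args) {
    int idx = 0;
    while (idx < args.size()) {
      String arg = args.get(idx);
      boolean keepGoing;
      if (KNOWN.contains(arg)) {
        if (idx + 1 >= args.size()) {
          throw new IllegalArgumentException("Missing argument for option '" + arg + "'");
        }
        keepGoing = handle(arg, args.get(idx + 1));
        idx += 2;
      } else {
        keepGoing = handleUnknown(arg);
        idx += 1;
      }
      if (!keepGoing) {
        break; // a callback asked to stop option parsing
      }
    }
    // Whatever is left is handed over as the application's own arguments.
    handleExtraArgs(new ArrayList<>(args.subList(idx, args.size())));
  }
}

// Records what each callback saw.
final class RecordingParser extends CallbackParserSketch {
  String master;
  String appResource;
  final List<String> appArgs = new ArrayList<>();

  @Override boolean handle(String opt, String value) {
    if (opt.equals("--master")) {
      master = value;
    }
    return true;
  }

  @Override boolean handleUnknown(String opt) {
    appResource = opt; // first unrecognized token is taken as the application
    return false;      // stop parsing; the rest belongs to the application
  }

  @Override void handleExtraArgs(List<String> extra) {
    appArgs.addAll(extra);
  }
}
```

Parsing `--master local[2] app.jar --flag x` with RecordingParser leaves `--flag x` in appArgs, which matches the note that application arguments are expected after spark-submit's own.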
SparkSubmitCommandBuilder.buildCommand / buildSparkSubmitCommand
Note: buildCommand is part of the AbstractCommandBuilder public API.
SparkSubmitCommandBuilder.buildCommand simply passes calls on to the private buildSparkSubmitCommand method (unless it was executed for the pyspark or sparkr scripts, which are outside the scope of this document).
buildSparkSubmitCommand Internal Method
private List<String> buildSparkSubmitCommand(Map<String, String> env)
buildSparkSubmitCommand starts by building the so-called effective config. When in client mode, buildSparkSubmitCommand adds spark.driver.extraClassPath to the resulting Spark command.
Note: Use spark-submit to have spark.driver.extraClassPath in effect.
buildSparkSubmitCommand builds the first part of the Java command, passing in the extra classpath (only for client deploy mode).
Caution: FIXME Add isThriftServer case.
buildSparkSubmitCommand appends the SPARK_SUBMIT_OPTS and SPARK_JAVA_OPTS environment variables.
(only for client deploy mode) …
Caution: FIXME Elaborate on the client deploy mode case.
addPermGenSizeOpt case…elaborate
Caution: FIXME Elaborate on addPermGenSizeOpt
buildSparkSubmitCommand appends org.apache.spark.deploy.SparkSubmit and the command-line arguments (using buildSparkSubmitArgs).
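The order in which the pieces of the command are put together can be sketched as below. The sketch is heavily simplified and rests on several assumptions: the SubmitCommandSketch class and its build method are hypothetical, the environment is passed in as a plain map, the option strings are split on whitespace only, and Spark's own classpath entries are left out. Only the variable names SPARK_SUBMIT_OPTS and SPARK_JAVA_OPTS, the client-mode classpath rule, and the org.apache.spark.deploy.SparkSubmit class name come from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the assembly order described above.
final class SubmitCommandSketch {

  static List<String> build(Map<String, String> env,
                            boolean clientMode,
                            String extraClassPath,
                            List<String> submitArgs) {
    List<String> cmd = new ArrayList<>();
    cmd.add("java"); // assumed executable name

    // The extra classpath is only passed along in client deploy mode.
    if (clientMode && extraClassPath != null) {
      cmd.add("-cp");
      cmd.add(extraClassPath);
    }

    // JVM options taken from the two environment variables, in this order.
    for (String name : new String[] {"SPARK_SUBMIT_OPTS", "SPARK_JAVA_OPTS"}) {
      String opts = env.get(name);
      if (opts != null && !opts.trim().isEmpty()) {
        cmd.addAll(Arrays.asList(opts.trim().split("\\s+"))); // naive splitting
      }
    }

    // Main class first, then the spark-submit arguments.
    cmd.add("org.apache.spark.deploy.SparkSubmit");
    cmd.addAll(submitArgs);
    return cmd;
  }
}
```

Splitting on whitespace breaks quoted options that contain spaces, so a real implementation would need a proper option-string parser.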
buildSparkSubmitArgs method
List<String> buildSparkSubmitArgs()
buildSparkSubmitArgs builds a list of command-line arguments for spark-submit.
buildSparkSubmitArgs uses a SparkSubmitOptionParser to add the command-line arguments that spark-submit recognizes (when it is executed later on and uses the very same SparkSubmitOptionParser parser to parse command-line arguments).
SparkSubmitCommandBuilder Property | SparkSubmitOptionParser Attribute |
---|---|
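One way to picture buildSparkSubmitArgs is as the reverse of parsing: each builder property that is set gets written back as the matching spark-submit option. The following is a sketch under assumptions, not the actual method. The SubmitArgsSketch class, its fields, the addPair helper, and the order of the options are hypothetical; the option names used (--master, --deploy-mode, --conf, --class) are standard spark-submit options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; field names mirror the properties named in the text.
final class SubmitArgsSketch {
  String master;
  String deployMode;
  String mainClass;
  String appResource;
  final Map<String, String> conf = new LinkedHashMap<>();
  final List<String> appArgs = new ArrayList<>();

  List<String> buildArgs() {
    List<String> args = new ArrayList<>();
    addPair(args, "--master", master);
    addPair(args, "--deploy-mode", deployMode);
    // Every configuration entry becomes its own --conf key=value pair.
    for (Map.Entry<String, String> e : conf.entrySet()) {
      args.add("--conf");
      args.add(e.getKey() + "=" + e.getValue());
    }
    addPair(args, "--class", mainClass);
    if (appResource != null) {
      args.add(appResource);
    }
    // Application arguments always come last.
    args.addAll(appArgs);
    return args;
  }

  // Adds "option value" only when the value is set.
  private static void addPair(List<String> args, String option, String value) {
    if (value != null) {
      args.add(option);
      args.add(value);
    }
  }
}
```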
getEffectiveConfig Internal Method
Map<String, String> getEffectiveConfig()
getEffectiveConfig builds effectiveConfig, that is conf with the Spark properties file loaded (using the loadPropertiesFile internal method), skipping keys that have already been set (which happens when the command-line options are parsed in the handle method).
Note: Command-line options (e.g. --driver-class-path) have higher precedence than their corresponding Spark settings in a Spark properties file (e.g. spark.driver.extraClassPath). You can therefore control the final settings by overriding Spark settings on the command line using the command-line options.
charset and trims white spaces around values.
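The precedence rule amounts to a merge in which entries from the properties file only fill in keys that are still missing. A minimal sketch, assuming the properties file has already been loaded into a java.util.Properties object; the EffectiveConfigSketch class and its effectiveConfig method are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration of "command line wins over properties file".
final class EffectiveConfigSketch {

  static Map<String, String> effectiveConfig(Map<String, String> conf,
                                             Properties fileProps) {
    // Start from what the command-line options already put into conf.
    Map<String, String> effective = new HashMap<>(conf);
    for (String key : fileProps.stringPropertyNames()) {
      // Keys already set on the command line are skipped.
      if (!effective.containsKey(key)) {
        effective.put(key, fileProps.getProperty(key).trim());
      }
    }
    return effective;
  }
}
```

With spark.driver.extraClassPath present both in conf (set through --driver-class-path) and in the properties file, the value from conf survives, while keys found only in the file are added.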
isClientMode Internal Method
private boolean isClientMode(Map<String, String> userProps)
isClientMode checks master first (from the command-line options) and then the spark.master Spark property. Same with deployMode and spark.submit.deployMode.
Caution: FIXME Review master and deployMode. How are they set?
isClientMode returns true either when no master is set explicitly or when the client deploy mode is set explicitly.
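The lookup order and the two positive cases can be sketched as follows. This is a simplified illustration that covers only the cases named in the text, so the real method may take further cases into account. The ClientModeSketch class and the firstNonEmpty helper are hypothetical, and master and deployMode are passed in as parameters rather than read from fields.

```java
import java.util.Map;

// Hypothetical illustration limited to the cases described above.
final class ClientModeSketch {

  // Returns the first candidate that is neither null nor empty, or null.
  static String firstNonEmpty(String... candidates) {
    for (String c : candidates) {
      if (c != null && !c.isEmpty()) {
        return c;
      }
    }
    return null;
  }

  static boolean isClientMode(String master, String deployMode,
                              Map<String, String> userProps) {
    // Command-line values win; Spark properties are the fallback.
    String userMaster = firstNonEmpty(master, userProps.get("spark.master"));
    String userDeployMode =
        firstNonEmpty(deployMode, userProps.get("spark.submit.deployMode"));
    return userMaster == null || "client".equals(userDeployMode);
  }
}
```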
OptionParser
OptionParser is a custom SparkSubmitOptionParser that SparkSubmitCommandBuilder uses to parse command-line arguments. It defines all the SparkSubmitOptionParser callbacks, i.e. handle, handleUnknown, and handleExtraArgs, for command-line argument handling.
OptionParser’s handle Callback
boolean handle(String opt, String value)
OptionParser comes with a custom handle callback (from the SparkSubmitOptionParser callbacks).
Command-Line Option | Property / Behaviour |
---|---|
|
|
|
|
|
|
|
Sets |
|
Sets |
|
Sets |
|
Sets |
|
Expects a |
|
Sets It may also set |
|
Disables |
|
Disables |
|
Disables |
anything else |
Adds an element to |
OptionParser’s handleUnknown Method
boolean handleUnknown(String opt)
If allowsMixedArguments is enabled, handleUnknown simply adds the input opt to appArgs and allows for further parsing of the argument list.
Caution: FIXME Where’s allowsMixedArguments enabled?
If isExample is enabled, handleUnknown sets mainClass to be org.apache.spark.examples.[opt] (unless the input opt already has the package prefix) and stops further parsing of the argument list.
Caution: FIXME Where’s isExample enabled?
Otherwise, handleUnknown sets appResource and stops further parsing of the argument list.
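The three branches of handleUnknown can be put side by side in a short sketch. It follows the description above and leaves out any validation the real callback may perform. The UnknownOptionSketch class is a hypothetical container for the flags and properties that the text names, and the return value is assumed to mean "continue parsing".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the three branches described above.
final class UnknownOptionSketch {
  static final String EXAMPLE_PACKAGE = "org.apache.spark.examples.";

  boolean allowsMixedArguments;
  boolean isExample;
  String mainClass;
  String appResource;
  final List<String> appArgs = new ArrayList<>();

  // Returns true to continue parsing, false to stop (assumed convention).
  boolean handleUnknown(String opt) {
    if (allowsMixedArguments) {
      appArgs.add(opt);
      return true;
    }
    if (isExample) {
      // Prepend the examples package unless it is already there.
      mainClass = opt.startsWith(EXAMPLE_PACKAGE) ? opt : EXAMPLE_PACKAGE + opt;
      return false;
    }
    appResource = opt;
    return false;
  }
}
```

With isExample enabled, an input of SparkPi ends up as org.apache.spark.examples.SparkPi in mainClass, and a fully qualified name is kept unchanged.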