We want to set AWS parameters that, in code, would be set via the SparkContext:
sc.hadoopConfiguration.set("fs.s3a.access.key", vault.user)
sc.hadoopConfiguration.set("fs.s3a.secret.key", vault.key)
However, we have a custom Spark launcher framework that requires all custom Spark configuration to be passed via --conf parameters to the spark-submit command line.
Is there a way to "notify" the SparkContext to apply certain --conf values to its hadoopConfiguration rather than to its general SparkConf? Looking for something along the lines of:
spark-submit --conf hadoop.fs.s3a.access.key=$vault.user --conf hadoop.fs.s3a.secret.key=$vault.key
or
spark-submit --conf hadoopConfiguration.fs.s3a.access.key=$vault.user --conf hadoopConfiguration.fs.s3a.secret.key=$vault.key
You need to prefix Hadoop configs with spark.hadoop. on the command line (or in the SparkConf object); Spark copies any property with that prefix into the Hadoop configuration. For example:
spark.hadoop.fs.s3a.access.key=value
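Applied to the question above, a minimal sketch of the spark-submit invocation could look like the following, assuming the credentials are exposed as shell variables VAULT_USER and VAULT_KEY and the application jar is app.jar (all three names are placeholders for whatever your launcher framework provides):

spark-submit \
  --conf spark.hadoop.fs.s3a.access.key=$VAULT_USER \
  --conf spark.hadoop.fs.s3a.secret.key=$VAULT_KEY \
  --class com.example.Main app.jar

Spark strips the spark.hadoop. prefix and copies the remaining key into the Hadoop Configuration, so inside the job sc.hadoopConfiguration.get("fs.s3a.access.key") returns the value without any explicit set calls in your code.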