Popular "emr" questions | Page 2

I have run into a problem where I have Parquet data as daily chunks in S3 (in the form of …

apache-spark apache-spark-sql spark-dataframe emr parquet

I'm trying to run a (py)Spark job on EMR that will process a large amount of data. Currently my …

amazon-web-services apache-spark pyspark emr amazon-emr

I'm running an EMR cluster (version emr-4.2.0) for Spark using the Amazon-specific maximizeResourceAllocation flag as documented here. According to …

apache-spark yarn emr amazon-emr elastic-map-reduce

I'm not able to locate error logs or messages from println calls in Scala while running jobs on Spark in …

scala apache-spark emr

I am running some machine learning algorithms on an EMR Spark cluster. I am curious about which kind of instance to …

amazon-ec2 apache-spark emr

I'm getting this error. I tried to increase memory on cluster instances and in the executor and driver parameters without …

apache-spark yarn emr

I'm trying to maximize cluster usage for a simple task. Cluster is 1+2 x m3.xlarge, running Spark 1.3.1, Hadoop 2.4, Amazon AMI 3.7 …

apache-spark yarn emr

Does anyone know of a Scala SDK for Amazon Web Services? I am particularly interested in the EMR jobs.

scala amazon-web-services emr amazon-emr

I am trying to load a database with 1 TB of data into Spark on AWS using the latest EMR. And the …

apache-spark yarn emr

I need to set a custom environment variable in EMR to be available when running a Spark application. I have …

amazon-web-services hadoop apache-spark environment-variables emr
