Top "Amazon-emr" questions

Amazon Elastic MapReduce (Amazon EMR) is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.

pyspark error does not exist in the jvm error when initializing SparkContext

I am using Spark on EMR and writing a PySpark script. I am getting an error when trying to from …

python python-3.x apache-spark pyspark amazon-emr
Spark UI on AWS EMR

I am running an AWS EMR cluster with Spark (1.3.1) installed via the EMR console dropdown. Spark is current and processing …

apache-spark amazon-emr
How do I increase Tez's container physical memory?

I've been running some Hive scripts on an AWS EMR 4.8 cluster with Hive 1.0 and Tez 0.8. My configurations look like this: …

hadoop hive amazon-emr apache-tez tez
How to add a jar in zeppelin?

How to add a jar in Zeppelin for the %hive interpreter? I have tried %z.dep(''); add jar <jar …

json jar hive amazon-emr apache-zeppelin
Can we consider AWS Glue as a replacement for EMR?

Just a quick question for the experts to clarify: since AWS Glue, as an ETL tool, can provide companies with benefits …

amazon-web-services etl amazon-emr aws-glue
Spark + EMR using Amazon's "maximizeResourceAllocation" setting does not use all cores/vcores

I'm running an EMR cluster (version emr-4.2.0) for Spark using the Amazon-specific maximizeResourceAllocation flag as documented here. According to …

apache-spark yarn emr amazon-emr elastic-map-reduce
How to execute spark submit on amazon EMR from Lambda function?

I want to execute a spark-submit job on an AWS EMR cluster based on a file upload event on S3. I …

amazon-web-services apache-spark aws-lambda amazon-emr spark-submit
Folder won't delete on Amazon S3

I'm trying to delete a folder created as a result of a MapReduce job. Other files in the bucket delete …

amazon-s3 amazon-web-services amazon-emr
How to launch and configure an EMR cluster using boto

I'm trying to launch a cluster and run a job all using boto. I find lots of examples of creating …

python amazon-web-services boto amazon-emr
Use S3DistCp to copy file from S3 to EMR

I am struggling to find a way to use S3DistCp in my AWS EMR Cluster. Some old examples which …

amazon-s3 aws-sdk amazon-emr elastic-map-reduce s3distcp