Top "Elastic-map-reduce" questions

Amazon Elastic MapReduce (Amazon EMR) is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.

Deleting file/folder from Hadoop

I'm running an EMR Activity inside a Data Pipeline analyzing log files and I get the following error when my …

hadoop amazon-web-services amazon-s3 elastic-map-reduce

Backup AWS DynamoDB to S3

It has been suggested in the Amazon docs (http://aws.amazon.com/dynamodb/), among other places, that you can back up your …

amazon-s3 backup amazon-dynamodb elastic-map-reduce

Exporting Hive Table to an S3 bucket

I've created a Hive Table through an Elastic MapReduce interactive session and populated it from a CSV file like this: …

amazon-s3 hive elastic-map-reduce emr

Drop all partitions from a hive table?

How can I drop all partitions currently loaded in a Hive table? I can drop a single partition with alter …

hive elastic-map-reduce

Spark + EMR using Amazon's "maximizeResourceAllocation" setting does not use all cores/vcores

I'm running an EMR cluster (version emr-4.2.0) for Spark using the Amazon-specific maximizeResourceAllocation flag as documented here. According to …

apache-spark yarn emr amazon-emr elastic-map-reduce

Scheduling A Job on AWS EC2

I have a website running on AWS EC2. I need to create a nightly job that generates a sitemap file …

amazon-ec2 amazon-web-services cron jobs elastic-map-reduce

Use S3DistCp to copy file from S3 to EMR

I am struggling to find a way to use S3DistCp in my AWS EMR Cluster. Some old examples which …

amazon-s3 aws-sdk amazon-emr elastic-map-reduce s3distcp

Loading data with Hive, S3, EMR, and Recover Partitions

SOLVED: See Update #2 below for the 'solution' to this issue. In S3, I have some log*.gz files stored in …

hadoop amazon-s3 amazon-web-services hive elastic-map-reduce

Why does Yarn on EMR not allocate all nodes to running Spark jobs?

I'm running a job on Apache Spark on Amazon Elastic MapReduce (EMR). Currently I'm running on emr-4.1.0, which includes …

apache-spark yarn emr amazon-emr elastic-map-reduce