Popular "amazon-emr" questions | Page 3

I am creating a job to parse massive amounts of server data, and then upload it into a Redshift database. …

python amazon-s3 apache-spark pyspark amazon-emr

We are running Spark 2.3.0 on AWS EMR. The following DataFrame "df" is non-empty and of modest size: scala> …

apache-spark amazon-emr

After running a Spark job on an Amazon EMR cluster, I deleted the output files directly from S3 and tried …

amazon-s3 pyspark amazon-emr

Currently I have a Hive 0.7 instance on Amazon EMR. I am trying to create a duplicate of this instance on …

hadoop hive hdfs amazon-emr external-tables

I'm trying to install pyarrow on the master instance of my EMR cluster, but I always receive this error. […

python-3.x cmake pip amazon-emr pyarrow

I have a large (about 85 GB compressed) gzipped file from S3 that I am trying to process with Spark on …

apache-spark gzip amazon-emr

I am using the latest AWS Hive version, 0.13.0. FAILED: ParseException: cannot recognize input near 'exchange' 'string' ',' in column specification …

hadoop amazon-web-services hive amazon-emr hadoop-partitioning

Does anyone know of a Scala SDK for Amazon Web Services? I am particularly interested in the EMR jobs.

scala amazon-web-services emr amazon-emr

I'm using the AWS CLI to create a cluster, taking the parameters from a JSON file. Here is …

amazon-web-services aws-cli amazon-emr

I want to be able to create EMR clusters, and for those clusters to send messages back to some central …

amazon-web-services hadoop amazon-emr
