Top "aws-glue" questions

AWS Glue is a fully managed ETL (extract, transform, and load) service that can categorize your data, clean it, enrich it, and move it between various data stores.

AWS Glue pricing against AWS EMR

I am doing a pricing comparison of AWS Glue against AWS EMR so as to choose between EMR and Glue. …

amazon-web-services amazon-emr aws-glue
AWS Glue: crawler misinterprets timestamps as strings; Glue ETL meant to convert strings to timestamps makes them NULL

I have been playing around with AWS Glue for some quick analytics by following the tutorial here. While I have …

amazon-web-services amazon-s3 amazon-athena aws-glue
AWS Glue: transform a struct into a DynamicFrame

I am a little new to AWS Glue. I am working on transforming raw CloudWatch JSON output into CSV with …

python amazon-web-services aws-glue
Optional job parameter in AWS Glue?

How can I implement an optional parameter for an AWS Glue job? I have created a job that currently has …

python amazon-web-services aws-glue
AWS Glue write parquet with partitions

I am able to write to Parquet format, partitioned by a column, like so: jobname = args['JOB_NAME'] #header …

amazon-web-services apache-spark pyspark aws-glue
AWS Glue: How to add a column with the source filename in the output?

Does anyone know of a way to add the source filename as a column in a Glue job? We created …

amazon-web-services apache-spark pyspark aws-glue
How to partition data by datetime in AWS Glue?

The current set-up: S3 location with JSON files. All files are stored in the same location (no day/month/year structure). …

amazon-web-services etl aws-glue aws-glue-data-catalog
AWS Glue Access denied for crawler with administrator policy attached

I am trying to run a crawler across an S3 data store in my account which contains two CSV files. However, …

amazon-s3 aws-glue
Glue job for Redshift connection: "Unable to find suitable security group"

I'm trying to set up an AWS Glue job and make a connection to Redshift. I'm getting an error when I …

python amazon-web-services jdbc amazon-redshift aws-glue