Top "Parquet" questions

Apache Parquet is a columnar storage format for Hadoop.

Reading specific partitions from a partitioned parquet dataset with pyarrow

I have a somewhat large (~20 GB) partitioned dataset in parquet format. I would like to read specific partitions from the …

python parquet pyarrow apache-arrow
Offloading data files from Amazon Redshift to Amazon S3 in Parquet format

I would like to unload data files from Amazon Redshift to Amazon S3 in Apache Parquet format in order to query …

amazon-redshift parquet amazon-athena amazon-redshift-spectrum
Parquet vs Cassandra using Spark and DataFrames

I am facing a dilemma: I cannot decide which solution is going to be better for me. I …

apache-spark cassandra spark-dataframe parquet
Read parquet data from AWS s3 bucket

I need to read parquet data from AWS S3. If I use the AWS SDK for this, I can get an InputStream like …

java amazon-web-services amazon-s3 parquet
Hive doesn't read partitioned parquet files generated by Spark

I'm having a problem reading partitioned parquet files generated by Spark in Hive. I'm able to create the external …

apache-spark hive partitioning partition parquet
Pandas: Reading first n rows from parquet file?

I have a parquet file and I want to read the first n rows from the file into a pandas data …

python pandas parquet