Apache Parquet is a columnar storage format for Hadoop.
I know we can load a parquet file using Spark SQL and using Impala, but I am wondering if we can do the …
Tags: hadoop, hive, apache-spark-sql, hiveql, parquet

I am trying to save a DataFrame to HDFS in Parquet format using DataFrameWriter, partitioned by three column values, like …
Tags: apache-spark, spark-dataframe, partitioning, parquet

I am looking for ways to read data from multiple partitioned directories from s3 using python. data_folder/serial_number=1/cur_…
Tags: python, parquet, pyarrow, fastparquet, python-s3fs

I am trying to convert a .csv file to a .parquet file. The csv file (Temp.csv) has the following …
Tags: python, csv, parquet

I need to read parquet files from multiple paths that are not parent or child directories. For example: dir1 --- | …
Tags: pyspark, parquet

Is there a way to create parquet files from Java? I have data in memory (java classes) and I want …
Tags: java, parquet

Currently we are using the Avro data format in production. Out of several good points of using Avro, we know that it …
Tags: apache-spark, hadoop, data-warehouse, avro, parquet

In Spark, what is the best way to control the file size of the output files? For example, in log4j, …
Tags: apache-spark, parquet

I have a DataFrame generated as follows: df.groupBy($"Hour", $"Category").agg(sum($"value").alias("TotalValue")).sort($"Hour".asc, $"TotalValue".…
Tags: scala, apache-spark, apache-spark-sql, spark-dataframe, parquet

I'd like to process Apache Parquet files (in my case, generated in Spark) in the R programming language. Is an …
Tags: r, apache-spark, parquet, sparkr