Top "Parquet" questions

Apache Parquet is a columnar storage format for Hadoop.

How to read a Parquet file into a Pandas DataFrame?

How to read a modestly sized Parquet dataset into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure …

python pandas parquet blaze
What are the pros and cons of parquet format compared to other formats?

Characteristics of Apache Parquet are: self-describing, columnar format, language-independent. In comparison to Avro, Sequence Files, RC File etc. I want …

file hadoop hdfs avro parquet
How to convert a csv file to parquet

I'm new to Big Data. I need to convert a CSV/TXT file to Parquet format. I searched a lot but …

java parquet
Inspect Parquet from command line

How do I inspect the content of a Parquet file from the command line? The only option I see now …

parquet
Unable to infer schema when loading Parquet file

response = "mi_or_chd_5" outcome = sqlc.sql("""select eid,{response} as response from outcomes where {response} IS NOT NULL""".format(…

apache-spark pyspark parquet
Avro vs. Parquet

I'm planning to use one of the Hadoop file formats for my Hadoop-related project. I understand Parquet is efficient …

hadoop avro parquet
How to view Apache Parquet file in Windows?

I couldn't find any plain English explanations regarding Apache Parquet files. Such as: What are they? Do I need Hadoop …

java .net parquet
How do I get schema / column names from parquet file?

I have a file stored in HDFS as part-m-00000.gz.parquet. I've tried to run hdfs dfs -text dir/part-m-00000.…

hadoop apache-pig hdfs parquet
Parquet vs ORC vs ORC with Snappy

I am running a few tests on the storage formats available with Hive and using Parquet and ORC as major …

hadoop hive parquet snappy orc
Reading DataFrame from partitioned parquet file

How do I read partitioned Parquet with a condition as a DataFrame? This works fine: val dataframe = sqlContext.read.parquet("file:///home/msoproj/…

scala apache-spark parquet spark-dataframe