Top "Orc" questions

The Optimized Row Columnar (ORC) file format provides a highly efficient way to store Hive data.

Parquet vs ORC vs ORC with Snappy

I am running a few tests on the storage formats available with Hive and using Parquet and ORC as major …

hadoop hive parquet snappy orc

Aggregating multiple columns with custom function in Spark

I was wondering if there is some way to specify a custom aggregation function for Spark DataFrames over multiple columns. …

scala apache-spark dataframe apache-spark-sql orc

Spark: Save Dataframe in ORC format

In the previous version, we used to have a 'saveAsOrcFile()' method on RDD. This is now gone! How do …

scala apache-spark apache-spark-sql orc

How to read an ORC file stored locally in Python Pandas?

Can I think of an ORC file as similar to a CSV file with column headings and row labels containing …

python pandas pyspark data-science orc

Reading an ORC file in Java

How do you read an ORC file in Java? I want to read in a small file for some unit …

java hadoop orc

CTAS with Dynamic Partition

I want to convert an existing table from text format to ORC format. I was able to do it …

hive partition orc

Hadoop ORC file - How it works - How to fetch metadata

I am new to ORC files. I went through many blogs but didn't get a clear understanding. Please help and clarify …

hadoop hive file-format orc