Popular "hadoop" questions | Page 6

What is the purpose of the shuffling and sorting phase in the reducer in MapReduce programming?

In MapReduce programming, the reduce phase has shuffling, sorting, and reduce as its sub-parts. Sorting is a costly affair. …

sorting hadoop mapreduce hdfs shuffle

Hive load CSV with commas in quoted fields

I am trying to load a CSV file into a Hive table like so: CREATE TABLE mytable ( num1 INT, text1 …

hadoop hbase hive hdfs delimiter

Apache Spark: The number of cores vs. the number of executors

I'm trying to understand the relationship of the number of cores and the number of executors when running a Spark …

hadoop apache-spark yarn

How to list all files in a directory and its subdirectories in hadoop hdfs

I have a folder in HDFS which has two subfolders, each of which has about 30 subfolders which, finally, each contain …

java hadoop hdfs

Hive ParseException - cannot recognize input near 'end' 'string'

I am getting the following error when trying to create a Hive table from an existing DynamoDB table: NoViableAltException(88@[]) at …

hadoop mapreduce hive bigdata amazon-dynamodb

How to copy data from one HDFS to another HDFS?

I have two HDFS setups and want to copy (not migrate or move) some tables from HDFS1 to HDFS2. How …

hadoop hdfs bigdata sqoop

What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism?

What's the difference between spark.sql.shuffle.partitions and spark.default.parallelism? I have tried to set both of them …

performance apache-spark hadoop apache-spark-sql

Is there any way to get the column names along with the output when executing a query in Hive?

In Hive, when we do a query (like: select * from employee), we do not get any column names in the …

hadoop hive rdbms

What are the pros and cons of parquet format compared to other formats?

Characteristics of Apache Parquet are: self-describing, columnar format, language-independent. In comparison to Avro, Sequence Files, RC File etc. I want …

file hadoop hdfs avro parquet

Explode the Array of Struct in Hive

Below is the Hive table: CREATE EXTERNAL TABLE IF NOT EXISTS SampleTable ( USER_ID BIGINT, NEW_ITEM ARRAY<…

hadoop mapreduce hive hiveql
