Top "Hadoop" questions

Hadoop is an Apache open-source project that provides software for reliable and scalable distributed computing.

Deleting file/folder from Hadoop

I'm running an EMR Activity inside a Data Pipeline analyzing log files and I get the following error when my …

hadoop amazon-web-services amazon-s3 elastic-map-reduce
Search/Find a file and file content in Hadoop

I am currently working on a project using Hadoop DFS. I notice there is no search or find command in …

file filesystems hadoop distributed distributed-computing
Hive installation issues: Hive metastore database is not initialized

I tried to install Hive on a Raspberry Pi 2. I installed Hive by uncompressing the zipped Hive package and configuring $HADOOP_…

hadoop installation hive derby
Parquet vs ORC vs ORC with Snappy

I am running a few tests on the storage formats available with Hive and using Parquet and ORC as major …

hadoop hive parquet snappy orc
Using Sqoop to import data from MySQL to Hive

I am using Sqoop (version 1.4.4) to import data from MySQL to Hive. The data will be a subset of one …

mysql hadoop hive sqoop
What should be hadoop.tmp.dir ?

Hadoop has a configuration parameter, hadoop.tmp.dir, which, as per the documentation, is "A base for other temporary directories." I presume, …

hadoop hdfs config
COLLECT_SET() in Hive, keep duplicates?

Is there a way to keep the duplicates in a collected set in Hive, or simulate the sort of aggregate …

java hadoop user-defined-functions hive
How to specify username when putting files on HDFS from a remote machine

I have a Hadoop cluster set up and working under a common default username "user1". I want to put files into …

hadoop username hdfs
How to find the location of a partition in Hive?

If I write a Hive SQL statement like ALTER TABLE tbl_name ADD PARTITION (dt=20131023) LOCATION 'hdfs://path/to/tbl_name/…

sql hadoop hive
LeaseExpiredException: No lease error on HDFS

I am trying to load a large amount of data into HDFS and I sometimes get the error below. Any idea why? The …

hadoop hdfs