Top "Hadoop" questions

Hadoop is an Apache open-source project that provides software for reliable and scalable distributed computing.

Connect from Java to Hive using JDBC

I'm trying to connect from Java to Hive server 1. I found a question some time ago in this forum but it …

java hadoop jdbc hive
How to overwrite the existing files using hadoop fs -copyToLocal command

Is there any way we can overwrite existing files while copying from HDFS using: hadoop fs -copyToLocal <HDFS PATH>…

hadoop
Write to multiple outputs by key Spark - one Spark job

How can you write to multiple outputs depending on the key using Spark in a single job? Related: Write to …

scala hadoop output hdfs apache-spark
data block size in HDFS, why 64MB?

The default data block size of HDFS/Hadoop is 64MB. The block size on disk is generally 4KB. What does 64…

database hadoop mapreduce block hdfs
Hadoop: Connecting to ResourceManager failed

After installing Hadoop 2.2 and trying to launch the pipes example, I've got the following error (the same error shows up after …

hadoop yarn
How to remove non-alphanumeric or non-numeric characters with Hive REGEXP_EXTRACT() function

I've been trying to figure out how to remove multiple non-alphanumeric or non-numeric characters, or return only the numeric characters …

regex hadoop hive etl
Reading HDFS and local files in Java

I want to read file paths irrespective of whether they are HDFS or local. Currently, I pass the local paths …

java hadoop mapreduce hdfs
Python read file as stream from HDFS

Here is my problem: I have a file in HDFS which can potentially be huge (=not enough to fit all …

python hadoop subprocess hdfs
Hive - month and year from timestamp column

Hi, I am trying to extract the month and year parts of a timestamp column in Hive using the below …

date hadoop hive sql-timestamp
Where does Hadoop store the logs of YARN applications?

I run Hortonworks' basic YARN application example. The application fails and I want to read the logs …

logging hadoop yarn