Top "Bigdata" questions

Big data is a concept that deals with data sets of extreme volumes.

Calculating and saving space in PostgreSQL

I have a table in pg like so: CREATE TABLE t ( a BIGSERIAL NOT NULL, -- 8 b b SMALLINT, -- 2 …

postgresql database-design storage bigdata
Working with big data in Python and NumPy: not enough RAM, how to save partial results on disk?

I am trying to implement algorithms for 1000-dimensional data with 200k+ datapoints in Python. I want to use NumPy, SciPy, …

python arrays numpy scipy bigdata
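
One possible direction for the question above, offered as a rough sketch rather than the accepted answer: `numpy.memmap` keeps an array in a file on disk and loads only the slices that are touched. The function name `column_means_on_disk`, the chunk size, and the filler data (each row holds its own row index) are assumptions made purely for illustration.

```python
import os
import tempfile

import numpy as np


def column_means_on_disk(path, n_rows, n_cols, chunk_rows=1000):
    """Write a disk-backed array chunk by chunk, then reduce it chunk by chunk."""
    # mode="w+" creates (or overwrites) the backing file at `path`.
    data = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_rows, n_cols))
    for start in range(0, n_rows, chunk_rows):
        stop = min(start + chunk_rows, n_rows)
        # Hypothetical partial result: every row is filled with its row index.
        data[start:stop] = np.arange(start, stop, dtype=np.float64)[:, None]
    data.flush()  # push pending writes to disk
    del data      # drop the writable mapping

    # Reopen read-only and accumulate column sums one chunk at a time,
    # so only `chunk_rows` rows need to be in memory at once.
    readonly = np.memmap(path, dtype=np.float64, mode="r", shape=(n_rows, n_cols))
    total = np.zeros(n_cols)
    for start in range(0, n_rows, chunk_rows):
        total += readonly[start:start + chunk_rows].sum(axis=0)
    del readonly
    return total / n_rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(column_means_on_disk(os.path.join(tmp, "partial.dat"), 2500, 4))
```

The chunk size trades memory for I/O calls; whether this fits the asker's algorithms depends on details the excerpt does not show.
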
How to get array/bag of elements from Hive group by operator?

I want to group by a given field and get the output with grouped fields. Below is an example of …

sql hadoop hive apache-pig bigdata
POC for Hadoop in a real-time scenario

I have a bit of a problem. I want to learn about Hadoop and how I might use it to …

hadoop real-time bigdata hadoop-streaming
How can I save an RDD into HDFS and later read it back?

I have an RDD whose elements are of type (Long, String). For some reason, I want to save the whole …

scala apache-spark hdfs rdd bigdata
Error message: TOK_ALLCOLREF is not supported in current context - while using DISTINCT in Hive

I'm using the simple command: SELECT DISTINCT * FROM first_working_table; in Hive 0.11, and I'm receiving the following error message: …

sql hadoop hive distinct bigdata
Is there a maximum size for the string data type in Hive?

Googled a ton but haven't found it anywhere. Or does that mean Hive can support arbitrarily large string data type …

hadoop hive bigdata
How to sort word count by value in Hadoop?

Hi, I wanted to learn how to sort the word count by value in Hadoop. I know Hadoop takes of …

hadoop mapreduce bigdata partitioner
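
A commonly suggested approach for the question above is a second pass that swaps key and value, so the framework's sort on keys orders records by count. The sketch below only imitates that idea in plain Python; it is not Hadoop API code, and the names `count_words` and `sort_by_count` are hypothetical.

```python
from collections import Counter


def count_words(lines):
    """First pass: the usual word count, word -> number of occurrences."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def sort_by_count(counts):
    """Second pass: make the count the key, sort on it, then swap back."""
    # Emit (count, word) pairs, mirroring a mapper that swaps key and value.
    swapped = [(count, word) for word, count in counts.items()]
    # Highest count first; ties are broken alphabetically by word.
    swapped.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(word, count) for count, word in swapped]


if __name__ == "__main__":
    for word, count in sort_by_count(count_words(["a b a", "c b a"])):
        print(word, count)
```

In a real cluster a total ordering across reducers needs extra care (for example a single reducer or a range partitioner), which this in-memory sketch sidesteps.
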
How do I determine the size of my HBase tables? Is there any command to do so?

I have multiple tables on my HBase shell that I would like to copy onto my file system. Some tables …

hadoop export hbase bigdata
Recommended package for very large dataset processing and machine learning in R

It seems like R is really designed to handle datasets that it can pull entirely into memory. What R packages …

r machine-learning signal-processing bigdata