Top "Bigdata" questions

Big data is a concept that deals with data sets of extreme volumes.

Incremental PCA on big data

I just tried using the IncrementalPCA from sklearn.decomposition, but it threw a MemoryError just like the PCA and RandomizedPCA …

python scikit-learn bigdata hdf5 pca
How to load a large table into Tableau for data visualization?

I am able to connect Tableau to my database, but the table is really large. Every time I try …

sql-server database bigdata tableau-api
How can I perform full outer joins of large data sets in R?

I am trying to do data analysis in R on a group of medium-sized datasets. One of the analyses …

r bigdata outer-join sqldf ffbase
How big is "Bigdata"?

How much data does it take to be categorised as Bigdata? With what size of data can one decide …

hadoop mapreduce bigdata
iPad - Parsing an extremely large JSON file (between 50 and 100 MB)

I'm trying to parse an extremely big JSON file on an iPad. The file size will vary between 50 and 100 MB (there is …

ios json ipad core-data bigdata
Hadoop - How to kill a Tez job started by Hive?

Below is what I can find. But the problem is that if we reuse a JDBC Hive session, all the Hive queries …

hadoop hive yarn tez bigdata
How to transform a categorical variable in Spark into a set of columns coded as {0,1}?

I'm trying to perform a logistic regression (LogisticRegressionWithLBFGS) with Spark MLlib (with Scala) on a dataset which contains categorical variables. …

scala apache-spark bigdata apache-spark-mllib categorical-data
Django + Postgres + Large Time Series

I am scoping out a project with large, mostly incompressible time series data, and wondering if Django + Postgres with raw SQL …

python django postgresql heroku bigdata
How to change the Sqoop metastore?

I am using Sqoop version 1.4.2. I am trying to change the Sqoop metastore from the default HSQLDB to MySQL. I have …

hadoop hive bigdata sqoop sqoop2