Popular "bigdata" questions | Page 7

We often hear about Apache Zeppelin, so a few questions come to mind: What is Apache Zeppelin? What …

apache-spark bigdata apache-zeppelin

I am working on a use case where I have to transfer data from an RDBMS to HDFS. We have done …

hadoop apache-spark-sql sqoop bigdata

I'm trying to play with the Reddit data on BigQuery and I want to see comments and replies in one …

sql subquery google-bigquery reddit bigdata

I can't understand reduceByKey(_ + _) in the first example of Spark with Scala: object WordCount { def main(args: Array[String]): Unit = { …

scala apache-spark word-count bigdata
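
For the reduceByKey(_ + _) question above, here is a rough sketch of the idea in plain Python rather than Spark or Scala. In Scala, `_ + _` is shorthand for a two-argument function that adds its arguments, and reduceByKey applies such a function to merge all values that share a key. The helper `reduce_by_key` and the sample sentence are hypothetical, made up for illustration; real Spark does the merging across partitions, which this sketch does not model.

```python
from collections import defaultdict
from functools import reduce


def reduce_by_key(pairs, func):
    """Hypothetical helper: merge all values sharing a key with a two-argument function."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Fold each key's list of values down to a single value.
    return {key: reduce(func, values) for key, values in grouped.items()}


# Word count: map every word to (word, 1), then add up the ones per word.
words = "to be or not to be".split()
pairs = [(word, 1) for word in words]

# lambda a, b: a + b plays the role of Scala's `_ + _`.
counts = reduce_by_key(pairs, lambda a, b: a + b)
print(counts)
```

Each word starts with a count of 1, and the adding function is applied pairwise to the counts under the same word until one total per word remains.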

Let's say we have a table with 6 million records. There are 16 integer columns and a few text columns. It is read-only …

arrays performance postgresql join bigdata

I have some experience with Apache Spark and Spark SQL. Recently I found the Apache Drill project. Could you describe to me what …

hadoop apache-spark bigdata apache-drill

I'm trying to create an internal (managed) table in Hive that can store my incremental log data. The table goes …

hadoop hive loaddata bigdata

I am using RStudio 0.97.320 (R 2.15.3) on Amazon EC2. My data frame has 200k rows and 12 columns. I am trying to …

performance r bigdata

The latest build of Hadoop provides mapred-site.xml.template. Do we need to create a new mapred-site.xml file using this? …

apache hadoop mapreduce bigdata

I have a set of n (~1000000) strings (DNA sequences) stored in a list trans. I have to find the minimum …

python algorithm bigdata hamming-distance
