Top "Distributed-computing" questions

Distributed computing means utilizing more than one computer, connected to each other by a communication link, to accomplish a common task.

Spark - repartition() vs coalesce()

According to Learning Spark: Keep in mind that repartitioning your data is a fairly expensive operation. Spark also has an …

apache-spark distributed-computing rdd
Explaining Apache ZooKeeper

I am trying to understand ZooKeeper, how it works and what it does. Is there any application which is comparable …

apache-zookeeper distributed-computing
What is the difference between cache and persist?

In terms of RDD persistence, what are the differences between cache() and persist() in Spark?

apache-spark distributed-computing rdd
What determines Kafka consumer offset?

I am relatively new to Kafka. I have done a bit of experimenting with it, but a few things are …

java apache-kafka kafka-consumer-api distributed-computing
What are workers, executors, cores in Spark Standalone cluster?

I read Cluster Mode Overview and I still can't understand the different processes in the Spark Standalone cluster and the …

apache-spark distributed-computing
pyspark : NameError: name 'spark' is not defined

I am copying the pyspark.ml example from the official documentation website: http://spark.apache.org/docs/latest/api/python/…

apache-spark machine-learning pyspark distributed-computing apache-spark-ml
Is it possible to add partitions to an existing topic in Kafka 0.8.2?

I have a Kafka cluster running with 2 partitions. I was looking for a way to increase the partition count to 3. …

java apache-kafka distributed-computing
Flattening Rows in Spark

I am doing some testing for Spark using Scala. We usually read JSON files which need to be manipulated like …

scala apache-spark apache-spark-sql distributed-computing
Difference between cloud computing and distributed computing?

I wanted to know about the difference between cloud computing and distributed computing. I read an article about cloud computing …

cloud distributed-computing
Search/Find a file and file content in Hadoop

I am currently working on a project using Hadoop DFS. I notice there is no search or find command in …

file filesystems hadoop distributed distributed-computing