Top "Partitioning" questions

Partitioning is a performance strategy in which you divide a potentially very large set of data into a number of smaller sets.
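
As a rough illustration of the idea, here is a minimal Python sketch of two common ways to partition in-memory data: fixed-size chunks, and buckets chosen by a key. The helper names `chunk` and `partition_by_key` are hypothetical and not taken from any library; real systems such as Spark, MySQL, or Oracle each have their own partitioning mechanisms.

```python
from collections import defaultdict


def chunk(items, n):
    """Split a list into consecutive sublists of at most n items."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [items[i:i + n] for i in range(0, len(items), n)]


def partition_by_key(rows, key, num_partitions):
    """Group rows into buckets by key(row) modulo num_partitions.

    Assumes key(row) returns an integer; only non-empty buckets are returned.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[key(row) % num_partitions].append(row)
    return dict(buckets)


print(chunk([1, 2, 3, 4, 5, 6, 7], 3))                  # [[1, 2, 3], [4, 5, 6], [7]]
print(partition_by_key(range(6), lambda x: x, 2))       # {0: [0, 2, 4], 1: [1, 3, 5]}
```

Chunking keeps the original order and bounds the size of each group, while key-based bucketing keeps all rows with the same key in the same group, which is usually what lets a query touch only one partition.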

How to define partitioning of DataFrame?

I've started using Spark SQL and DataFrames in Spark 1.4.0. I want to define a custom partitioner on DataFrames, in Scala, …

scala apache-spark dataframe apache-spark-sql partitioning
Efficient way to divide a list into lists of n size

I have an ArrayList, which I want to divide into smaller Lists of n size, and perform an operation on …

java arraylist partitioning
Is Zookeeper a must for Kafka?

In Kafka, I would like to use only a single broker, a single topic, and a single partition with one producer …

partitioning apache-zookeeper producer-consumer apache-kafka broker
In Oracle SQL, can I query a partition of a table instead of an entire table to make it run faster?

I would like to query a table with a million records for customers named 'FooBar' that have records dated on 7…

sql oracle select syntax partitioning
Pandas: Sampling a DataFrame

I'm trying to read a fairly large CSV file with Pandas and split it up into two random chunks, one …

python partitioning pandas
Handling very large data with MySQL

Sorry for the long post! I have a database containing ~30 tables (InnoDB engine). Only two of these tables, namely, "transaction" …

mysql database performance indexing partitioning
Table with 80 million records and adding an index takes more than 18 hours (or forever)! Now what?

A short recap of what happened. I am working with 71 million records (not much compared to billions of records processed …

mysql database database-design partitioning
How to partition and write DataFrame in Spark without deleting partitions with no new data?

I am trying to save a DataFrame to HDFS in Parquet format using DataFrameWriter, partitioned by three column values, like …

apache-spark spark-dataframe partitioning parquet
Table partitioning using 2 columns

Is it possible to partition a table using 2 columns instead of only 1 for the partition function? Consider a table with 3 …

sql-server-2008 partitioning
How to select rows from partition in MySQL

I partitioned my 300MB table and am trying to run a select query against the p0 partition with this command: SELECT * FROM …

mysql sql partitioning database-partitioning mysql-5.1