Top "Data-partitioning" questions

Data partitioning is the division of a collection of data into smaller collections to allow faster processing, easier statistics gathering, and a smaller memory or persistence footprint.
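
As a rough illustration of the idea described above, the following minimal sketch splits a list into fixed-size partitions and then processes each partition independently. The function name `partition_by_size` and the sample data are hypothetical and chosen only for this example; real systems such as Spark or Azure Table Storage partition by key, hash, or range rather than by position.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def partition_by_size(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive partitions of at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the sequence in strides of `size`; the last
    # partition may be shorter if len(items) is not a multiple of size.
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical sample data: ten integers split into partitions of four.
chunks = list(partition_by_size(list(range(10)), 4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Each partition can now be handled on its own, e.g. per-partition sums.
partition_sums = [sum(chunk) for chunk in chunks]
print(partition_sums)  # [6, 22, 17]
```

Because each partition is an independent list, per-partition work (here, a sum) could be handed to separate workers or written to separate files without touching the rest of the data.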

Spark SQL - Difference between df.repartition and DataFrameWriter partitionBy?

What is the difference between DataFrame repartition() and DataFrameWriter partitionBy() methods? I hope both are used to "partition data based …

apache-spark-sql data-partitioning
C# - elegant way of partitioning a list?

I'd like to partition a list into a list of lists, by specifying the number of elements in each partition. …

c# list data-partitioning
QuickSort and Hoare Partition

I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code …

c algorithm sorting quicksort data-partitioning
Creating data partition in R

With caret package, when creating data partition 75% training and 25% test, we use: inTrain<- createDataPartition(y=spam$type,p=0.75, …

r partitioning r-caret data-partitioning
Querying Windows Azure Table Storage with multiple query criteria

I'm trying to query a table in Windows Azure storage and was initially using the TableQuery.CombineFilters in the TableQuery&…

azure azure-table-storage data-partitioning
How does createDataPartition function from caret package split data?

From the documentation: For bootstrap samples, simple random sampling is used. For other data splitting, the random sampling is done …

r subset r-caret data-partitioning
Python equivalent of filter() getting two output lists (i.e. partition of a list)

Let's say I have a list, and a filtering function. Using something like >>> filter(lambda x: x &…

python filter data-partitioning
How many different partitions with exactly n parts can be made of a set with k-elements?

How many different partitions with exactly two parts can be made of the set {1,2,3,4}? There are 4 elements in this list …

math discrete-mathematics data-partitioning
How to partition a set of values (vector) in R

I'm programming in R. I've got a vector containing, let's say, 1000 values. Now let's say I want to partition these 1000 …

r vector set data-partitioning
Using jq how can I split a very large JSON file into multiple files, each a specific quantity of objects?

I have a large JSON file with I'm guessing 4 million objects. Each top level has a few levels nested inside. …

json jq data-partitioning