Sum-subset with a fixed subset size

PengOne picture PengOne · Jan 18, 2012 · Viewed 19.1k times · Source

The sum-subset problem states:

Given a set of integers, is there a non-empty subset whose sum is zero?

This problem is NP-complete in general. I'm curious if the complexity of this slight variant is known:

Given a set of integers, is there a subset of size k whose sum is zero?

For example, if k = 1, you can do a binary search to find the answer in O(log n). If k = 2, then you can get it down to O(n log n) (e.g. see Find a pair of elements from an array whose sum equals a given number). If k = 3, then you can do O(n^2) (e.g. see Finding three elements in an array whose sum is closest to a given number).

Is there a known bound that can be placed on this problem as a function of k?

As motivation, I was thinking about this question How do you partition an array into 2 parts such that the two parts have equal average? and trying to determine if it is actually NP-complete. The answer lies in whether or not there is a formula as described above.

Barring a general solution, I'd be very interested in knowing an optimal bound for k=4.

Answer

Evgeny Kluev picture Evgeny Kluev · Jan 19, 2012

For k=4, space complexity O(n), time complexity O(n2 * log(n))

Sort the array. Starting from 2 smallest and 2 largest elements, calculate all lesser sums of 2 elements (a[i] + a[j]) in the non-decreasing order and all greater sums of 2 elements (a[k] + a[l]) in the non-increasing order. Increase lesser sum if total sum is less than zero, decrease greater one if total sum is greater than zero, stop when total sum is zero (success) or a[i] + a[j] > a[k] + a[l] (failure).

The trick is to iterate through all the indexes i and j in such a way, that (a[i] + a[j]) will never decrease. And for k and l, (a[k] + a[l]) should never increase. A priority queue helps to do this:

  1. Put key=(a[i] + a[j]), value=(i = 0, j = 1) to priority queue.
  2. Pop (sum, i, j) from priority queue.
  3. Use sum in the above algorithm.
  4. Put (a[i+1] + a[j]), i+1, j and (a[i] + a[j+1]), i, j+1 to priority queue only if these elements were not already used. To keep track of used elements, maintain an array of maximal used 'j' for each 'i'. It is enough to use only values for 'j', that are greater, than 'i'.
  5. Continue from step 2.

For k>4

If space complexity is limited to O(n), I cannot find anything better, than use brute force for k-4 values and the above algorithm for the remaining 4 values. Time complexity O(n(k-2) * log(n)).

For very large k integer linear programming may give some improvement.

Update

If n is very large (on the same order as maximum integer value), it is possible to implement O(1) priority queue, improving complexities to O(n2) and O(n(k-2)).

If n >= k * INT_MAX, different algorithm with O(n) space complexity is possible. Precalculate a bitset for all possible sums of k/2 values. And use it to check sums of other k/2 values. Time complexity is O(n(ceil(k/2))).