Finding prime numbers with the Sieve of Eratosthenes (Originally: Is there a better way to prepare this array?)

eleven81 picture eleven81 · Feb 25, 2009 · Viewed 41.6k times · Source

Note: Version 2, below, uses the Sieve of Eratosthenes. There are several answers that helped with what I originally asked. I have chosen the Sieve of Eratosthenes method, implemented it, and changed the question title and tags appropriately. Thanks to everyone who helped!

Introduction

I wrote this fancy little method that generates an array of int containing the prime numbers less than the specified upper bound. It works very well, but I have a concern.

The Method

private static int [] generatePrimes(int max) {
    int [] temp = new int [max];
    temp [0] = 2;
    int index = 1;
    int prime = 1;
    boolean isPrime = false;
    while((prime += 2) <= max) {
        isPrime = true;
        for(int i = 0; i < index; i++) {
            if(prime % temp [i] == 0) {
                isPrime = false;
                break;
            }
        }
        if(isPrime) {
            temp [index++] = prime;
        }
    }
    int [] primes = new int [index];
    while(--index >= 0) {
        primes [index] = temp [index];
    }
    return primes;
}

My Concern

My concern is that I am creating an array that is far too large for the final number of elements the method will return. The trouble is that I don't know of a good way to correctly guess the number of prime numbers less than a specified number.

Focus

This is how the program uses the arrays. This is what I want to improve upon.

  1. I create a temporary array that is large enough to hold every number less than the limit.
  2. I generate the prime numbers, while keeping count of how many I have generated.
  3. I make a new array that is the right dimension to hold just the prime numbers.
  4. I copy each prime number from the huge array to the array of the correct dimension.
  5. I return the array of the correct dimension that holds just the prime numbers I generated.

Questions

  1. Can I copy the whole chunk (at once) of temp[] that has nonzero elements to primes[] without having to iterate through both arrays and copy the elements one by one?
  2. Are there any data structures that behave like an array of primitives that can grow as elements are added, rather than requiring a dimension upon instantiation? What is the performance penalty compared to using an array of primitives?

Version 2 (thanks to Jon Skeet):

private static int [] generatePrimes(int max) {
    int [] temp = new int [max];
    temp [0] = 2;
    int index = 1;
    int prime = 1;
    boolean isPrime = false;
    while((prime += 2) <= max) {
        isPrime = true;
        for(int i = 0; i < index; i++) {
            if(prime % temp [i] == 0) {
                isPrime = false;
                break;
            }
        }
        if(isPrime) {
            temp [index++] = prime;
        }
    }
    return Arrays.copyOfRange(temp, 0, index);
}

Version 3 (thanks to Paul Tomblin) which uses the Sieve of Erastosthenes:

private static int [] generatePrimes(int max) {
    boolean[] isComposite = new boolean[max + 1];
    for (int i = 2; i * i <= max; i++) {
        if (!isComposite [i]) {
            for (int j = i; i * j <= max; j++) {
                isComposite [i*j] = true;
            }
        }
    }
    int numPrimes = 0;
    for (int i = 2; i <= max; i++) {
        if (!isComposite [i]) numPrimes++;
    }
    int [] primes = new int [numPrimes];
    int index = 0;
    for (int i = 2; i <= max; i++) {
        if (!isComposite [i]) primes [index++] = i;
    }
    return primes;
}

Answer

Paul Tomblin picture Paul Tomblin · Feb 25, 2009

Your method of finding primes, by comparing every single element of the array with every possible factor is hideously inefficient. You can improve it immensely by doing a Sieve of Eratosthenes over the entire array at once. Besides doing far fewer comparisons, it also uses addition rather than division. Division is way slower.