I am currently working on the Queues assignment from Princeton's Algorithms Part I. One of the assignments is to implement a randomized queue. This is a question regarding the implementation and tradeoffs of using different data structures.
Question:
A randomized queue is similar to a stack or queue, except that the item removed is chosen uniformly at random from items in the data structure. Create a generic data type RandomizedQueue that implements the following API:
public class RandomizedQueue<Item> implements Iterable<Item> {
public RandomizedQueue() // construct an empty randomized queue
public boolean isEmpty() // is the queue empty?
public int size() // return the number of items on the queue
public void enqueue(Item item) // add the item
public Item dequeue() // remove and return a random item
public Item sample() // return (but do not remove) a random item
public Iterator<Item> iterator() // return an independent iterator over items in random order
public static void main(String[] args) // unit testing
}
The catch here is to implement the dequeue operation and the iterator since the dequeue removes and returns a random element and the iterator iterates over the queue in a random order.
1. Array Implementation:
The primary implementation I was considering is an array implementation. This will be identical to the implementation of an an array queue except the randomness.
Query 1.1: For the dequeue operation, I simply select a number randomly from the size of the array and return that item, and then move the last item in the array to the position of the returned item.
However, this approach changes the order of the queue. In this case it does not matter since I am dequeuing in random order. However, I was wondering if there was an time/memory efficient way to dequeue a random element from the array while preserving the order of the queue without having to create a new array and transfer all the data to it.
// Current code for dequeue - changes the order of the array after dequeue
private int[] queue; // array queue
private int N; // number of items in the queue
public Item dequeue() {
if (isEmpty()) throw NoSuchElementException("Queue is empty");
int randomIndex = StdRandom.uniform(N);
Item temp = queue[randomIndex]
if (randomIndex == N - 1) {
queue[randomIndex] = null; // to avoid loitering
} else {
queue[randomIndex] = queue[N - 1];
queue[randomIndex] = null;
}
// code to resize array
N--;
return temp;
}
Query 1.2: For the iterator to meet the requirement of returning elements randomly, I create a new array with all the indices of the queue, then shuffle the array with a Knuth shuffle operation and return the elements at those particular indices in the queue. However, this involves creating a new array equal to the length of the queue. Again, I am sure I am missing a more efficient method.
2. Inner Class Implementation
The second implementation involves an inner node class.
public class RandomizedQueue<Item> {
private static class Node<Item> {
Item item;
Node<Item> next;
Node<Item> previous;
}
}
Query 2.1. In this case I understand how to perform the dequeue operation efficiently: Return a random node and change the references for the adjacent nodes.
However, I am confounded by how to return a iterator that returns the nodes in random order without having to create a whole new queue with nodes attached in random order.
Further, what are the benefits of using such a data structure over an array, other than readability and ease of implementation?
This post is kind of long. I appreciate that you guys have taken the time to read my question and help me out. Thanks!
In your array implementation, your Query 1.1 seems to be the best way to do things. The only other way to remove a random element would be to move everything up to fill its spot. So if you had [1,2,3,4,5]
and you removed 2
, your code would move items 3, 4, and 5 up and you'd decrease the count. That will take, on average n/2 item moves for every removal. So removal is O(n). Bad.
If you won't be adding and removing items while iterating, then just use a Fisher-Yates shuffle on the existing array, and start returning items from front to back. There's no reason to make a copy. It really depends on your usage pattern. If you envision adding and removing items from the queue while you're iterating, then things get wonky if you don't make a copy.
With the linked list approach, the random dequeue operation is difficult to implement efficiently because in order to get to a random item, you have to traverse the list from the front. So if you have 100 items in the queue and you want to remove the 85th item, you'll have to start at the front and follow 85 links before you get to the one you want to remove. Since you're using a double-linked list, you could potentially cut that time in half by counting backwards from the end when the item to be removed is beyond the halfway point, but it's still horribly inefficient when the number of items in your queue is large. Imagine removing the 500,000th item from a queue of one million items.
For the random iterator, you can shuffle the linked list in-place before you start iterating. That takes O(n log n) time, but just O(1) extra space. Again, you have the problem of iterating at the same time you're adding or removing. If you want that ability, then you need to make a copy.