Reasons for using a Bag in Java

Flika205 picture Flika205 · Apr 15, 2017 · Viewed 8.3k times · Source

I am currently studying about Algorithms & Data Structures and while I was reading over the Book of Algorithms 4th edition, I discovered the Bag data-structure together with the Stack and Queue. After reading the the explanation of it, it is still unclear to me what would I prefer using a Bag (which has no remove() method) over other data-structures such as Stack, Queue, LinkedList or a Set? As far as I can understand from the Book, the implementation of a Bag, is the same as for a Stack, just replacing the name of push() to add() and remove the pop() method.

So the idea of a Bag is basically having the ability to collect items and then iterate through the collected items, check if a bag is empty and find the number of items in it. But under which circumstances I would better using a Bag over one of the mentioned above Collections? And why a Bag doesn't have a remove() method basically? is there a specific reason for it?

Thanks in advance.

Answer

matoni picture matoni · Apr 15, 2017

Stack is ADT of the collection of elements with specific remove order = LIFO (last-in-first-out), allows duplicates,

Queue is ADT of the collection of elements with specific remove order = FIFO (first-in-first-out), allows duplicates,

LinkedList is implementation of the list,

Set is ADT of the collection of elements which disallows duplicates,

Bag is ADT of the collection of elements which allows duplicates.

In general, anything that holds an elements is Collection. Any collection which allows duplicates is Bag, otherwise it is Set. Any bag which access elements via index is List. Bag which appends new element after the last one and has a method to remove element from the head (first index) is Queue. Bag which appends new element after the last one and has a method to remove element from the tail (last index) is Stack.

Example: In Java, LinkedList is a collection, bag, list, queue and also you can work with it as it was a stack since it support stack operations (add~addLast~push, peekLast, removeLast~pop), so you can call it also stack. The reason, why it does not implement Stack interface is, that peek method is reserved by Queue implementation which retrieves the head of the list (first element). Therefore in case of LinkedList, the "stack methods" are derived from Deque.

Whether Bag contains remove(Object) or not may depend on the implementation e. g. you can implement your own Bag type which supports this operation. Also you can implement get(int) operation to access object on specified index. Time complexity of the get(int) would depend on your implementation e. g. one can implement Bag via linked-list so the complexity would be at average O(n/2), other one via resizable array (array-list) with direct access to the element via index, so the complexity would be O(1).

But the main idea of the Bag is, that it allows duplicates and iteration through this collection. Whether it supports another useful operations depends on implementator's design decision.

Which one of the collection type to use dependes on your needs, if duplicates are not desired, you would use Set instead of Bag. Moreover, if you care about remove order you would pick Stack or Queue which are basically Bags with specific remove order. You can think of Bag as super-type of the Stack and Queue which extends its api by specific operations.

Most of the time, you just need to collect objects and process them in some way (iteration + element processing). So you will use the most simple Bag implementation which is one directional linked-list.