Set Collection for mutable objects in Java

Jadiel de Armas picture Jadiel de Armas · Feb 20, 2015 · Viewed 7.2k times · Source

In Java, a set only checks for equality of an object with objects already in the set only at insertion time. That means that if after an object is already present in the set, it becomes equal to another object in the set, the set will keep both equal objects without complaining.

EDIT: For example, consider a simple object, and assume that hashCode and equals are defined following best practices/

class Foo {
    int foo;

    Foo(int a){ foo = a; }
    //+ equals and hashcode based on "foo"
}

Foo foo1 = new Foo(1);
Foo foo2 = new Foo(2);
Set<Foo> set = new HashSet<Foo>();
set.add(foo1);
set.add(foo2);
//Here the set has two unequal elements.
foo2.foo = 1;
//At this point, foo2 is equal to foo1, but is still in the set 
//together with foo1.

How could a set class could be designed for mutable objects? The behavior I would expected would be the following: If at any time one of the objects in the set becomes equal to another object in the set, that object is deleted from the set by the set. Is there one already? Is there a programming language that would make this easier to accomplish?

Answer

HSquirrel picture HSquirrel · Feb 20, 2015

I don't think this can be reliably done in Java in the general sense. There is no general mechanism for ensuring a certain action on mutation of an object.

There a few approaches for solutions that may be sufficient for your use case.

1. Observe elements for changes

  • You need to have control over the implementation of the types that go into the set
  • Small performance cost whenever an object in your set updates

You could try to enforce an observer like construction where your Set class is registered as an Observer to all its items. This implies you'd need to control the types of objects that can be put into the Set (only Observable objects). Furthermore, you'd need to ensure that the Observables notify the observer for every change that can affect hashcode and equals. I don't know of any class like this that exists already. Like Ray mentions below, you'll need to watch out for potential concurrency problems as well. Example:

package collectiontests.observer;

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Observable;
import java.util.Observer;
import java.util.Set;

public class ChangeDetectingSet<E extends Observable> implements Set<E>, Observer {

    private HashSet<E> innerSet;

    public void update(Observable o, Object arg) {
        innerSet.remove(o);
        innerSet.add((E)o); 
    }
    public int size() {
        return innerSet.size();
    }
    public boolean isEmpty() {
        return innerSet.isEmpty();
    }
    public boolean contains(Object o) {
        return innerSet.contains(o);
    }
    public Iterator<E> iterator() {
        return innerSet.iterator();
    }
    public Object[] toArray() {
        return innerSet.toArray();
    }
    public <T> T[] toArray(T[] a) {
        return innerSet.toArray(a);
    }
    public boolean add(E e) {
        e.addObserver(this);
        return innerSet.add(e);
    }
    public boolean remove(Object o) {
        if(o instanceof Observable){
            ((Observable) o).deleteObserver(this);
        }
        return innerSet.remove(o);
    }
    public boolean containsAll(Collection<?> c) {
        return innerSet.containsAll(c);
    }
    public boolean addAll(Collection<? extends E> c) {
        boolean result = false;
        for(E el: c){
            result = result || add(el);
        }
        return result;
    }
    public boolean retainAll(Collection<?> c) {
        Iterator<E> it = innerSet.iterator();
        E el;
        Collection<E> elementsToRemove = new ArrayList<E>();
        while(it.hasNext()){
            el = it.next();
            if(!c.contains(el)){
                elementsToRemove.add(el); //No changing the set while the iterator is going. Iterator.remove may not do what we want.
            }
        }
        for(E e: elementsToRemove){
            remove(e);
        }
        return !elementsToRemove.isEmpty(); //If it's empty there is no change and we should return false
    }
    public boolean removeAll(Collection<?> c) {
        boolean result = false;
        for(Object e: c){
            result = result || remove(e);
        }
        return result;
    }
    public void clear() {
        Iterator<E> it = innerSet.iterator();
        E el;
        while(it.hasNext()){
            el = it.next();
            el.deleteObserver(this);
        }
        innerSet.clear();
    }
}

This incurs a performance hit every time the mutable objects change.

2. Check for changes when Set is used

  • Works with any existing object you want to put into your set
  • Need to scan the entire set every time you require info about the set (performance cost may get significant if your set gets very large).

If the objects in your set change often, but the set itself is used rarely, you could try Joe's solution below. He suggests to check whether the Set is still correct whenever you call a method on it. As a bonus, his method will work on any set of objects (no having to limit it to observables). Performance-wise his method would be problematic for large sets or often used sets (as the entire set needs to be checked at every method call).

Possible implementation of Joe's method:

package collectiontests.check;

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class ListBasedSet<E> {

    private List<E> innerList;

    public ListBasedSet(){
        this(null);
    }

    public ListBasedSet(Collection<E> elements){
        if (elements != null){
            innerList = new ArrayList<E>(elements);
        } else {
            innerList = new ArrayList<E>();
        }
    }

    public void add(E e){
        innerList.add(e);
    }

    public int size(){
        return toSet().size();
    }

    public Iterator<E> iterator(){
        return toSet().iterator();
    }

    public void remove(E e){
        while(innerList.remove(e)); //Keep removing until they are all gone (so set behavior is kept)
    }

    public boolean contains(E e){
        //I think you could just do innerList.contains here as it shouldn't care about duplicates
        return innerList.contains(e);
    }

    private Set<E> toSet(){
        return new HashSet<E>(innerList);
    }
}

And another implementation of the check always method (this one based on an existing set). This is the way to go if you want to reuse the existing sets as much as possible.

package collectiontests.check;

import java.util.Collection;
import java.util.Comparator;
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

public class ChangeDetectingSet<E> extends TreeSet<E> {

    private boolean compacting = false;

    @SuppressWarnings("unchecked")
    private void compact(){
        //To avoid infinite loops, make sure we are not already compacting (compact also gets called in the methods used here)
        if(!compacting){ //Warning: this is not thread-safe
            compacting = true;
            Object[] elements = toArray();
            clear();
            for(Object element: elements){
                add((E)element); //Yes unsafe cast, but we're rather sure
            }
            compacting = false;
        }
    }
    @Override
    public boolean add(E e) {
        compact();
        return super.add(e);
    }
    @Override
    public Iterator<E> iterator() {
        compact();
        return super.iterator();
    }
    @Override
    public Iterator<E> descendingIterator() {
        compact();
        return super.descendingIterator();
    }
    @Override
    public NavigableSet<E> descendingSet() {
        compact();
        return super.descendingSet();
    }
    @Override
    public int size() {
        compact();
        return super.size();
    }
    @Override
    public boolean isEmpty() {
        compact();
        return super.isEmpty();
    }
    @Override
    public boolean contains(Object o) {
        compact();
        return super.contains(o);
    }
    @Override
    public boolean remove(Object o) {
        compact();
        return super.remove(o);
    }
    @Override
    public void clear() {
        compact();
        super.clear();
    }
    @Override
    public boolean addAll(Collection<? extends E> c) {
        compact();
        return super.addAll(c);
    }
    @Override
    public NavigableSet<E> subSet(E fromElement, boolean fromInclusive, E toElement, boolean toInclusive) {
        compact();
        return super.subSet(fromElement, fromInclusive, toElement, toInclusive);
    }
    @Override
    public NavigableSet<E> headSet(E toElement, boolean inclusive) {
        compact();
        return super.headSet(toElement, inclusive);
    }
    @Override
    public NavigableSet<E> tailSet(E fromElement, boolean inclusive) {
        compact();
        return super.tailSet(fromElement, inclusive);
    }
    @Override
    public SortedSet<E> subSet(E fromElement, E toElement) {
        compact();
        return super.subSet(fromElement, toElement);
    }
    @Override
    public SortedSet<E> headSet(E toElement) {
        compact();
        return super.headSet(toElement);
    }
    @Override
    public SortedSet<E> tailSet(E fromElement) {
        compact();
        return super.tailSet(fromElement);
    }
    @Override
    public Comparator<? super E> comparator() {
        compact();
        return super.comparator();
    }
    @Override
    public E first() {
        compact();
        return super.first();
    }
    @Override
    public E last() {
        compact();
        return super.last();
    }
    @Override
    public E lower(E e) {
        compact();
        return super.lower(e);
    }
    @Override
    public E floor(E e) {
        compact();
        return super.floor(e);
    }
    @Override
    public E ceiling(E e) {
        compact();
        return super.ceiling(e);
    }
    @Override
    public E higher(E e) {
        compact();
        return super.higher(e);
    }
    @Override
    public E pollFirst() {
        compact();
        return super.pollFirst();
    }
    @Override
    public E pollLast() {
        compact();
        return super.pollLast();
    }
    @Override
    public boolean removeAll(Collection<?> c) {
        compact();
        return super.removeAll(c);
    }
    @Override
    public Object[] toArray() {
        compact();
        return super.toArray();
    }
    @Override
    public <T> T[] toArray(T[] a) {
        compact();
        return super.toArray(a);
    }
    @Override
    public boolean containsAll(Collection<?> c) {
        compact();
        return super.containsAll(c);
    }
    @Override
    public boolean retainAll(Collection<?> c) {
        compact();
        return super.retainAll(c);
    }
    @Override
    public String toString() {
        compact();
        return super.toString();
    }
}

3. Use Scala sets

You could cheat and do away with mutable objects (in the sense that instead of mutating, you'd create a new one with one property changed) in your set. You can look at the set in Scala (I thought it was possible to call Scala from Java, but I'm not 100% sure): http://www.scala-lang.org/api/current/scala/collection/immutable/IndexedSeq.html