Why can't I remove a string from a std::set with std::remove_if?

Benj · Jun 21, 2012

Possible Duplicate:
remove_if equivalent for std::map

I have a set of strings:

set <wstring> strings;
// ...

I wish to remove strings according to a predicate, e.g.:

std::remove_if ( strings.begin(), strings.end(), []( const wstring &s ) -> bool { return s == L"matching"; });

When I attempt this, I get the following compiler error:

c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1840): error C2678: binary '=' : no operator found which takes a left-hand operand of type 'const std::basic_string<_Elem,_Traits,_Ax>' 

The error appears to suggest that std::string doesn't have a by-value copy constructor (which would be illegal). Is it somehow bad to use std::remove_if with std::set? Should I be doing something else instead, such as several iterations of set::find() followed by set::erase()?

Answer

Potatoswatter · Jun 21, 2012

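std::remove_if (or std::remove) works by reassigning the values of the members of the range. The elements of a std::set are const, which is why the compiler cannot find an operator= that takes a const string on the left-hand side. The algorithm also doesn't understand how std::set organizes data, or how to remove a node from its internal tree data structure. Indeed, it's impossible to do so using only iterators to nodes, without having the set object itself.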
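The read-only nature of set elements can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the typedef name set_ref_t is made up for illustration.

```cpp
#include <iterator>
#include <set>
#include <string>
#include <type_traits>

// The type produced by dereferencing a std::set<std::wstring> iterator.
// (set_ref_t is a hypothetical name used only in this sketch.)
typedef std::iterator_traits<std::set<std::wstring>::iterator>::reference set_ref_t;

// Set elements are read-only even through a non-const iterator, because
// changing a key in place could break the ordering of the tree.
static_assert(std::is_const<std::remove_reference<set_ref_t>::type>::value,
              "set elements are const");

// std::remove_if has to assign through the iterator, and a const
// std::wstring cannot be assigned to.
static_assert(!std::is_assignable<set_ref_t, std::wstring>::value,
              "cannot assign to a set element");
```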

The standard algorithms are designed to have transparent (or at least consistently easy-to-remember) computational complexities. A function to selectively remove elements from a set would be O(N log N), due to the need to rebalance the tree, which is no better than a loop calling my_set.erase(). So the standard doesn't provide one, and that loop is what you need to write.

On the other hand, a naively hand-coded loop to remove items from a vector one-by-one would be O(N^2), whereas std::remove_if is O(N). So the library does provide a tangible benefit in that case.
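For comparison, here is a sketch of the usual single-pass approach for a vector, often called the erase-remove idiom. The helper name erase_from_vector_if and the sample strings are assumptions made for illustration; the lambda requires C++11.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: erase every element of v for which pred returns true.
template <typename T, typename Pred>
void erase_from_vector_if(std::vector<T>& v, Pred pred) {
    // remove_if moves the kept elements to the front by assignment, in one
    // pass, and returns the new logical end; erase then trims the tail.
    v.erase(std::remove_if(v.begin(), v.end(), pred), v.end());
}

// Usage sketch with the predicate from the question.
inline void vector_example() {
    std::vector<std::wstring> v;
    v.push_back(L"keep");
    v.push_back(L"matching");
    v.push_back(L"also keep");
    erase_from_vector_if(v, [](const std::wstring& s) { return s == L"matching"; });
    assert(v.size() == 2 && v[0] == L"keep" && v[1] == L"also keep");
}
```

This works for a vector because its elements are assignable, which is exactly what a set does not allow.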

A typical loop (C++03 style):

for ( set_t::iterator i = my_set.begin(); i != my_set.end(); ) {
    if ( condition ) { // condition stands for any test on *i
        my_set.erase( i ++ ); // strict C++03
        // i = my_set.erase( i ); // C++11; many C++03 implementations also accept it
    } else {
        ++ i; // do not include ++ i inside for ( )
    }
}
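Applied to the set of strings from the question, the loop might look like the sketch below. The helper name erase_from_set_if and the sample strings are made up for illustration; the loop itself is the strict C++03 form, while the lambda in the usage function needs C++11.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: erase every element of s for which pred returns true.
template <typename T, typename Pred>
void erase_from_set_if(std::set<T>& s, Pred pred) {
    for (typename std::set<T>::iterator i = s.begin(); i != s.end(); ) {
        if (pred(*i)) {
            s.erase(i++);  // i already refers to the next element when erase runs
        } else {
            ++i;
        }
    }
}

// Usage sketch with the predicate from the question.
inline void set_example() {
    std::set<std::wstring> strings;
    strings.insert(L"matching");
    strings.insert(L"other");
    strings.insert(L"another");
    erase_from_set_if(strings, [](const std::wstring& s) { return s == L"matching"; });
    assert(strings.size() == 2);
    assert(strings.count(L"matching") == 0);
}
```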

Edit (4 years later!): i ++ looks suspicious there. What if erase invalidates i before the post-increment operator can update it? This is fine, though, because it's an overloaded operator++ rather than the built-in operator. Being a function call, it finishes before erase is called: it safely updates i in-place and then returns a copy of its original value, and that copy is what erase receives.
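The same sequence can be spelled out in separate steps. This is a small sketch, assuming a set with at least two elements; the function name value_after_erasing_first is hypothetical.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: erases the first element of s and returns the value
// the advanced iterator refers to afterwards.
// Assumes s holds at least two elements.
inline int value_after_erasing_first(std::set<int>& s) {
    std::set<int>::iterator i = s.begin();
    std::set<int>::iterator old = i++;  // old is what erase( i ++ ) would receive
    s.erase(old);                       // invalidates only iterators to the erased element
    return *i;                          // i was advanced before the erase, so it is still valid
}

inline void post_increment_example() {
    std::set<int> s;
    s.insert(1);
    s.insert(2);
    s.insert(3);
    assert(value_after_erasing_first(s) == 2);
    assert(s.size() == 2);
}
```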