Removing entries from a HashMap based on value

Bosh picture Bosh · Mar 7, 2015 · Viewed 7.1k times · Source

I've written the following code (+ demo) to remove entries from a HashMap based on value. It works, but I feel like I'm struggling against the borrow-checker with the use of:

  • clone() to avoid two references to the same set of keys
  • an extra let tmp = binding to increase the lifetime of my temp value

use std::collections::HashMap;

fn strip_empties(x: &mut HashMap<String, i8>) {
    let tmp = x.clone();
    let empties = tmp
         .iter()
         .filter(|&(_, &v)| v == 0)
         .map(|(k, _)| k);

    for k in empties { x.remove(k); }
}

fn main() {
    let mut x: HashMap<String, i8> = HashMap::new();
    x.insert("a".to_string(), 1);
    x.insert("b".to_string(), 0);
    strip_empties(&mut x);

    println!("Now down to {:?}" , x);
}

Is there a cleaner, more idiomatic way to accomplish this?

Answer

user4316209 picture user4316209 · Mar 7, 2015

Why the mutation of the HashMap? Just create a new one (all hail immutability):

fn strip_empties(x: HashMap<String, i8>) -> HashMap<String, i8> {
    return x.into_iter()
        .filter(|&(_, v)| v != 0)
        .collect();
}

Playpen


Edit: Why this is feasible.

Of course you have to consider your use case. The best approach may vary if you have a large HashMap or filter many/few elements. Lets compare the implementations.

use std::collections::HashMap;

fn strip_empties_mutable(x: &mut HashMap<String, i8>) {
    let empties: Vec<_> = x
        .iter()
        .filter(|&(_, &v)| v == 0)
        .map(|(k, _)| k.clone())
        .collect();
    for empty in empties { x.remove(&empty); }
}

fn strip_empties_immutable(x: HashMap<String, i8>) -> HashMap<String, i8> {
    return x.into_iter()
        .filter(|&(_, v)| v != 0)
        .collect();
}

fn build_hashmap() -> HashMap<String, i8> {
    let mut map = HashMap::new();
    for chr in "abcdefghijklmnopqrstuvmxyz".chars() {
        map.insert(chr.to_string(), chr as i8 % 2);
    }
    return map;
}

#[cfg(mutable)]
fn main() {
    let mut map = build_hashmap();
    strip_empties_mutable(&mut map);
    println!("Now down to {:?}" , map);
}

#[cfg(immutable)]
fn main() {
    let mut map = build_hashmap();
    map = strip_empties_immutable(map);
    println!("Now down to {:?}" , map);
}

Save this as hashmap.rs and run:

rustc --cfg mutable -O -o mutable hashmap.rs
rustc --cfg immutable -O -o immutable hashmap.rs

If we look at the different runtimes (e.g. using perf stat -r 1000 ./XXX) we don't see significant differences.

But lets look at the number of allocations:

valgrind --tool=callgrind --callgrind-out-file=callgrind_mutable ./mutable
valgrind --tool=callgrind --callgrind-out-file=callgrind_immutable ./immutable
callgrind_annotate callgrind_mutable | grep 'je_.*alloc'
callgrind_annotate callgrind_immutable | grep 'je_.*alloc'
  • callgrind_mutable:

    7,000  ???:je_arena_malloc_small [$HOME/hashmap/mutable]
    6,457  ???:je_arena_dalloc_bin_locked [$HOME/hashmap/mutable]
    4,800  ???:je_mallocx [$HOME/hashmap/mutable]
    3,903  ???:je_sdallocx [$HOME/hashmap/mutable]
    2,520  ???:je_arena_dalloc_small [$HOME/hashmap/mutable]
      502  ???:je_rallocx [$HOME/hashmap/mutable]
      304  ???:je_arena_ralloc [$HOME/hashmap/mutable]
    
  • callgrind_immutable:

    5,114  ???:je_arena_malloc_small [$HOME/hashmap/immutable]
    4,725  ???:je_arena_dalloc_bin_locked [$HOME/hashmap/immutable]
    3,669  ???:je_mallocx [$HOME/hashmap/immutable]
    2,980  ???:je_sdallocx [$HOME/hashmap/immutable]
    1,845  ???:je_arena_dalloc_small [$HOME/hashmap/immutable]
      158  ???:je_rallocx [$HOME/hashmap/immutable]
    

And this is not very suprising as the clone() calls in the mutable approach allocates memory aswell. Of course the mutable version might yield a HashMap with a larger capacity.