I've written the following code (+ demo) to remove entries from a HashMap
based on value. It works, but I feel like I'm struggling against the borrow-checker with the use of:
clone()
to avoid two references to the same set of keyslet tmp =
binding to increase the lifetime of my temp valueuse std::collections::HashMap;
fn strip_empties(x: &mut HashMap<String, i8>) {
let tmp = x.clone();
let empties = tmp
.iter()
.filter(|&(_, &v)| v == 0)
.map(|(k, _)| k);
for k in empties { x.remove(k); }
}
fn main() {
let mut x: HashMap<String, i8> = HashMap::new();
x.insert("a".to_string(), 1);
x.insert("b".to_string(), 0);
strip_empties(&mut x);
println!("Now down to {:?}" , x);
}
Is there a cleaner, more idiomatic way to accomplish this?
Why the mutation of the HashMap? Just create a new one (all hail immutability):
fn strip_empties(x: HashMap<String, i8>) -> HashMap<String, i8> {
return x.into_iter()
.filter(|&(_, v)| v != 0)
.collect();
}
Of course you have to consider your use case. The best approach may vary if you have a large HashMap or filter many/few elements. Lets compare the implementations.
use std::collections::HashMap;
fn strip_empties_mutable(x: &mut HashMap<String, i8>) {
let empties: Vec<_> = x
.iter()
.filter(|&(_, &v)| v == 0)
.map(|(k, _)| k.clone())
.collect();
for empty in empties { x.remove(&empty); }
}
fn strip_empties_immutable(x: HashMap<String, i8>) -> HashMap<String, i8> {
return x.into_iter()
.filter(|&(_, v)| v != 0)
.collect();
}
fn build_hashmap() -> HashMap<String, i8> {
let mut map = HashMap::new();
for chr in "abcdefghijklmnopqrstuvmxyz".chars() {
map.insert(chr.to_string(), chr as i8 % 2);
}
return map;
}
#[cfg(mutable)]
fn main() {
let mut map = build_hashmap();
strip_empties_mutable(&mut map);
println!("Now down to {:?}" , map);
}
#[cfg(immutable)]
fn main() {
let mut map = build_hashmap();
map = strip_empties_immutable(map);
println!("Now down to {:?}" , map);
}
Save this as hashmap.rs
and run:
rustc --cfg mutable -O -o mutable hashmap.rs
rustc --cfg immutable -O -o immutable hashmap.rs
If we look at the different runtimes (e.g. using perf stat -r 1000 ./XXX
) we
don't see significant differences.
But lets look at the number of allocations:
valgrind --tool=callgrind --callgrind-out-file=callgrind_mutable ./mutable
valgrind --tool=callgrind --callgrind-out-file=callgrind_immutable ./immutable
callgrind_annotate callgrind_mutable | grep 'je_.*alloc'
callgrind_annotate callgrind_immutable | grep 'je_.*alloc'
callgrind_mutable:
7,000 ???:je_arena_malloc_small [$HOME/hashmap/mutable]
6,457 ???:je_arena_dalloc_bin_locked [$HOME/hashmap/mutable]
4,800 ???:je_mallocx [$HOME/hashmap/mutable]
3,903 ???:je_sdallocx [$HOME/hashmap/mutable]
2,520 ???:je_arena_dalloc_small [$HOME/hashmap/mutable]
502 ???:je_rallocx [$HOME/hashmap/mutable]
304 ???:je_arena_ralloc [$HOME/hashmap/mutable]
callgrind_immutable:
5,114 ???:je_arena_malloc_small [$HOME/hashmap/immutable]
4,725 ???:je_arena_dalloc_bin_locked [$HOME/hashmap/immutable]
3,669 ???:je_mallocx [$HOME/hashmap/immutable]
2,980 ???:je_sdallocx [$HOME/hashmap/immutable]
1,845 ???:je_arena_dalloc_small [$HOME/hashmap/immutable]
158 ???:je_rallocx [$HOME/hashmap/immutable]
And this is not very suprising as the clone()
calls in the mutable approach
allocates memory aswell. Of course the mutable version might yield a HashMap
with a larger capacity.