I have two dictionaries, A and B. A has 700,000 key-value pairs and B has 560,000 key-value pairs. All key-value pairs from B are present in A, but some keys in A are repeated with different values, and some values are repeated under different keys. I would like to subtract B from A, so I can get the remaining 140,000 key-value pairs. When I subtract key-value pairs based on key identity alone, I remove, let's say, 150,000 key-value pairs because of the repeated keys. I want to subtract key-value pairs based on the identity of BOTH key AND value for each pair, so that I get 140,000. Any suggestions would be welcome.
This is an example:
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}
I DO want to get: A-B = {'10':1, '12':1, '10':2, '11':3}
I DO NOT want to get:
a) When based on keys:
{'10':1, '12':1, '10':2}
or
b) When based on values:
{'11':3}
To get items in A that are not in B, based just on key:
C = {k:v for k,v in A.items() if k not in B}
To get items in A that are not in B, based on key and value:
C = {k:v for k,v in A.items() if k not in B or v != B[k]}
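One caveat, sketched below: a Python dict literal with repeated keys keeps only the last value for each key, so the A and B from the question cannot be stored as plain dicts in the form shown. The snippet first shows that collapse and the key-and-value comprehension applied to the result. It then shows one possible way to keep repeated keys, assuming the data can be held as lists of (key, value) tuples; the names pairs_a, pairs_b, to_remove and remaining are hypothetical and not from the question.

```python
# A dict literal with repeated keys keeps only the last value per key.
A = {'10': 1, '11': 1, '12': 1, '10': 2, '11': 2, '11': 3}
B = {'11': 1, '11': 2}
print(A)  # {'10': 2, '11': 3, '12': 1}
print(B)  # {'11': 2}

# The key-and-value comprehension applied to those collapsed dicts:
# '11' is kept because its value in A (3) differs from its value in B (2).
C = {k: v for k, v in A.items() if k not in B or v != B[k]}
print(C)  # {'10': 2, '11': 3, '12': 1}

# Hypothetical alternative: hold the data as (key, value) tuples so that
# repeated keys survive.
pairs_a = [('10', 1), ('11', 1), ('12', 1), ('10', 2), ('11', 2), ('11', 3)]
pairs_b = [('11', 1), ('11', 2)]

to_remove = set(pairs_b)  # assumes keys and values are hashable
remaining = [pair for pair in pairs_a if pair not in to_remove]
print(remaining)  # [('10', 1), ('12', 1), ('10', 2), ('11', 3)]
```

Building a set from the pairs to remove keeps each membership test cheap, which should matter at the sizes mentioned in the question.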
To update A in place (as in A -= B) do:
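from collections import deque
# A zero-length deque stores nothing, so its extend() simply drains an iterator.
consume = deque(maxlen=0).extend
# Pop every key of B from A; the None default avoids a KeyError for missing keys.
consume(A.pop(key, None) for key in B)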
(Unlike using map() with A.pop, calling A.pop with a None default will not break if a key from B is not present in A. Also, unlike using all(), this iterator consumer will iterate over all values, regardless of the truthiness of the popped values.)
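Note that the in-place version above removes entries by key alone. A minimal sketch of an in-place variant that only removes an entry when both key and value match is given below; the function name subtract_in_place and the sample data are hypothetical, chosen only for illustration.

```python
def subtract_in_place(a, b):
    """Delete from `a` each key whose value in `a` equals its value in `b`."""
    missing = object()  # sentinel that cannot equal any ordinary stored value
    for key, value in b.items():
        # a.get() returns the sentinel when the key is absent, so the
        # comparison fails and nothing is deleted for that key.
        if a.get(key, missing) == value:
            del a[key]


A = {'10': 1, '11': 2, '12': 1}
B = {'11': 2, '12': 5, '99': 0}

subtract_in_place(A, B)
print(A)  # {'10': 1, '12': 1}
```

Here '11' is removed because key and value both match, '12' stays because the values differ, and '99' is ignored because it is not in A.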