I'm dealing with polygonal data in realtime here, but the problems quite simple. I have a huge list containing thousands of sets of polygon Indecies (Integers) and I need to simplify the list as "fast" as possible into a list of sets of "connected" Indecies. i.e. Any sets containing integers that are also in another set become one set in the result. I've read several possible solutions involving sets & graphs etc. All i'm after are a final list of sets which had any degree of commonality.
I'm dealing with lots of data here, but for simplicities sake here's some sample data:
setA = set([0,1,2])
setB = set([6,7,8,9])
setC = set([4,5,6])
setD = set([3,4,5,0])
setE = set([10,11,12])
setF = set([11,13,14,15])
setG = set([16,17,18,19])
listOfSets = [setA,setB,setC,setD,setE,setF,setG]
In this case I'm after a list with a result like this, although ordering is irrelevant:
connectedFacesListOfSets = [ set([0,1,2,3,4,5,6,7,8,9]), set([10,11,12,13,14,15]), set([16,17,18,19])]
I've looked for similar solutions, but the one with the highest votes gave incorrect results on my large test data.
It's hard to tell the performance without a sufficiently large set, but here is some basic code to start from:
while True:
merged_one = False
supersets = [listOfSets[0]]
for s in listOfSets[1:]:
in_super_set = False
for ss in supersets:
if s & ss:
ss |= s
merged_one = True
in_super_set = True
break
if not in_super_set:
supersets.append(s)
print supersets
if not merged_one:
break
listOfSets = supersets
This works in 3 iterations on the provided data. And the output is as follows:
[set([0, 1, 2, 3, 4, 5]), set([4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]