I'm new to NetworkX and am using it for a social network analysis query. By "query" I mean selecting or creating subgraphs by attributes of both edges and nodes, where the edges form a path and the nodes carry attributes. The graph is a MultiDiGraph of the form
import networkx as nx
G2 = nx.MultiDiGraph()
# node attributes are passed as keyword arguments (a positional dict is rejected by NetworkX 2.x)
G2.add_node("UserA", type="Cat")
G2.add_node("UserB", type="Dog")
G2.add_node("UserC", type="Mouse")
G2.add_node("Likes", type="Feeling")
G2.add_node("Hates", type="Feeling")
# each statement is a two-edge path: user -> feeling -> user
G2.add_edge("UserA", "Hates", statementid="1")
G2.add_edge("Hates", "UserB", statementid="1")
G2.add_edge("UserC", "Hates", statementid="2")
G2.add_edge("Hates", "UserA", statementid="2")
G2.add_edge("UserB", "Hates", statementid="3")
G2.add_edge("Hates", "UserA", statementid="3")
G2.add_edge("UserC", "Likes", statementid="3")
G2.add_edge("Likes", "UserB", statementid="3")
Queried with
for node, data in G2.nodes(data=True):
    if data['type'] == "Cat":
        # get all edges out from these nodes,
        # then recursively follow them using a filter for a specific statementid,
        # or get all edges with a specific statementid and
        # look for nodes with a "type" attribute of "Cat"
        pass
Is there a better way to query? Or is it best practice to create custom iterations to create subgraphs?
Alternatively (and as a separate question), the graph could be simplified as shown below. I'm not using this form because the "hates" type objects will have predecessors. Would it make querying simpler? It seems easier to iterate over nodes.
G3 = nx.MultiDiGraph()
G3.add_node("UserA", type="Cat")
G3.add_node("UserB", type="Dog")
G3.add_edge("UserA", "UserB", statementid="1", label="hates")
G3.add_edge("UserA", "UserB", statementid="2", label="hates")
Other notes:
Does add_path add an identifier to the path it creates? For comparison, igraph offers g.vs.select() for this kind of query.
It's pretty straightforward to write a one-liner that makes a list or generator of nodes with a specific property (generators shown here):
import networkx as nx
G = nx.Graph()
G.add_node(1, label='one')
G.add_node(2, label='fish')
G.add_node(3, label='two')
G.add_node(4, label='fish')
# method 1
fish = (n for n in G if G.nodes[n]['label']=='fish')
# method 2
fish2 = (n for n,d in G.nodes(data=True) if d['label']=='fish')
print(list(fish))
print(list(fish2))
G.add_edge(1,2,color='red')
G.add_edge(2,3,color='blue')
red = ((u,v) for u,v,d in G.edges(data=True) if d['color']=='red')
print(list(red))
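The same comprehension pattern carries over to the MultiDiGraph in the question. The following is only a sketch, assuming NetworkX 2.0 or later (for `edge_subgraph`) and a trimmed-down copy of the question's `G2`; it pulls out the edges of one statement, builds a subgraph from them, and then filters that subgraph's nodes by type.

```python
import networkx as nx

# A reduced version of the question's graph (two statements only).
G2 = nx.MultiDiGraph()
G2.add_node("UserA", type="Cat")
G2.add_node("UserB", type="Dog")
G2.add_node("Hates", type="Feeling")
G2.add_edge("UserA", "Hates", statementid="1")
G2.add_edge("Hates", "UserB", statementid="1")
G2.add_edge("UserB", "Hates", statementid="3")
G2.add_edge("Hates", "UserA", statementid="3")

# Select the edges of one statement. keys=True is needed in a multigraph
# so that parallel edges between the same two nodes can be told apart.
statement_edges = [
    (u, v, k)
    for u, v, k, d in G2.edges(keys=True, data=True)
    if d.get("statementid") == "1"
]

# edge_subgraph returns a read-only view; copy() makes an independent graph.
S = G2.edge_subgraph(statement_edges).copy()

# Node attributes are carried over, so the subgraph can be filtered again.
cats = [n for n, d in S.nodes(data=True) if d.get("type") == "Cat"]

print(sorted(S.nodes()))
print(cats)
```

Using `d.get(...)` instead of `d[...]` avoids a `KeyError` when some nodes or edges lack the attribute.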
If your graph is large and fixed and you want to do fast lookups you could make a "reverse dictionary" of the attributes like this,
labels = {}
for n, d in G.nodes(data=True):
    label = d['label']
    # start an empty list the first time a label is seen
    labels.setdefault(label, []).append(n)
print(labels)
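Once such an index exists, a lookup is a plain dictionary access, and the result can be handed straight to `G.subgraph`. Here is a minimal sketch of that idea; the helper name `build_index` and the extra edge between nodes 2 and 4 are made up for illustration, and the index has to be rebuilt whenever the graph's attributes change.

```python
from collections import defaultdict

import networkx as nx

G = nx.Graph()
G.add_node(1, label='one')
G.add_node(2, label='fish')
G.add_node(3, label='two')
G.add_node(4, label='fish')
G.add_edge(2, 4)  # illustrative edge, not part of the example above


def build_index(graph, attr):
    """Map each value of node attribute `attr` to the list of nodes having it.

    Hypothetical helper; nodes without the attribute are skipped.
    """
    index = defaultdict(list)
    for n, d in graph.nodes(data=True):
        if attr in d:
            index[d[attr]].append(n)
    return index


labels = build_index(G, 'label')

# Dictionary lookup instead of a scan over all nodes.
fish_nodes = labels['fish']

# Induced subgraph on the selected nodes (a read-only view of G).
fish_graph = G.subgraph(fish_nodes)

print(fish_nodes)
print(list(fish_graph.edges()))
```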