Reputation: 725
At the moment, I have created a bipartite networkx graph that maps Disorders to Symptoms, so a Disorder may be linked to one or more Symptoms. I also have some basic statistics, such as Symptoms with at least one Disorder.
import networkx as nx
csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"], "Dd": ["D"], "De": ["E", "B"], "Df":["F"], "Dg":["F"], "Dh":["F"]}
G = nx.Graph()
all_symptoms = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)
symptoms_with_multiple_diseases = [symptom for symptom in all_symptoms if G.degree(symptom) > 1]
sorted_symptoms = sorted(symptoms_with_multiple_diseases,
                         key=lambda symptom: G.degree(symptom))
What I need is to find Disorders that share at least two Symptoms, that is, pairs of Disorders that have two or more Symptoms in common. I've done some research and I think I should add weights to my edges based on how they connect, but I cannot wrap my head around it.
So, in the above example, Da and Dc share two Symptoms (A and C).
Upvotes: 1
Views: 370
Reputation: 88236
You could iterate over the length-2 combinations of the disorder nodes with a degree of at least 2, find the nx.common_neighbors of each combination, and keep only those pairs that share at least 2 neighbours.
So start instead by keeping track of all disorders too:
all_symptoms = set()
all_disorders = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)
        all_disorders.add(disorder)
Check which disorders have a degree of at least 2, since a disorder with a single symptom cannot share two symptoms with another:
disorders_with_multiple_symptoms = [disorder for disorder in all_disorders
                                    if G.degree(disorder) > 1]
And then iterate over all length-2 combinations of all_disorders:
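from itertools import combinations

common_symptoms = dict()
# Sorting makes the pair order and the printed output deterministic
for nodes in combinations(sorted(all_disorders), r=2):
    cn = sorted(nx.common_neighbors(G, *nodes))
    # Keep only the pairs that share at least 2 symptoms
    if len(cn) > 1:
        common_symptoms[nodes] = cn
print(common_symptoms)
# {('Da', 'Dc'): ['A', 'C']}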
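The edge-weight idea from the question can also be sketched with a weighted projection of the bipartite graph onto the disorder nodes. This is only an alternative sketch, not something the approach above requires: it assumes networkx's bipartite.weighted_projected_graph, which by default sets each edge's weight to the number of neighbours the two nodes share. The names projected and shared_pairs are illustrative.

```python
import networkx as nx
from networkx.algorithms import bipartite

csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"],
                  "Dd": ["D"], "De": ["E", "B"], "Df": ["F"], "Dg": ["F"], "Dh": ["F"]}

# Rebuild the bipartite Disorder-Symptom graph from the question
G = nx.Graph()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)

# Project onto the disorder nodes; each edge weight counts the shared symptoms
projected = bipartite.weighted_projected_graph(G, list(csv_dictionary))

# Keep the disorder pairs whose weight (number of shared symptoms) is at least 2
shared_pairs = sorted(
    tuple(sorted((u, v)))
    for u, v, weight in projected.edges(data="weight")
    if weight >= 2
)
print(shared_pairs)
# [('Da', 'Dc')]
```

The projection stores the count of shared symptoms on every disorder pair, so the threshold can be changed later without recomputing the common neighbours.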
Upvotes: 1