Reputation: 725
I've created a bipartite networkx graph from a CSV file that maps Disorders to Symptoms. So, a disorder may be linked to one or more Symptoms.
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
What I need is to find which symptoms are connected to multiple diseases and sort them according to their weight. Any suggestions?
Upvotes: 1
Views: 49
Reputation: 4892
You can use the degree of the nodes in the created graph. Every symptom with a degree greater than 1 is linked to at least two diseases.
I've added an example csv_dictionary (please supply one in your next question as a minimal reproducible example) and collected all symptoms into a set while building the graph. You could also think about storing this information as a node attribute in the graph.
import networkx as nx

csv_dictionary = {"a": ["A"], "b": ["B"], "c": ["A", "C"], "d": ["D"], "e": ["E", "B"], "f": ["F"], "g": ["F"], "h": ["F"]}

G = nx.Graph()
all_symptoms = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)

# Keep only the symptoms that are linked to more than one disorder.
symptoms_with_multiple_diseases = [symptom for symptom in all_symptoms if G.degree(symptom) > 1]
print(symptoms_with_multiple_diseases)
# e.g. ['B', 'F', 'A'] (set iteration order is not guaranteed)

# Sort ascending by degree, i.e. by the number of linked disorders.
sorted_symptoms = sorted(symptoms_with_multiple_diseases, key=lambda symptom: G.degree(symptom))
print(sorted_symptoms)
# e.g. ['B', 'A', 'F'] ('A' and 'B' both have degree 2, so their relative order may vary)
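The node-attribute idea mentioned above could look roughly like the sketch below. The attribute name `kind` and its values `"disorder"` and `"symptom"` are assumptions chosen for illustration, not something networkx prescribes. The sketch also sorts in descending order and breaks ties by name so that the result is deterministic; treating "weight" as the node degree is an interpretation of the question.

```python
import networkx as nx

csv_dictionary = {"a": ["A"], "b": ["B"], "c": ["A", "C"], "d": ["D"],
                  "e": ["E", "B"], "f": ["F"], "g": ["F"], "h": ["F"]}

G = nx.Graph()
for disorder, symptoms in csv_dictionary.items():
    # "kind" is a hypothetical attribute name used to tell the two node sets apart.
    G.add_node(disorder, kind="disorder")
    for symptom in symptoms:
        G.add_node(symptom, kind="symptom")
        G.add_edge(disorder, symptom)

# Map each symptom with more than one linked disorder to its degree.
shared = {
    node: G.degree(node)
    for node, kind in G.nodes(data="kind")
    if kind == "symptom" and G.degree(node) > 1
}

# Highest degree first; ties are broken alphabetically for a stable result.
ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
# [('F', 3), ('A', 2), ('B', 2)]
```

With the attribute stored on the nodes, the separate all_symptoms set is no longer needed, and the symptom nodes can be recovered from the graph alone.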
Upvotes: 1