Cumatru

Reputation: 725

Find nodes with common connections

At the moment, I have created a bipartite networkx graph that maps Disorders to Symptoms, so a Disorder may be linked to one or more Symptoms. I also have some basic statistics, such as the Symptoms linked to more than one Disorder.

import networkx as nx

csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"], "Dd": ["D"], "De": ["E", "B"], "Df":["F"], "Dg":["F"], "Dh":["F"]}

G = nx.Graph()

all_symptoms = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)

symptoms_with_multiple_diseases = [symptom for symptom in all_symptoms if G.degree(symptom) > 1]

sorted_symptoms = sorted(symptoms_with_multiple_diseases,
                         key=lambda symptom: G.degree(symptom))

What I need is to find pairs of Disorders that share at least two Symptoms, that is, Disorders that have two or more Symptoms in common with each other. I've done some research and I think I should add weights to my edges, based on how they connect, but I cannot wrap my head around it.

So, in the above example, Da and Dc share two Symptoms (A and C).

Upvotes: 1

Views: 370

Answers (1)

yatu

Reputation: 88236

You could iterate over the length-2 combinations of the disorder nodes that have a degree of at least 2, find the common neighbours of each pair with nx.common_neighbors, and keep only the pairs that share at least 2 neighbours.

So start by also keeping track of all disorders:

all_symptoms = set()
all_disorders = set()

for disorder, symptoms in csv_dictionary.items():
    all_disorders.add(disorder)
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)

Check which disorders have a degree of at least 2:

disorders_with_multiple_symptoms = [disorder for disorder in all_disorders
                                    if G.degree(disorder) > 1]

Then iterate over all pairs of disorders in all_disorders:

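from itertools import combinations

common_symptoms = dict()
# Sort the set so the pair order (and the printed output) is deterministic
for nodes in combinations(sorted(all_disorders), r=2):
    # Symptoms adjacent to both disorders in the pair
    cn = sorted(nx.common_neighbors(G, *nodes))
    if len(cn) > 1:
        common_symptoms[nodes] = cn

print(common_symptoms)
# {('Da', 'Dc'): ['A', 'C']}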
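
Since the question mentions edge weights, here is an alternative sketch of that idea using networkx's bipartite projection. It assumes that disorder and symptom names never collide and that using the dictionary keys as the disorder node set is acceptable; the names projected and shared are only illustrative. The weight on each projected edge is the number of symptoms the two disorders have in common:

```python
import networkx as nx
from networkx.algorithms import bipartite

csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"],
                  "Dd": ["D"], "De": ["E", "B"], "Df": ["F"], "Dg": ["F"],
                  "Dh": ["F"]}

# Build the bipartite disorder-symptom graph
G = nx.Graph()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)

# Project onto the disorder nodes (assumed here to be the dictionary keys);
# each projected edge gets a "weight" equal to the number of shared symptoms
projected = bipartite.weighted_projected_graph(G, list(csv_dictionary))

# Keep only the pairs whose weight is at least 2
shared = {
    tuple(sorted((u, v))): sorted(nx.common_neighbors(G, u, v))
    for u, v, weight in projected.edges(data="weight")
    if weight >= 2
}

print(shared)
# {('Da', 'Dc'): ['A', 'C']}
```

The projection only creates edges between disorders that share at least one symptom, so it can avoid testing every possible pair when the graph is sparse.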

Upvotes: 1
