user7096526

create graph with correlation matrix

I have a data frame and need to create a graph from it, but I am having trouble working out how.

My data frame looks like this:

    import numpy as np
    import pandas as pd

    A = pd.DataFrame([(1, 0.3, 0.4, 0.7),
                      (0.3, 1, 0.9, 0.2),
                      (0.4, 0.9, 1, 0.1),
                      (0.7, 0.2, 0.1, 1)],
                     columns=['a', 'b', 'c', 'd'], index=['a', 'b', 'c', 'd'])
    np.fill_diagonal(A.values, 0)
>>> A
     a    b    c    d
a  0.0  0.3  0.4  0.7
b  0.3  0.0  0.9  0.2
c  0.4  0.9  0.0  0.1
d  0.7  0.2  0.1  0.0  

I want to create a graph with this data. There are four nodes (a, b, c, d), and the distances between nodes are given by the matrix; for instance, the distance between nodes a and b is 0.3. Since it is a correlation matrix, it is symmetric, so every value appears twice.

I have created this function to store the values of edges as a dict (I don't know if that would be the best idea):

def edges(matr):
    edge = {}
    for m in matr.columns:
        for n in matr.index:
            a,b = m,n 
            if a!= b:
                x = matr.at[m, n]
                edge[m,n] = float("{0:.4f}".format(x))
    return edge

>>> edges(A)
{('a', 'b'): 0.3,
 ('a', 'c'): 0.4,
 ('a', 'd'): 0.7,
 ('b', 'a'): 0.3,
 ('b', 'c'): 0.9,
 ('b', 'd'): 0.2,
 ('c', 'a'): 0.4,
 ('c', 'b'): 0.9,
 ('c', 'd'): 0.1,
 ('d', 'a'): 0.7,
 ('d', 'b'): 0.2,
 ('d', 'c'): 0.1}

But since a-b is the same as b-a, some edges are repeated, and I can't figure out how to remove the repeated values. From that data I then need to create a graph/picture.

Thank you!!

Upvotes: 2

Views: 1699

Answers (1)

doctorlove

Reputation: 19232

There are various ways of creating a picture. I'll show graphviz.

First, your edges function adds the edges twice (from say a to b and back again), so let's just add them once.

import graphviz as gv
import numpy as np
import pandas as pd

def edges(matr):
    """Return {(m, n): weight}, keeping each undirected edge only once."""
    edge = {}
    for m in matr.columns:
        for n in matr.index:
            if m > n:  # only add each edge once; this also skips the diagonal
                x = matr.at[m, n]  # the matrix is symmetric, so [m, n] == [n, m]
                edge[m, n] = float("{0:.4f}".format(x))
    return edge

if __name__ == '__main__':
    A = pd.DataFrame([(1, 0.3, 0.4, 0.7),
                      (0.3, 1, 0.9, 0.2),
                      (0.4, 0.9, 1, 0.1),
                      (0.7, 0.2, 0.1, 1)],
                     columns=['a', 'b', 'c', 'd'], index=['a', 'b', 'c', 'd'])
    np.fill_diagonal(A.values, 0)

    e = edges(A)
    g = gv.Graph(format="png")
    for k, v in e.items():  # items() in Python 3 (iteritems() was Python 2 only)
        g.edge(k[0], k[1], len=str(v))

    print(g.source)  # the graph in DOT format

This prints the graph in Graphviz's DOT format (the order of the edges may differ):

graph {
        b -- a [len=0.3]
        c -- a [len=0.4]
        c -- b [len=0.9]
        d -- a [len=0.7]
        d -- c [len=0.1]
        d -- b [len=0.2]
}
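
If you would rather not depend on the graphviz Python package just to produce this text, the same output can be built by hand. The following is only a rough sketch: to_dot is a made-up helper name, and it assumes the node names are plain identifiers that need no quoting in DOT.

```python
def to_dot(edges):
    """Build DOT source for an undirected graph from {(u, v): length}.

    Hypothetical helper: assumes node names are simple identifiers
    (letters/digits), so no quoting or escaping is attempted.
    """
    lines = ["graph {"]
    for (u, v), length in sorted(edges.items()):
        lines.append(f"    {u} -- {v} [len={length}]")
    lines.append("}")
    return "\n".join(lines)


def save_dot(edges, path):
    """Write the DOT source to a file (path is chosen by the caller)."""
    with open(path, "w") as fh:
        fh.write(to_dot(edges) + "\n")


# The six edges from the matrix in the question, each listed once
edges = {("b", "a"): 0.3, ("c", "a"): 0.4, ("c", "b"): 0.9,
         ("d", "a"): 0.7, ("d", "b"): 0.2, ("d", "c"): 0.1}
dot_source = to_dot(edges)
print(dot_source)
```

Calling save_dot(edges, "g.dot") would then give you a file that the Graphviz command-line tools can read.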

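You can save this to a .dot file and run the Graphviz tools on it externally, e.g. dot -Tpng g.dot -o g.png, or call g.render('filename', view=True) from Python. Note that Graphviz only honors the len attribute in the neato and fdp layout engines, so if the edge lengths should influence the drawing, use one of those, e.g. neato -Tpng g.dot -o g.png, or gv.Graph(format="png", engine="neato") in Python.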
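
On the duplicate-edge part of the question: instead of comparing the labels, you could enumerate each unordered pair exactly once with itertools.combinations. This is just a sketch of that idea, not a change to the function above; unique_edges is a made-up name, and it assumes the input is a symmetric dict of dicts such as the one A.to_dict() returns.

```python
from itertools import combinations


def unique_edges(weights):
    """Return {(u, v): weight} with each unordered pair listed once.

    Assumption: `weights` is a symmetric dict of dicts, e.g. the result
    of A.to_dict() for the DataFrame in the question.
    """
    labels = sorted(weights)
    # combinations() yields (u, v) with u before v, so (v, u) never appears
    # and the diagonal (u, u) is skipped automatically
    return {(u, v): round(weights[u][v], 4) for u, v in combinations(labels, 2)}


# Same values as the matrix in the question, written out as plain dicts
weights = {
    "a": {"a": 0.0, "b": 0.3, "c": 0.4, "d": 0.7},
    "b": {"a": 0.3, "b": 0.0, "c": 0.9, "d": 0.2},
    "c": {"a": 0.4, "b": 0.9, "c": 0.0, "d": 0.1},
    "d": {"a": 0.7, "b": 0.2, "c": 0.1, "d": 0.0},
}
pairs = unique_edges(weights)
print(len(pairs))  # 4 nodes give 6 unordered pairs
```

With the DataFrame from the question you would call unique_edges(A.to_dict()). For n nodes this yields n*(n-1)/2 edges, so 6 here rather than the 12 produced by the original function.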

[Image: the generated graph]

Upvotes: 1
