How to do recursive vectorized calculations in pandas DataFrame via columns?

Question

Edit: I have removed the 5th row from the sample data because it contained an error.

Assume that we have a directed graph G = (V, E) with edges E and vertices V. Assume that we have a Pandas DataFrame describing which nodes (u, v) are connected to each other and the value/weight of the corresponding edge e. Let the following be a representation of such a DataFrame.

#   from   to   weight
-----------------------
0     0     1     1.0
1     1     2     0.5
2     2     3     0.2     
3     0     4     1.3
4     4     5     0.9
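
For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a sketch: the variable name df is my own choice, and the column names are taken from the table header.

```python
import pandas as pd

# Edge list from the table above: one row per directed edge (from -> to).
df = pd.DataFrame(
    {
        "from": [0, 1, 2, 0, 4],
        "to": [1, 2, 3, 4, 5],
        "weight": [1.0, 0.5, 0.2, 1.3, 0.9],
    }
)
print(df)
```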

Is it possible to add a column with accumulated weights, so that, for instance, row 2 has an accumulated weight of 1.7 = 0.2 + 0.5 + 1.0 because of the path 0->1->2->3? Preferably in a vectorized way, so that the calculation scales. In other words, we should get the following DataFrame.

#   from   to   weight    accumulated
-------------------------------------
0     0     1     1.0      1.0
1     1     2     0.5      1.5
2     2     3     0.2      1.7   
3     0     4     1.3      1.3
4     4     5     0.9      2.2

We can assume that there is no other path to vertex 3 since the DataFrame is made such that only shortest paths are included.

I have thus far written the following piece of code that uses DataFrame.apply, which is not a vectorized approach. Here I store / cache previously calculated accumulated values in a dictionary called accum_map.

def __set_accum(self, row):
    # Return the cached value if this target node was already resolved.
    search = row["to"]
    if search in self.accum_map:
        return self.accum_map[search]
    from_node = row["from"]
    # Look up the edge leading into from_node (at most one, by assumption).
    parent = self.df[self.df["to"] == from_node]
    # A source node has no incoming edge, so nothing is accumulated before it.
    upstream = 0.0 if parent.empty else self.__set_accum(parent.iloc[0])
    self.accum_map[search] = upstream + row["weight"]
    return self.accum_map[search]

def set_accumulated(self):
    # Row-wise apply: one Python-level call per edge, so not vectorized.
    self.df["accumulated"] = self.df.apply(func=self.__set_accum, axis=1)
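
For comparison, here is a minimal sketch of a level-by-level alternative. It is not fully vectorized: it loops once per level of path depth, but each pass uses only column-wise pandas operations. It assumes every vertex has at most one incoming edge, which matches the shortest-path assumption above, and that the graph has no cycles. The function name add_accumulated is hypothetical.

```python
import pandas as pd

def add_accumulated(df):
    """Return a copy of df with an 'accumulated' column (hypothetical helper).

    Assumes each vertex appears at most once in 'to' and the graph is acyclic.
    """
    out = df.copy()
    # Edges whose 'from' vertex is never a 'to' vertex start at a source node.
    starts_at_source = ~out["from"].isin(out["to"])
    # Source edges accumulate only their own weight; all others start as NaN.
    out["accumulated"] = out["weight"].where(starts_at_source)
    while out["accumulated"].isna().any():
        # Accumulated weight of every vertex resolved so far, keyed by vertex.
        known = out.dropna(subset=["accumulated"]).set_index("to")["accumulated"]
        # Look up the value at each edge's 'from' vertex (NaN if not yet known).
        upstream = out["from"].map(known)
        filled = out["accumulated"].fillna(upstream + out["weight"])
        # No progress means a cycle or missing weights, so stop instead of looping.
        if filled.isna().sum() == out["accumulated"].isna().sum():
            raise ValueError("could not resolve all edges")
        out["accumulated"] = filled
    return out

# Usage with the sample data from the question.
edges = pd.DataFrame(
    {
        "from": [0, 1, 2, 0, 4],
        "to": [1, 2, 3, 4, 5],
        "weight": [1.0, 0.5, 0.2, 1.3, 0.9],
    }
)
print(add_accumulated(edges))
```

On the sample data this should reproduce the accumulated column shown above, up to floating-point rounding. The number of passes equals the length of the longest path, so it helps most when the graph is wide and shallow.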
