K_Raikar

Reputation: 126

How to get a transition string per row object based on two different columns in Python (without using loops)?

I have the following data structure:

[image: table with columns x, s and d]

Columns s and d indicate the transition of the object in column x. I want to get a transition string for each object present in column x, e.g. as a new column as follows:

[image: the table with an added transition-string column]

Is there a lean way to do it using pandas, without using too many loops?

This was the code I tried:

obj = df['x'].tolist()
rows = []

for o in obj:
    locs = df[df['x'] == o]['s'].tolist()
    str_locs = '->'.join(str(l) for l in locs)
    print(str_locs)
    d = dict()
    d['x'] = o
    d['new'] = str_locs
    rows.append(d)

tmp = pd.DataFrame(rows)

This gives the output tmp as:

    x   new
    a   1->2->4->8
    a   1->2->4->8
    a   1->2->4->8
    a   1->2->4->8
    b   1->2
    b   1->2

Upvotes: 1

Views: 160

Answers (1)

Hamza usman ghani

Reputation: 2243

Example df:

import pandas as pd

df = pd.DataFrame({"x":["a","a","a","a","b","b"], "s":[1,2,4,8,5,11],"d":[2,4,8,9,11,12]})

print(df)

       x   s   d
    0  a   1   2
    1  a   2   4
    2  a   4   8
    3  a   8   9
    4  b   5  11
    5  b  11  12

The following code generates a transition string for each object present in column x:

  • Group by column x and collect the [s, d] pairs of each object as a list of lists.
  • Flatten each list of lists into a single list, keeping the row order.
  • Remove consecutive duplicates from the flattened list using itertools.groupby.
  • Join the remaining items with -> to make a single string.
  • Finally, map the resulting Series back onto column x of the input df.
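
To see the de-duplication step from the list above in isolation, here is a minimal sketch using only the standard library. The helper name drop_consecutive_duplicates and the sample list are made up for illustration; the list mimics what the flattened [s, d] pairs of object a would look like.

```python
from itertools import groupby


def drop_consecutive_duplicates(items):
    # groupby() starts a new group whenever the value changes,
    # so keeping one key per group collapses runs of equal neighbours.
    return [key for key, _ in groupby(items)]


# Flattened [s, d] pairs for one object: each d repeats as the next s.
flat = ["1", "2", "2", "4", "4", "8", "8", "9"]
print("->".join(drop_consecutive_duplicates(flat)))  # 1->2->4->8->9
```

Note that only adjacent repeats are collapsed, so a value that shows up again later in the path (e.g. 1, 2, 1) is kept.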

from itertools import groupby

grp = df.groupby('x')[['s', 'd']].apply(lambda x: x.values.tolist())
grp = grp.apply(lambda x: [str(item) for tup in x for item in tup])
sr = grp.apply(lambda x: "->".join([i[0] for i in groupby(x)]))
df["new"] = df["x"].map(sr)
print(df)

       x   s   d            new
    0  a   1   2  1->2->4->8->9
    1  a   2   4  1->2->4->8->9
    2  a   4   8  1->2->4->8->9
    3  a   8   9  1->2->4->8->9
    4  b   5  11      5->11->12
    5  b  11  12      5->11->12
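
If the chained lambdas are hard to follow, the same steps can probably be folded into one named helper. This is only a sketch: transition_string is a hypothetical name, and it assumes the rows of each object are already in transition order, so that each d equals the s of the next row.

```python
from itertools import groupby

import pandas as pd


def transition_string(group: pd.DataFrame) -> str:
    # Flatten row by row into [s0, d0, s1, d1, ...].
    flat = group[["s", "d"]].to_numpy().ravel().tolist()
    # Collapse runs of equal neighbours, then join with "->".
    return "->".join(str(key) for key, _ in groupby(flat))


df = pd.DataFrame({
    "x": ["a", "a", "a", "a", "b", "b"],
    "s": [1, 2, 4, 8, 5, 11],
    "d": [2, 4, 8, 9, 11, 12],
})

# One string per object in x, then broadcast back to every row.
paths = df.groupby("x")[["s", "d"]].apply(transition_string)
df["new"] = df["x"].map(paths)
print(df)
```

If the rows are not sorted, or a d does not match the following s, the string will simply list the values in the order they appear, so it may be worth sorting or validating the data first.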

Upvotes: 1
