S44

Reputation: 473

Appending columns to my Dataframe based on row values for a certain column

I have a multi-index data frame with the following information:

                      pid    time    eventType  action 
sess_id     vis_id      

id1         vis_id1    id1    t_0      5         A
            vis_id1    id2    t_1      5         A-B
            vis_id1    id1    t_2      5         A-B-C
            vis_id1    id3    t_3      5         B

id2         vis_id2    id3    t_3      5         B
            vis_id2    id2    t_4      5         B-C
            vis_id2    id1    t_5      5         A

Context:

I want to create a df that shows how users traversed through a certain event type over time. Above, I grouped by session id and sub_id and sorted by time, so the action column shows how users traversed the website in real time.

I have a dictionary that I want to use to append a column to the df: if the action column contains one of the dictionary's keywords, the new column gets the corresponding value, and all other rows are left blank. If the keyword is in another dictionary, the value goes into a different column (for a total of 3 new columns). Example for two columns:

remap_col1 = { 
          'A':'Value1',
          'B':'Value2'
           }

remap_col2 = { 
          'A-B':'Value3',
          'B-C':'Value4'
           }

                         
                      pid    time    eventType  action  remap_col1 remap_col2
sess_id     vis_id      

id1         vis_id1    id1    t_0      5         A        Value1   
            vis_id1    id2    t_1      5         A-B                 Value3
            vis_id1    id1    t_2      5         A-B-C
            vis_id1    id3    t_3      5         B        Value2

id2         vis_id2    id3    t_3      5         B        Value2
            vis_id2    id2    t_4      5         B-C                 Value4
            vis_id2    id1    t_5      5         A        Value1

Upvotes: 1

Views: 46

Answers (1)

Mohsen Cheraghi

Reputation: 21

import numpy as np

def condition(x, valueX, valueY):
    # Return the value paired with whichever keyword x equals, otherwise NaN
    if x == valueX[0]:
        return valueY[0]
    elif x == valueX[1]:
        return valueY[1]
    else:
        return np.nan


# Assign each result to a new column instead of overwriting df
df['remap_col1'] = df.apply(lambda x: condition(x.action, ['A', 'B'], ['Value1', 'Value2']), axis=1)

df['remap_col2'] = df.apply(lambda x: condition(x.action, ['A-B', 'B-C'], ['Value3', 'Value4']), axis=1)
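As an alternative, the dictionaries from the question could probably be used directly with Series.map, which avoids the row-wise apply. This is only a sketch: the frame below is a hypothetical reconstruction of the question's data, and it assumes an exact match on action is what is wanted (the expected output leaves A-B-C blank in both columns, which suggests so).

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (values copied from the post)
df = pd.DataFrame(
    {
        "sess_id": ["id1"] * 4 + ["id2"] * 3,
        "vis_id": ["vis_id1"] * 4 + ["vis_id2"] * 3,
        "pid": ["id1", "id2", "id1", "id3", "id3", "id2", "id1"],
        "time": ["t_0", "t_1", "t_2", "t_3", "t_3", "t_4", "t_5"],
        "eventType": [5] * 7,
        "action": ["A", "A-B", "A-B-C", "B", "B", "B-C", "A"],
    }
).set_index(["sess_id", "vis_id"])

remap_col1 = {"A": "Value1", "B": "Value2"}
remap_col2 = {"A-B": "Value3", "B-C": "Value4"}

# Series.map looks each action up in the dict; actions without a key become NaN
df["remap_col1"] = df["action"].map(remap_col1)
df["remap_col2"] = df["action"].map(remap_col2)

# Optional: replace NaN with empty strings to mirror the blanks in the desired output
result = df.fillna({"remap_col1": "", "remap_col2": ""})
print(result)
```

A third dictionary would just be one more map call. If substring matching were needed instead of exact matching, this approach would not apply as written.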

Upvotes: 1
