aeapen

Reputation: 923

How to create linked array of dataframe column based on key using Pandas?

I have a dataframe df like the one below. For each value of df['code'], I need to collect the matching df['Id'] values into one comma-separated list.

Input (df)

Id    code  description                start          end          lat    lon
23-A   45   Fault located at Mumbai   2021-03-21      2021-03-28   19.07  72.08
35-B   24   Fault located at Chennai  2021-02-24      2021-02-26   13.02  80.27
37-B   28   Fault located at Chennai  2021-02-24      2021-02-26   13.02  80.07
41-A   45   Fault located at Mumbai   2021-03-21      2021-03-28   
38-B   24   Fault located at Chennai  2021-02-24      2021-02-26   13.02  80.07
27-A   45   Fault located at Mumbai   2021-03-21      2021-03-28   19.07  72.08
78-B   56   Fault located at Chennai  2021-02-24      2021-02-26  
21-C   46   Fault located at Mumbai   2021-04-21      2021-04-28   

Expected Output

linkedId         code  description               start       end         lat    lon
23-A,41-A,27-A   45    Fault located at Mumbai   2021-03-21  2021-03-28  19.07  72.08
35-B,38-B        24    Fault located at Chennai  2021-02-24  2021-02-26  13.02  80.07
37-B             28    Fault located at Chennai  2021-02-24  2021-02-26  13.02  80.07
78-B             56    Fault located at Chennai  2021-02-24  2021-02-26
21-C             46    Fault located at Mumbai   2021-04-21  2021-04-28

How can this be done in pandas?

Upvotes: 0

Views: 78

Answers (3)

Nk03

Reputation: 14949

TRY:

result = (
    df.assign(Id=df.groupby('code')['Id']
              .transform(','.join))
    .drop_duplicates(subset='code')
    .rename(columns={'Id': 'linkedId'})
)
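
For anyone who wants to run this end to end, here is a minimal sketch of the same approach on the rows from the question. The DataFrame construction is my own reconstruction of the sample data (description, start and end are left out for brevity), and the trailing reset_index is an addition for tidier output, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question; missing lat/lon values become NaN.
df = pd.DataFrame({
    "Id": ["23-A", "35-B", "37-B", "41-A", "38-B", "27-A", "78-B", "21-C"],
    "code": [45, 24, 28, 45, 24, 45, 56, 46],
    "lat": [19.07, 13.02, 13.02, None, 13.02, 19.07, None, None],
    "lon": [72.08, 80.27, 80.07, None, 80.07, 72.08, None, None],
})

result = (
    # Replace every Id with the comma-joined Ids of its code group.
    df.assign(Id=df.groupby("code")["Id"].transform(",".join))
    # Keep only the first row of each code group.
    .drop_duplicates(subset="code")
    .rename(columns={"Id": "linkedId"})
    .reset_index(drop=True)  # added here only to renumber the rows
)
print(result)
```

Note that drop_duplicates keeps the first row of each code, so the other columns come from that row. For code 24 this gives lon 80.27 (from 35-B), not the 80.07 shown in the expected output.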

Upvotes: 1

wwnde

Reputation: 26676

Use groupby with transform, str.cat, and drop_duplicates:

df = (
    df.assign(
        linkedid=df.groupby(['description', 'start', 'end'])['Id']
        .transform(lambda x: x.str.cat(sep=','))
    )
    .drop_duplicates(subset=['linkedid'])
)




     Id  code               description       start         end    lat    lon        linkedid
0  23-A    45   Fault located at Mumbai  2021-03-21  2021-03-28  19.07  72.08       23-A,41-A
1  35-B    24  Fault located at Chennai  2021-02-24  2021-02-26  13.02  80.27  35-B,37-B,78-B
4  38-B    24  Fault located at Chennai  2021-02-24  2021-02-26  13.02  80.07            38-B
5  27-A    45   Fault located at Mumbai  2021-03-21  2021-03-28  19.07  72.08            27-A
7  21-C    46   Fault located at Mumbai  2021-04-21  2021-04-28                          21-C
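
One thing worth checking with this approach is the grouping key: the question asks for groups by code, while the snippet above groups by description, start and end. On the sample data those keys do not define the same groups. The sketch below compares the two on a DataFrame I rebuilt from the question's rows (clean strings, no stray whitespace, which is an assumption about the real data); the names by_code and by_text are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question (lat/lon omitted, they play no role here).
df = pd.DataFrame({
    "Id": ["23-A", "35-B", "37-B", "41-A", "38-B", "27-A", "78-B", "21-C"],
    "code": [45, 24, 28, 45, 24, 45, 56, 46],
    "description": [
        "Fault located at Mumbai", "Fault located at Chennai",
        "Fault located at Chennai", "Fault located at Mumbai",
        "Fault located at Chennai", "Fault located at Mumbai",
        "Fault located at Chennai", "Fault located at Mumbai",
    ],
    "start": ["2021-03-21", "2021-02-24", "2021-02-24", "2021-03-21",
              "2021-02-24", "2021-03-21", "2021-02-24", "2021-04-21"],
    "end": ["2021-03-28", "2021-02-26", "2021-02-26", "2021-03-28",
            "2021-02-26", "2021-03-28", "2021-02-26", "2021-04-28"],
})

def join_ids(s):
    # Concatenate the Ids of one group into a comma-separated string.
    return s.str.cat(sep=",")

# Grouping by code, as the question asks.
by_code = df.groupby("code")["Id"].transform(join_ids)
# Grouping by description/start/end, as in the snippet above.
by_text = df.groupby(["description", "start", "end"])["Id"].transform(join_ids)

print(by_code.unique())
print(by_text.unique())
```

With these rows, grouping by code yields five distinct lists, while grouping by description, start and end merges codes 24, 28 and 56 into a single Chennai group. If the lists should follow code, passing 'code' to groupby is probably the safer choice.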

Upvotes: 0

Fahad Vadakkumpadatah

Reputation: 71

df.groupby('code').sum()

Result:

    Id
code    
24  35-B38-B
28  37-B
45  23-A41-A27-A
46  21-C
56  78-B
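
The sum above concatenates the Ids with no separator and returns only the Id column. If the commas and the remaining columns are needed, one possible variation is named aggregation with ','.join for Id and 'first' for everything else. This is a sketch on data I rebuilt from the question, not something tested on the real dataset; sort=False and as_index=False are my choices to keep the original group order and to keep code as a regular column.

```python
import pandas as pd

# Sample rows copied from the question; missing lat/lon values become NaN.
df = pd.DataFrame({
    "Id": ["23-A", "35-B", "37-B", "41-A", "38-B", "27-A", "78-B", "21-C"],
    "code": [45, 24, 28, 45, 24, 45, 56, 46],
    "description": [
        "Fault located at Mumbai", "Fault located at Chennai",
        "Fault located at Chennai", "Fault located at Mumbai",
        "Fault located at Chennai", "Fault located at Mumbai",
        "Fault located at Chennai", "Fault located at Mumbai",
    ],
    "start": ["2021-03-21", "2021-02-24", "2021-02-24", "2021-03-21",
              "2021-02-24", "2021-03-21", "2021-02-24", "2021-04-21"],
    "end": ["2021-03-28", "2021-02-26", "2021-02-26", "2021-03-28",
            "2021-02-26", "2021-03-28", "2021-02-26", "2021-04-28"],
    "lat": [19.07, 13.02, 13.02, None, 13.02, 19.07, None, None],
    "lon": [72.08, 80.27, 80.07, None, 80.07, 72.08, None, None],
})

out = (
    df.groupby("code", sort=False, as_index=False)
    .agg(
        linkedId=("Id", ",".join),           # comma-separated Ids per code
        description=("description", "first"),
        start=("start", "first"),
        end=("end", "first"),
        lat=("lat", "first"),                # 'first' skips NaN within a group
        lon=("lon", "first"),
    )
)
print(out)
```

Because 'first' takes the first non-null value in each group, code 24 gets lon 80.27 here; a different aggregation (for example 'last') would be needed to reproduce the 80.07 in the expected output.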

Upvotes: 0
