Sumanth
Sumanth

Reputation: 507

Expand the dataframe by a column value

Lets say I have a dataframe df in this format: I have shown for only id = 1 but I have several id values.

  Email       id    Product   Value
[email protected]    1      p_1       1
[email protected]    1      p_2       2
[email protected]    1      p_3       3
[email protected]    1      p_4       4

Now I need the result in this format:

  Email       id    Product   Value  Product_n   Value_n
[email protected]    1      p_1       1        p_2        2
[email protected]    1      p_1       1        p_3        3
[email protected]    1      p_1       1        p_4        4
[email protected]    1      p_2       2        p_1        1
[email protected]    1      p_2       2        p_3        3
[email protected]    1      p_2       2        p_4        4   ....so on

I take in each item in Product of specific id and then create two new columns with the rest of the Product items.

I tried the same by using a for loop but for around 1000 id values and 9 Product items for each id it takes me lot of time. Can we make the process quicker.

product_items = df['Product'].unique()
for idx in range(1,(len(df['id'].unique())+1)):
    for i in product_items:
        for j in product_items:
            if j != i:
                df1 = df.loc[(df['id'] == idx) & (df['Product'] == i)]
                df2 = (df.loc[df['Product'] == j]).iloc[:,2:4]
                df_add = pd.concat([df1,df2],axis =1)
                df_final = df_final.append(df_add)

Upvotes: 3

Views: 183

Answers (1)

Mohamed Thasin ah
Mohamed Thasin ah

Reputation: 11192

try this,

temp=df[['Product','Value']]
temp_df=pd.DataFrame()
for val in temp['Product'].unique().tolist():
    temp_df=pd.concat([temp_df,temp.copy().assign(key=val)])
df= pd.merge(df,temp_df,left_on=['Product'],right_on=['key'],how='right')
df=df.rename(columns={'Product_x':'Product','Value_x':'Value','Product_y':'Product_n','Value_y':'Value_n'})
df=df[df['Product']!=df['Product_n']]

Explanation:

  1. temp contains only required columns, temp_df explodes your dataframe and assigns a key for merge
  2. temp_df contains duplication of your main df to perform merge.
  3. Do right join based on the product and key
  4. clean up the data

Upvotes: 1

Related Questions