annoyed_avocado
annoyed_avocado

Reputation: 15

Going through a Pandas dataframe to read values into new dataframe

I have a dataframe that looks something like the following. Each of the columns el0 to el3 holds the name of an element, and each of the columns x0 to x3 holds the amount of the corresponding element (el0 and x0, el1 and x1, etc.) that is present in an alloy.

el0 el1 el2 el3 x0 x1 x2 x3
AL MO CR TI 0.9 0.0 0.0 0.1
AL MO CR TI 0.8 0.0 0.1 0.1

I want to turn the dataset into something that looks like the below, where each of the columns is an element, and the rows tell me how much of that element is present. In essence, taking the unique values from el0 to el3 in the original dataframe above, making them the column headers of the new dataframe, and plugging in x0 through x3 from the original dataframe into the correct slot.

AL MO CR TI
0.9 0.0 0.0 0.1
0.8 0.0 0.1 0.1

Right now, I just loop through each of the rows, access the element held in each of el0 through el3, use that element as a key in a dictionary, and then go pull the corresponding amount from x0, x1, x2, or x3 for the value, and plug that dictionary into the end of the new dataframe (code below). My dataset isn't particularly large, so I'm perfectly content to just let my computer sit for the minute it takes to do this. However, I'm curious if there's a more elegant, Panda-esque solution that isn't as gnarly as the one I'm using.

Thanks in advance!

for index, row in original_df.iterrows():
    
    d = {}
    for i in unique_features_list:
        d[i] = 0 #initializing dict value to 0
    for col in original_df.columns:
        if col == 'el0':
            element = row['el0']
            d[element] = row['x0']
        if col == 'el1':
            element = row['el1']
            d[element] = row['x1']
        if col == 'el2':
            element = row['el2']
            d[element] = row['x2']
        if col == 'el3':
            element = row['el3']
            d[element] = row['x3']
            
    features_df.loc[len(features_df)] = [*d.values()] #append dict to end of new dataframe

Upvotes: 0

Views: 32

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

Try using pd.wide_to_long and reshape the dataframe using set_index and unstack:

pd.wide_to_long(df.reset_index(), ['el','x'], 'index', 'No')\
  .reset_index(level=1, drop=True)\
  .set_index('el', append=True)['x']\
  .unstack()

Output (Note: typo in data above):

el      AL   Al   CR   MO   TI
index                         
0      NaN  0.9  0.0  0.0  0.1
1      0.8  NaN  0.1  0.0  0.1

Output (with corrected typo):

el      AL   CR   MO   TI
index                    
0      0.9  0.0  0.0  0.1
1      0.8  0.1  0.0  0.1

Upvotes: 2

Related Questions