Kartik Mehra

Reputation: 107

How to append columns side by side in a pandas DataFrame

I have a dataframe like this

   TEST_NUM  SITE_NUM   RESULT  TEST_FLG              TEST_TXT  UNITS  LO_LIMIT  HI_LIMIT
0       100         0  -0.4284         P  Continuity_PPMU XSCI      V        -1      -0.3
1       100         1  -0.4274         P  Continuity_PPMU XSCI      V        -1      -0.3
2       100         2  -0.4276         P  Continuity_PPMU XSCI      V        -1      -0.3
3       100         3  -0.4289         P  Continuity_PPMU XSCI      V        -1      -0.3
4       101         0  -0.4569         P  Continuity_PPMU XSCO      V        -1      -0.3

The TEST_TXT has 53 unique values.

I want my dataframe to be like this

LDO_Discharge V12 | Continuity_PPMU XSCI | Continuity_PPMU XSCO | Continuity_PPMU ADBUS0 | Continuity_PPMU ADBUS1 | ....
1.04              | 3.343                | 1.91                 | 2.1                    | 3.1                    |

Basically, I want the RESULT values of each TEST_TXT side by side, one column per TEST_TXT. The catch is that LDO_Discharge V12 has 5512 values while Continuity_PPMU ADBUS0 has 5528. They need to be placed side by side on the basis of SITE_NUM.

So the first row of LDO_Discharge V12 with SITE_NUM = 0 should sit next to the first row of Continuity_PPMU ADBUS0 with SITE_NUM = 0, and so on. Rows should be joined so that they share the same SITE_NUM.

I could have done this easily if SITE_NUM were unique or if the counts were equal, but they are not (5512 for 'LDO_Discharge V12' vs 5528 for 'Continuity_PPMU ADBUS0', and the other values differ as well).

How can I combine them so that the SITE_NUM values of "Continuity_PPMU ADBUS0" line up, in order, with the SITE_NUM values of LDO_Discharge V12?

If there is no value for a particular combination (say SITE_NUM = 3 is missing for "Continuity_PPMU XSCI", which is possible because the counts differ between TEST_TXT values), the result should have a NULL there.

This was hard to explain. Please let me know if more clarification is required.

Upvotes: 1

Views: 308

Answers (1)

Felipe Whitaker

Reputation: 520

I didn't really understand what your expected output is supposed to look like, so it would help if you could clarify that. From what I gathered, though, a few pandas functions are worth a look:

Check out the axis=1 argument of pd.concat, which lets you concatenate along the columns, i.e. it places the objects side by side and aligns them on their index.

df = pd.concat(iterator, axis=1)  # iterator: any iterable of DataFrames or Series; returns a DataFrame
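To make the index alignment concrete, here is a minimal sketch. The two Series, their lengths, and their values are made up for illustration (only the column names come from your question), and I am assuming each one is indexed by a unique SITE_NUM:

```python
import pandas as pd

# Hypothetical per-test results, indexed by SITE_NUM (illustrative values only).
xsci = pd.Series([-0.4284, -0.4274, -0.4276], index=[0, 1, 2],
                 name="Continuity_PPMU XSCI")
xsco = pd.Series([-0.4569, -0.4571], index=[0, 2],
                 name="Continuity_PPMU XSCO")

# axis=1 puts the Series side by side, matching rows by index label.
# Labels missing from one input (SITE_NUM 1 for XSCO here) become NaN.
side_by_side = pd.concat([xsci, xsco], axis=1)
print(side_by_side)
```

Note that this only works cleanly when the index labels within each input are unique; with repeated SITE_NUM values you would first need something that tells the repeats apart.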

But maybe what you want is pd.DataFrame.groupby followed by agg on the grouped result. After grouping, pd.DataFrame.sort_values may also be useful. That would be something like:

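df = pd.DataFrame()  # placeholder for your data
# columns_to_group, column_agg and function_agg are placeholders for your own names
gdf = df.groupby(by=columns_to_group).agg({column_agg: function_agg})  # add more column: function pairs as needed; returns a DataFrame after `agg`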
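If I read the question correctly, one possible approach is to number the repeated (TEST_TXT, SITE_NUM) pairs with groupby(...).cumcount() and then spread TEST_TXT into columns. This is only a sketch under that assumption: the sample rows are made up, and the helper column name `occurrence` is my own invention.

```python
import pandas as pd

# Made-up sample shaped like the question's data (values are illustrative).
df = pd.DataFrame({
    "SITE_NUM": [0, 1, 2, 3, 0, 1, 2, 0],
    "RESULT": [-0.4284, -0.4274, -0.4276, -0.4289,
               -0.4569, -0.4570, -0.4571, -0.4290],
    "TEST_TXT": ["Continuity_PPMU XSCI"] * 4
                + ["Continuity_PPMU XSCO"] * 3
                + ["Continuity_PPMU XSCI"],
})

# Number each repeat of a (TEST_TXT, SITE_NUM) pair in order of appearance:
# 0 for the first time the pair is seen, 1 for the second, and so on.
df["occurrence"] = df.groupby(["TEST_TXT", "SITE_NUM"]).cumcount()

# Move TEST_TXT into the columns. Rows are keyed by (occurrence, SITE_NUM),
# so the n-th SITE_NUM = 0 row of one test lines up with the n-th
# SITE_NUM = 0 row of every other test; missing combinations become NaN.
wide = (
    df.set_index(["occurrence", "SITE_NUM", "TEST_TXT"])["RESULT"]
      .unstack("TEST_TXT")
)
print(wide)
```

In this sample, "Continuity_PPMU XSCO" has no row for SITE_NUM = 3 and no second row for SITE_NUM = 0, so those cells come out as NaN, which seems to match the NULL behaviour you described.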

Upvotes: 1
