bennimueller
bennimueller

Reputation: 31

How can I merge multiple date columns in a Pandas DataFrame into one column?

I found similar questions but they did not solve my problem.

I have this Pandas DataFrame. The dtypes of the columns can either be str or dt, I can change this afterwards:

    id_of_station   measurement1    measurement2    measurement3    measurement4    measurement5
0   7               NaN             NaN             NaN             NaN             NaN
1   21              2021-04-09      2021-04-09      2021-04-09      2021-04-09      NaN
2   28              2021-04-09      2021-04-09      2021-04-09      2021-04-09      NaN
3   31              2021-04-09      2021-04-09      2021-04-09      2021-04-09      2021-04-09 
4   42              2021-04-09      NaN             NaN             2021-04-09      NaN
... ...             ...             ...             ...             ...             ...
489 9546            NaN             NaN             2021-04-09      2021-04-09      NaN

What I want is to merge the date columns together to one new column. If there is no date for the specific ID like in id_of_station 7 the Output should be NaN.

So the Output should look similar to this:

    id_of_station   last_measurement    
0   7               NaN             
1   21              2021-04-09  
2   28              2021-04-09 
3   31              2021-04-09
4   42              2021-04-09
... ...             ...        
489 9546            2021-04-09

Upvotes: 0

Views: 767

Answers (2)

Anurag Dabas
Anurag Dabas

Reputation: 24304

make use of melt() method:

resultdf=df.melt(id_vars='id_of_station',value_name='last_measurement').drop(columns=['variable'])

OR

you can also do this by unstack() method

resultdf=df.set_index('id_of_station').unstack().droplevel(0).to_frame().rename(columns={0:'last_measurement'}).reset_index()

Now if you print resultdf you will get your desired output:

    id_of_station   last_measurement    
0   7               NaN             
1   21              2021-04-09  
2   28              2021-04-09 
3   31              2021-04-09
4   42              2021-04-09
... ...             ...        
489 9546            2021-04-09

Upvotes: 2

Ynjxsjmh
Ynjxsjmh

Reputation: 29982

You can use apply() on rows.

def merge(row):
    elems = row.dropna().tolist()
    return elems[-1] if elems else np.nan

df_ = pd.concat([df.iloc[:, :1], df.iloc[:, 1:].apply(merge, axis=1).rename('last_measurement')], axis=1,)
# print(df_)

   id_of_station  last_measurement
0              7         NaN
1             21  2021-04-09
2             28  2021-04-09
3             31  2021-04-09
4             42  2021-04-09
5           9546  2021-04-09

Upvotes: 1

Related Questions