Vishal Naik
Vishal Naik

Reputation: 134

How to add column to a dataframe which remains constant with respect to date?

I have to dataframes df1 and df2
Contents of df1 are:

        date    high    low    close      C           B             T           R       
1   2020-09-16  1.29257 1.28146 1.28884 1.287623    1.287015    1.288232    0.000000    
2   2020-09-17  1.30074 1.28747 1.29662 1.294943    1.294105    1.295782    1.298102

Contents of df2 are:

     time               open    high    low     
0   2020-09-16 22:54:00 1.29708 1.29711 1.29695 
1   2020-09-16 22:55:00 1.29698 1.29703 1.29681 
2   2020-09-17 22:56:00 1.29701 1.29709 1.29689 
3   2020-09-17 22:57:00 1.29702 1.29720 1.29701 
4   2020-09-17 22:58:00 1.29717 1.29720 1.29715

I want to add two dataframes. In a way that column C of df1 will be added to df2 should result in new dataframe That must be constant with respect to time.
Expected dataframe df3 will be as below

            time                open    high    low  c  
    0   2020-09-16 22:54:00 1.29708 1.29711 1.29695 1.287623
    1   2020-09-16 22:55:00 1.29698 1.29703 1.29681 1.287623
    2   2020-09-17 22:56:00 1.29701 1.29709 1.29689 1.294943
    3   2020-09-17 22:57:00 1.29702 1.29720 1.29701 1.294943
    4   2020-09-17 22:58:00 1.29717 1.29720 1.29715 1.294943

C is 1.28763 for row 0 and 1 because c is 1.28763 for the date 2020-09-16.
C is 1.294943 for row 2.3 and 4 because c is 1.294943 for the date 2020-09-17.

Upvotes: 1

Views: 97

Answers (2)

wwnde
wwnde

Reputation: 26686

Lets try map the dict of df1.date: df1.C to the extract of date in df2.time.

df2['C']=(pd.to_datetime(df2.time).dt.date).astype(str).map(dict(zip(df1.date,df1.C)))

How it works

#Extract date from df2.time

df2['temp']=pd.to_datetime(df2.time).dt.date



#Create dict from df1.date and df1.C
 D=dict(zip(df1.date,df1.C))

 #Create new column df['C'] by mapping D to df2.temp

df2.temp.map(D)

Outcome

               time     open     high      low         C
0  2020-09-16 22:54:00  1.29708  1.29711  1.29695  1.287623
1  2020-09-16 22:55:00  1.29698  1.29703  1.29681  1.287623
2  2020-09-17 22:56:00  1.29701  1.29709  1.29689  1.294943
3  2020-09-17 22:57:00  1.29702  1.29720  1.29701  1.294943
4  2020-09-17 22:58:00  1.29717  1.29720  1.29715  1.294943

Alternatively as suggested by @Erfan

#Rename columns of df1 as follows

    df1=df1[["date", "C"]].rename(columns={"date": "time"})

#Coerce df2.time to date
     df2['time']=pd.to_datetime(df2['time']).dt.date


#Merge df2 and df1
     df2.merge(df1, how='left')

Upvotes: 1

muzzyq
muzzyq

Reputation: 904

First make sure that df['time'] is of datetime64 type

df2['time'] = df2.time.astype('datetime64')

Then extract the date into a new column

df2['date'] = df2.time.dt.date.astype('datetime64')

Then merge the two dataframes, selecting only date and C from df (if C is all you're interested in keeping).

df3 = pd.merge(df2, df[['date', 'C']], on='date')

Finally you can drop the date column we added in step 2

df3 = df3.drop(columns='date')

Upvotes: 2

Related Questions