Bill Software Engineer
Bill Software Engineer

Reputation: 7782

Append all columns from one row into another row

I am trying to append every column from one row into another row, I want to do this for every row, but some row will not have any values, take a look at my code it will be more clear:

Here is my data

date    day_of_week day_of_month    day_of_year month_of_year
5/1/2017    0   1   121 5
5/2/2017    1   2   122 5
5/3/2017    2   3   123 5
5/4/2017    3   4   124 5
5/8/2017    0   8   128 5
5/9/2017    1   9   129 5
5/10/2017   2   10  130 5
5/11/2017   3   11  131 5
5/12/2017   4   12  132 5
5/15/2017   0   15  135 5
5/16/2017   1   16  136 5
5/17/2017   2   17  137 5
5/18/2017   3   18  138 5
5/19/2017   4   19  139 5
5/23/2017   1   23  143 5
5/24/2017   2   24  144 5
5/25/2017   3   25  145 5
5/26/2017   4   26  146 5

Here is my current code:

s = df_md['date'].shift(-1)
df_md['next_calendarday']  = s.mask(s.dt.dayofweek.diff().lt(0))
df_md.set_index('date', inplace=True)
df_md.apply(lambda row: GetNextDayMarketData(row, df_md), axis=1)

def GetNextDayMarketData(row, dataframe):
    if(row['next_calendarday'] is pd.NaT):
        return
    key = row['next_calendarday'].strftime("%Y-%m-%d")
    nextrow = dataframe.loc[key]
    for index, val in nextrow.iteritems():
        if(index != "next_calendarday"):
            dataframe.loc[row.name, index+'_nextday'] = val

This works but it's so slow it might as well not work. Here is what the result should look like, you can see that the value from the next row has been added to the previous row. The kicker is that it's the next calendar date and not just the next row in the sequence. If a row does not have an entry for next calendar date, it will simply be blank.

enter image description here

Here is the expected result in csv

date    day_of_week day_of_month    day_of_year month_of_year   next_workingday day_of_week_nextday day_of_month_nextday    day_of_year_nextday month_of_year_nextday
5/1/2017    0   1   121 5   5/2/2017    1   2   122 5
5/2/2017    1   2   122 5   5/3/2017    2   3   123 5
5/3/2017    2   3   123 5   5/4/2017    3   4   124 5
5/4/2017    3   4   124 5                   
5/8/2017    0   8   128 5   5/9/2017    1   9   129 5
5/9/2017    1   9   129 5   5/10/2017   2   10  130 5
5/10/2017   2   10  130 5   5/11/2017   3   11  131 5
5/11/2017   3   11  131 5   5/12/2017   4   12  132 5
5/12/2017   4   12  132 5                   
5/15/2017   0   15  135 5   5/16/2017   1   16  136 5
5/16/2017   1   16  136 5   5/17/2017   2   17  137 5
5/17/2017   2   17  137 5   5/18/2017   3   18  138 5
5/18/2017   3   18  138 5   5/19/2017   4   19  139 5
5/19/2017   4   19  139 5                   
5/23/2017   1   23  143 5   5/24/2017   2   24  144 5
5/24/2017   2   24  144 5   5/25/2017   3   25  145 5
5/25/2017   3   25  145 5   5/26/2017   4   26  146 5
5/26/2017   4   26  146 5                   
5/30/2017   1   30  150 5   

            

Upvotes: 2

Views: 106

Answers (1)

jezrael
jezrael

Reputation: 862471

Use DataFrame.join with remove column next_calendarday_nextday:

df = df.set_index('date')
df = (df.join(df, on='next_calendarday', rsuffix='_nextday')
        .drop('next_calendarday_nextday', axis=1))

Upvotes: 2

Related Questions