Reputation: 25
I am new to Python and to programming in general. I hope the following question is well explained.
I have a big dataset with 80+ columns, and some of these columns only have data on a weekly basis. I would like to transform these columns to have values on a daily basis by simply dividing the weekly value by 7 and assigning the result to that day and to the 6 other days of that week.
This is what my input dataset looks like:
date col1 col2 col3
02-09-2019 14 NaN 1
09-09-2019 NaN NaN 2
16-09-2019 NaN 7 3
23-09-2019 NaN NaN 4
30-09-2019 NaN NaN 5
07-10-2019 NaN NaN 6
14-10-2019 NaN NaN 7
21-10-2019 21 NaN 8
28-10-2019 NaN NaN 9
04-11-2019 NaN 14 10
11-11-2019 NaN NaN 11
..
This is what the output should look like:
date col1 col2 col3
02-09-2019 2 NaN 1
09-09-2019 2 NaN 2
16-09-2019 2 1 3
23-09-2019 2 1 4
30-09-2019 2 1 5
07-10-2019 2 1 6
14-10-2019 2 1 7
21-10-2019 3 1 8
28-10-2019 3 1 9
04-11-2019 3 2 10
11-11-2019 3 2 11
..
I can't come up with a solution, but here is what I thought might work:
def convert_to_daily(df):
    for column in df.columns.tolist():
        if df[column].isna().any():  # column has missing values
            for line in range(len(df[column])):
                # check if the value is not empty and is followed by
                # 6 empty values, or some better logic
                # I don't know how to do that.
                pass
Upvotes: 2
Views: 294
Reputation: 862541
I believe you need to select the columns that contain at least one missing value, forward fill the missing values, and divide by 7:
m = df.isna().any()
df.loc[:, m] = df.loc[:, m].ffill(limit=7).div(7)
print(df)
date col1 col2 col3
0 02-09-2019 2.0 NaN 1
1 09-09-2019 2.0 NaN 2
2 16-09-2019 2.0 1.0 3
3 23-09-2019 2.0 1.0 4
4 30-09-2019 2.0 1.0 5
5 07-10-2019 2.0 1.0 6
6 14-10-2019 2.0 1.0 7
7 21-10-2019 3.0 1.0 8
8 28-10-2019 3.0 1.0 9
9 04-11-2019 3.0 2.0 10
10 11-11-2019 3.0 2.0 11
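For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question. The function name `spread_weekly` and its `period` parameter are hypothetical names introduced only for this illustration. It also uses `limit=period - 1` instead of `limit=7`, on the assumption that each value should cover itself plus the 6 following rows, so longer gaps stay `NaN`.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample data rebuilt from the question
df = pd.DataFrame({
    "date": ["02-09-2019", "09-09-2019", "16-09-2019", "23-09-2019",
             "30-09-2019", "07-10-2019", "14-10-2019", "21-10-2019",
             "28-10-2019", "04-11-2019", "11-11-2019"],
    "col1": [14, nan, nan, nan, nan, nan, nan, 21, nan, nan, nan],
    "col2": [nan, nan, 7, nan, nan, nan, nan, nan, nan, 14, nan],
    "col3": list(range(1, 12)),
})


def spread_weekly(df, period=7):
    """Hypothetical helper: spread each sparse value over `period` rows."""
    # Boolean Series indexed by column name: True if the column has any NaN
    m = df.isna().any()
    sparse_cols = m[m].index

    out = df.copy()  # leave the caller's frame untouched
    # Forward fill at most period - 1 rows (assumption: the value itself
    # plus the 6 rows after it), then divide the filled values by the period
    out[sparse_cols] = out[sparse_cols].ffill(limit=period - 1).div(period)
    return out


result = spread_weekly(df)
print(result)
```

Columns without any missing value, such as `date` and `col3`, are left as they are, and leading `NaN` values (the first two rows of `col2`) stay `NaN` because there is nothing earlier to fill from.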
Upvotes: 1