Reputation: 1
A | B | |
---|---|---|
0 | Abel | 10 |
1 | Ada | 8 |
2 | Beed | 15 |
3 | Calvin | 13 |
4 | NaN | 6 |
5 | NaN | 17 |
6 | NaN | 35 |
7 | NaN | 12 |
8 | NaN | 20 |
9 | NaN | 19 |
I want to fill the cells with NaN values with the Non-NaN values circling from index 0 to index 3.
import pandas as pd
data = {'A': ['Abel', 'Ada', 'Beed', 'Calvin', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'B': [10, 8, 15, 13, 6, 17, 35, 12, 20, 19]}
df = pd.DataFrame(data)
fill_values = df['A'].dropna().tolist()
df['A'] = df['A'].fillna(pd.Series(fill_values * (len(df) // len(fill_values) + 1))[:len(df)])
print(df)
It filled the NaN cells but it didn't start circling from the first index. It filled it with the previous value.
Expected output
A | B | |
---|---|---|
0 | Abel | 10 |
1 | Ada | 8 |
2 | Beed | 15 |
3 | Calvin | 13 |
4 | Abel | 6 |
5 | Ada | 17 |
6 | Beed | 35 |
7 | Calvin | 12 |
8 | Abel | 20 |
9 | Ada | 19 |
Upvotes: 0
Views: 97
Reputation: 25393
Another possible solution, whose steps are:
df['A'].dropna()
removes any NaN
entries from column A
, and .to_list()
converts the remaining values into a list.
len(df)
calculates the total number of rows in df
.
(vals * (df_len // len(vals) + 1))
creates a list by repeating the list vals
enough times to cover the entire length of df
.
vals, df_len = df['A'].dropna().to_list(), len(df)
df['A'] = (vals * (df_len // len(vals) + 1))[:df_len]
Output:
A B
0 Abel 10
1 Ada 8
2 Beed 15
3 Calvin 13
4 Abel 6
5 Ada 17
6 Beed 35
7 Calvin 12
8 Abel 20
9 Ada 19
Upvotes: 0
Reputation: 262204
For a pure pandas approach, you could identify the NaNs, and perform a map
after enumerating the NaNs with cumsum
+mod
:
s = df['A'].dropna(ignore_index=True)
m = df['A'].isna()
df.loc[m, 'A'] = m[m].cumsum().sub(1).mod(len(s)).map(s)
Output:
A B
0 Abel 10
1 Ada 8
2 Beed 15
3 Calvin 13
4 Abel 6
5 Ada 17
6 Beed 35
7 Calvin 12
8 Abel 20
9 Ada 19
Intermediates:
# s = df['A'].dropna(ignore_index=True)
0 Abel
1 Ada
2 Beed
3 Calvin
Name: A, dtype: object
# dataframe
A B m cumsum-1 mod filled_A
0 Abel 10 False NaN NaN NaN
1 Ada 8 False NaN NaN NaN
2 Beed 15 False NaN NaN NaN
3 Calvin 13 False NaN NaN NaN
4 NaN 6 True 0.0 0.0 Abel
5 NaN 17 True 1.0 1.0 Ada
6 NaN 35 True 2.0 2.0 Beed
7 NaN 12 True 3.0 3.0 Calvin
8 NaN 20 True 4.0 0.0 Abel
9 NaN 19 True 5.0 1.0 Ada
As suggested in comments, it is also possible to use itertools
:
from itertools import cycle, islice
g = cycle(df['A'].dropna().tolist())
m = df['A'].isna()
df.loc[m, 'A'] = list(islice(g, m.sum()))
# A B
# 0 Abel 10
# 1 Ada 8
# 2 Beed 15
# 3 Calvin 13
# 4 Abel 6
# 5 Ada 17
# 6 Beed 35
# 7 Calvin 12
# 8 Abel 20
# 9 Ada 19
Or numpy.resize
:
m = df['A'].isna()
df.loc[m, 'A'] = np.resize(df.loc[~m, 'A'], m.sum())
# A B
# 0 Abel 10
# 1 Ada 8
# 2 Beed 15
# 3 Calvin 13
# 4 Abel 6
# 5 Ada 17
# 6 Beed 35
# 7 Calvin 12
# 8 Abel 20
# 9 Ada 19
Upvotes: 0