Reputation: 23
I created this DataFrame with pandas:
import numpy as np
import pandas as pd
my_df = pd.DataFrame(
    {'ColumnA': ['Value A', '', 'Value B', '', '', 'Value C', ''],
     'ColumnB': ['', '', '', '', '', '', '']})
The dataframe is presented below:
ColumnA | ColumnB
0 Value A |
1 |
2 Value B |
3 |
4 |
5 Value C |
6 |
To fill ColumnB, I used these conditions with np.select:
conditions = [
    my_df['ColumnA'] == '',
    my_df['ColumnA'] != '']
result = [
    my_df['ColumnA'].shift(1),
    my_df['ColumnA']]
my_df['ColumnB'] = np.select(conditions, result)
This gives me the following result, where row 4 is still empty:
ColumnA | ColumnB
0 Value A | Value A
1 | Value A
2 Value B | Value B
3 | Value B
4 |
5 Value C | Value C
6 | Value C
Now I want every cell in ColumnB to be filled in, like this:
ColumnA | ColumnB
0 Value A | Value A
1 | Value A
2 Value B | Value B
3 | Value B
4 | Value B
5 Value C | Value C
6 | Value C
Thank you for your suggestions!
Upvotes: 0
Views: 38
Reputation: 13242
Using real NaN values instead of empty strings makes many things easier...
import pandas as pd
import numpy as np
data = {'ColumnA': ['Value A', '', 'Value B', '', '', 'Value C', ''],
        'ColumnB': ['', '', '', '', '', '', '']}
df = pd.DataFrame(data)
# Replace the empty strings with real missing values:
df = df.replace('', np.nan)
# ColumnB should be ColumnA with each gap filled by the last value above it (forward fill).
df['ColumnB'] = df['ColumnA'].ffill()
print(df)
Output:
ColumnA ColumnB
0 Value A Value A
1 NaN Value A
2 Value B Value B
3 NaN Value B
4 NaN Value B
5 Value C Value C
6 NaN Value C
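For context, the np.select attempt leaves row 4 blank because shift(1) only looks one row back, and row 3 is itself empty. If you would rather keep the empty strings in ColumnA untouched, a minimal sketch of one alternative is below. It assumes that '' is the only marker for a missing value, and it uses Series.mask to hide those rows only while ColumnB is being built:

```python
import pandas as pd

df = pd.DataFrame(
    {'ColumnA': ['Value A', '', 'Value B', '', '', 'Value C', '']})

# mask() turns the rows where the condition is True into NaN;
# ffill() then carries the last non-missing value downward.
# ColumnA itself is not modified.
df['ColumnB'] = df['ColumnA'].mask(df['ColumnA'] == '').ffill()

print(df)
```

Here ColumnA keeps its empty strings and only ColumnB gets the forward-filled values. If the first row of ColumnA were empty, ColumnB would stay NaN there, since there is nothing above it to carry down.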
Upvotes: 1