Suppose I have a dataframe looking something like this:
  col1 col2 col3 col4
0    A    B    F    O
1    A         G    Q
2    A    C    G    P
3    A         H
4    A    D    I
5    A    D    I
6    A         J    U
7    A    E         J
How can I shift the values in each row to the left wherever a column value is empty, so that the result looks like this?
  col1 col2 col3 col4
0    A    B    F    O
1    A    G    Q
2    A    C    G    P
3    A    H
4    A    D    I
5    A    D    I
6    A    J    U
7    A    E    J
I thought I could check the current column and, if its value is empty, take the value from the next column and then set that next column's value to empty.
for col in df.columns:
    df[col] = np.where((df[col] == ''), df[f'col{int(col[-1])+1}'], df[col])
    df[f'col{int(col[-1])+1}'] = np.where((df[col] == ''), '', df[col])
But this fails somewhere. A sample df is below.
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A','A','A','A','A','A','A','A'],
        'col2': ['B','','C','','D','D','','E'],
        'col3': ['F','G','G','H','I','I','J',''],
        'col4': ['O','Q','P','','','','U','J']
    }
)
Upvotes: 1
Replace the empty strings with NaN:
df = df.replace('', np.nan)
Apply dropna row-wise:
odf = df.apply(lambda x: pd.Series(x.dropna().values), axis=1)
To retain column names,
odf.columns = df.columns
NOTE: It is always good to represent missing data with NaN
Output
  col1 col2 col3 col4
0    A    B    F    O
1    A    G    Q  NaN
2    A    C    G    P
3    A    H  NaN  NaN
4    A    D    I  NaN
5    A    D    I  NaN
6    A    J    U  NaN
7    A    E    J  NaN
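Putting the steps above together, here is a minimal, self-contained sketch using the sample df from the question. The names nan_df, shifted, and result are illustrative, and converting NaN back to '' at the end is an assumption made only to match the format in the question. The reindex call is a small guard (not part of the steps above) for the case where no row has a value in every column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)

# Treat empty strings as missing values.
nan_df = df.replace('', np.nan)

# For each row, keep only the non-missing values, in their original order.
shifted = nan_df.apply(lambda row: pd.Series(row.dropna().values), axis=1)

# Guard: make sure all positional columns exist, then restore the names.
shifted = shifted.reindex(columns=list(range(df.shape[1])))
shifted.columns = df.columns

# Assumption: turn NaN back into '' to match the question's format.
result = shifted.fillna('')
print(result)
```

Since apply with axis=1 runs a Python function per row, this is likely to be slow on large frames; it is shown here for readability rather than speed.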
Upvotes: 3
You can count the number of empty values in each column, sort the columns by that count, and then reorder the dataframe accordingly.
counts = {}
for col in df.columns.to_list():
    counts[col] = (df[col] == '').sum()  # Based on the example you have provided.
# Then sort the dictionary based on counts.
counts = dict(sorted(counts.items(), key=lambda item: item[1]))
# Assign back to the dataframe.
df = df[[*counts.keys()]]
df
  col1 col3 col2 col4
0    A    F    B    O
1    A    G         Q
2    A    G    C    P
3    A    H
4    A    I    D
5    A    I    D
6    A    J         U
7    A         E    J
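The same column reordering can be sketched without an explicit loop. This is a minimal sketch, assuming the sample df from the question; the names empty_counts, ordered, and reordered are illustrative. Because it moves whole columns rather than values within a row, a row such as index 1 still has an empty cell between G and Q.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)

# Number of empty strings per column.
empty_counts = (df == '').sum()

# Column names ordered by ascending empty count; a stable sort keeps
# the original order for columns with equal counts.
ordered = empty_counts.sort_values(kind='mergesort').index

# Reorder the columns; the values inside each row are not shifted.
reordered = df[ordered]
print(reordered)
```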
Upvotes: 0
One way is to use np.argsort:
s = df.to_numpy()
orders = np.argsort(s=='', axis=1, kind='mergesort')
df[:] = s[np.arange(len(s))[:,None],orders]
Output:
  col1 col2 col3 col4
0    A    B    F    O
1    A    G    Q
2    A    C    G    P
3    A    H
4    A    D    I
5    A    D    I
6    A    J    U
7    A    E    J
Note: A very similar approach can be found in this question.
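To show why the argsort approach works, here it is wrapped in a small function. This is a sketch, not a definitive implementation: shift_left is a hypothetical helper name, and it uses np.take_along_axis in place of the fancy indexing above, which should give the same result. Sorting the boolean mask puts False (non-empty) before True (empty) in each row, and a stable sort keeps the original left-to-right order within each group.

```python
import numpy as np
import pandas as pd


def shift_left(frame, empty=''):
    """Hypothetical helper: move `empty` cells to the end of each row."""
    values = frame.to_numpy()
    # False (non-empty) sorts before True (empty); kind='stable'
    # preserves the original order of the non-empty values.
    order = np.argsort(values == empty, axis=1, kind='stable')
    # Reorder each row according to its own column order.
    shifted = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame(shifted, index=frame.index, columns=frame.columns)


df = pd.DataFrame(
    {
        'col1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'col2': ['B', '', 'C', '', 'D', 'D', '', 'E'],
        'col3': ['F', 'G', 'G', 'H', 'I', 'I', 'J', ''],
        'col4': ['O', 'Q', 'P', '', '', '', 'U', 'J'],
    }
)

result = shift_left(df)
print(result)
```

Unlike the in-place assignment with df[:], this version returns a new dataframe and leaves the input untouched.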
Upvotes: 2