Reputation: 33
In case I have a dataframe as
Index values
1 2
2 4
3 3
4 nan
5 nan
6 nan
7 9
8 7
9 8
10 nan
11 5
12 1
13 6
How can I set new columns in a new dataframe for each vector with values different than nan? The solution would something like the following:
Index values_1 values_2 values_3
1 2 9 5
2 4 7 1
3 3 8 6
Upvotes: 1
Views: 81
Reputation: 16683
cumsum()
of values that are null to separate into groups, which is how you create columns. This works because it returns False
for non-null rows, so when you take the cumsum()
, you get the first group all as group 0
, and the other groups will be separated into different group IDs as well.dropna()
(df['col1'].ne(df['col1'].shift())).cumsum()
df1.groupby('col1').cumcount() + 1
, in order fror the rows to be on the same line.pivot
defining an index
and columns
.# df1 = df1.drop('Index', axis=1) #include this line of code if "Index" is a column. Otherwise, omit.
df['col1'] = (df['values'].isnull().cumsum())
df = df.dropna()
df['col1'] = (df['col1'].ne(df['col1'].shift())).cumsum()
df['col2'] = df.groupby('col1').cumcount()
df = (df.pivot(index='col2', columns='col1').reset_index(drop=True)
.add_prefix('values_').droplevel(0, axis=1))
# can also use # df.columns = [f'{col[0]}_{col[1]}' for col in df.columns] # instead of # .add_prefix('values_').droplevel(0, axis=1))
df
Out[1]:
values_1 values_2 values_3
0 2.0 9.0 5.0
1 4.0 7.0 1.0
2 3.0 8.0 6.0
Upvotes: 2
Reputation: 23099
we can use a boolean to create a new index then use unstack
and add_prefix
df['key'] = ((df['values'].isna()==True) &
(df['values'].shift(1).isna()==False)).cumsum() + 1
df = df.dropna()
df1 = df.set_index([df.groupby(['key']).cumcount(),'key'])\
.unstack(1)\
.add_prefix('values_')\
.droplevel(0,1)
print(df1)
key values_1 values_2 values_3
0 2.0 9.0 5.0
1 4.0 7.0 1.0
2 3.0 8.0 6.0
Upvotes: 2