Reputation: 461
I have this kind of DataFrame in pandas:
NaN
1
NaN
452
1175
12
NaN
NaN
NaN
145
125
NaN
1259
2178
2514
1
On the other hand, I have this other DataFrame:
1
2
3
4
5
6
I would like to split the first one into different sub-DataFrames, like this:
DataFrame 1:
1
DataFrame 2:
452
1175
12
DataFrame 3:
DataFrame 4:
DataFrame 5:
145
125
DataFrame 6:
1259
2178
2514
1
How can I do that without a loop?
Upvotes: 1
Views: 387
Reputation: 210842
UPDATE: thanks to @piRSquared for pointing out that the old answer below will not work for DataFrames/Series with non-numeric indexes. Here is a more generic solution:
dfs = [x.dropna()
       for x in np.split(df, np.arange(len(df))[df['column'].isnull().values])]
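To try the generic idea end to end, here is a minimal, self-contained sketch. It rebuilds the question's data with a made-up string index (an assumption, chosen only to exercise the non-numeric case) and keeps the column name `column` from the old answer. Instead of passing the DataFrame itself to `np.split`, which may emit a deprecation warning on some recent pandas releases, it splits the integer row positions and selects rows with `iloc`. That is a small variation on the one-liner, not a different algorithm.

```python
import numpy as np
import pandas as pd

# Data from the question; the letter index is hypothetical, added to
# show that the approach does not depend on a numeric index.
values = [np.nan, 1, np.nan, 452, 1175, 12, np.nan, np.nan, np.nan,
          145, 125, np.nan, 1259, 2178, 2514, 1]
df = pd.DataFrame({'column': values}, index=list('abcdefghijklmnop'))

# Integer positions of the NaN rows; each one starts a new chunk.
nan_pos = np.flatnonzero(df['column'].isnull().values)

# Split the row positions (not the DataFrame), then pick rows by position
# and drop the NaN separator that leads each chunk.
chunks = np.split(np.arange(len(df)), nan_pos)
dfs = [df.iloc[idx].dropna() for idx in chunks]

for i, part in enumerate(dfs):
    print(i, part['column'].tolist())
```

As in the old answer, consecutive NaN rows produce empty DataFrames, and a NaN in the first row produces an empty leading chunk.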
OLD answer:
IIUC you can do something like this:
Source DF:
In [40]: df
Out[40]:
column
0 NaN
1 1.0
2 NaN
3 452.0
4 1175.0
5 12.0
6 NaN
7 NaN
8 NaN
9 145.0
10 125.0
11 NaN
12 1259.0
13 2178.0
14 2514.0
15 1.0
Solution:
In [31]: dfs = [x.dropna()
                for x in np.split(df, df.index[df['column'].isnull()].values+1)]
In [32]: dfs[0]
Out[32]:
Empty DataFrame
Columns: [column]
Index: []
In [33]: dfs[1]
Out[33]:
column
1 1.0
In [34]: dfs[2]
Out[34]:
column
3 452.0
4 1175.0
5 12.0
In [35]: dfs[3]
Out[35]:
Empty DataFrame
Columns: [column]
Index: []
In [36]: dfs[4]
Out[36]:
Empty DataFrame
Columns: [column]
Index: []
In [38]: dfs[5]
Out[38]:
column
9 145.0
10 125.0
In [39]: dfs[6]
Out[39]:
column
12 1259.0
13 2178.0
14 2514.0
15 1.0
Upvotes: 2
Reputation: 294258
w = np.append(np.where(np.isnan(df.iloc[:, 0].values))[0], len(df))
splits = {'DataFrame{}'.format(c): df.iloc[i+1:j]
          for c, (i, j) in enumerate(zip(w, w[1:]))}
Print out splits to demonstrate:
for k, v in splits.items():
    print(k)
    print(v)
    print()
DataFrame0
0
1 1.0
DataFrame1
0
3 452.0
4 1175.0
5 12.0
DataFrame2
Empty DataFrame
Columns: [0]
Index: []
DataFrame3
Empty DataFrame
Columns: [0]
Index: []
DataFrame4
0
9 145.0
10 125.0
DataFrame5
0
12 1259.0
13 2178.0
14 2514.0
15 1.0
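For anyone who wants to run this answer as-is, here is a self-contained sketch under a few assumptions: the DataFrame is built from a plain list so its single column is labeled `0`, matching the printed output, and `np.flatnonzero` on `isnull()` stands in for `np.where(np.isnan(...))[0]`. The `sizes` dict is added only for illustration. Note that the boundaries start at the first NaN, so any rows before it would be left out. The sample data begins with NaN, so nothing is lost here.

```python
import numpy as np
import pandas as pd

# Data from the question; a plain list gives a single column labeled 0.
values = [np.nan, 1, np.nan, 452, 1175, 12, np.nan, np.nan, np.nan,
          145, 125, np.nan, 1259, 2178, 2514, 1]
df = pd.DataFrame(values)

# Positions of the NaN rows, plus len(df) as the closing boundary.
w = np.append(np.flatnonzero(df.iloc[:, 0].isnull().values), len(df))

# Each consecutive pair (i, j) of boundaries yields the rows strictly
# between two NaNs: positions i+1 up to, but not including, j.
splits = {'DataFrame{}'.format(c): df.iloc[i + 1:j]
          for c, (i, j) in enumerate(zip(w, w[1:]))}

# Hypothetical helper view: number of rows in each piece.
sizes = {k: len(v) for k, v in splits.items()}
print(sizes)
```

Unlike the `np.split` answer, this produces no empty leading chunk, so the six pieces line up one-to-one with the six DataFrames listed in the question.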
Upvotes: 1