Reputation: 95
I have a table:
   Name1                               Name2  Name3
0  ABC                                 FGD    NNY
1  111S PC 1T Trees are always yellow  NaN    NaN
2  P                                   FGD    NNY
3  JJJ                                 FGD    NNY
4  111S PC 1T Trees are always yellow  NaN    NaN
5  ABC                                 FGD    NNY
6  UIK                                 GJ     DE
and I want to get this:
   Name1  Name2  Name3  Name4
0  ABC    FGD    NNY    NaN
1  111S   PC     1T     Trees are always yellow
2  P      FGD    NNY    NaN
3  JJJ    FGD    NNY    NaN
4  111S   PC     1T     Trees are always yellow
5  ABC    FGD    NNY    NaN
6  UIK    GJ     DE     NaN
I need to split only some rows; the other rows should not change. I was able to find the rows where the data needs to be split:
# isnull must be called; the bare attribute is always truthy
if df[colname1].isnull().any():
    df_index = df[df[colname1].isnull()].index
    print(df_index)
Now I need to separate the values in those strings. So far I have something like this:
if df[colname1].isnull().any():
    df_index = df[df[colname1].isnull()].index
    print(df_index)
    for i in df_index:
        print(i)
        # split the packed string of row i into a list of tokens
        df1 = df.loc[i, colname].split(' ')
df1 is a list of strings with the information I need, but I don't know how to put it back into the DataFrame df at the right index. Could you help me with this?
Upvotes: 0
Views: 36
Reputation: 3855
IIUC, you have a double whitespace delimiting your columns and a single whitespace inside your sentences. You can use that to perform the split.
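import numpy as np

# rows where everything was packed into Name1
idx = df.loc[df.Name2.isnull()].index
df['Name4'] = np.nan
# split on the double space so the single-spaced sentence stays in one piece
df.loc[idx] = df.loc[idx].Name1.str.split('  ', expand=True).values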
   Name1  Name2  Name3  Name4
0  ABC    FGD    NNY    NaN
1  111S   PC     1T     Trees are always yellow
2  P      FGD    NNY    NaN
3  JJJ    FGD    NNY    NaN
4  111S   PC     1T     Trees are always yellow
5  ABC    FGD    NNY    NaN
6  UIK    GJ     DE     NaN
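For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is made up for illustration, and it assumes the packed rows really do use two spaces between fields. It also assigns by boolean mask and explicit column list rather than by index, and creates Name4 as an object column, which is a small variation on the snippet above and not the only way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: packed rows use two spaces between fields
# and single spaces inside the sentence.
df = pd.DataFrame({
    "Name1": ["ABC", "111S  PC  1T  Trees are always yellow", "P"],
    "Name2": ["FGD", np.nan, "FGD"],
    "Name3": ["NNY", np.nan, "NNY"],
})

# Rows that need splitting are the ones where Name2 is missing.
mask = df["Name2"].isnull()

# New column filled with None (object dtype), so strings can be written into it.
df["Name4"] = None

# Split only the flagged rows on the double space into four columns.
parts = df.loc[mask, "Name1"].str.split("  ", expand=True)

# Write the pieces back into the flagged rows; other rows are untouched.
cols = ["Name1", "Name2", "Name3", "Name4"]
df.loc[mask, cols] = parts.to_numpy()

print(df)
```

If the real data has a different number of fields in the packed rows, the column list has to be adjusted to match the width of `parts`.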
Upvotes: 0
Reputation: 323316
Using str.split with n:
s=df.fillna('').apply(' '.join,1)
s.str.split(' ',n=3)
Out[189]:
0 [ABC, FGD, NNY]
1 [111S, PC, 1T, Trees are always yellow ]
2 [P, FGD, NNY]
3 [JJJ, FGD, NNY]
4 [111S, PC, 1T, Trees are always yellow ]
5 [ABC, FGD, NNY]
6 [UIK, GJ, DE]
dtype: object
pd.DataFrame(s.str.split(' ',n=3).tolist())
Out[190]:
   0     1    2    3
0  ABC   FGD  NNY  None
1  111S  PC   1T   Trees are always yellow
2  P     FGD  NNY  None
3  JJJ   FGD  NNY  None
4  111S  PC   1T   Trees are always yellow
5  ABC   FGD  NNY  None
6  UIK   GJ   DE   None
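To get from that result to the frame in the question, the columns still need names, and the trailing spaces left by the empty cells can be stripped before splitting. A rough sketch, assuming single spaces between fields and a small made-up sample frame; `expand=True` is used here instead of `tolist()` and should give the same layout:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with single spaces between the packed fields.
df = pd.DataFrame({
    "Name1": ["ABC", "111S PC 1T Trees are always yellow", "UIK"],
    "Name2": ["FGD", np.nan, "GJ"],
    "Name3": ["NNY", np.nan, "DE"],
})

# Join every row into one string; strip() removes the trailing spaces
# that the empty (formerly NaN) cells leave behind.
s = df.fillna("").apply(" ".join, axis=1).str.strip()

# At most 3 splits, so everything after the third field stays together.
out = s.str.split(" ", n=3, expand=True)

# This works only when at least one row yields four pieces.
out.columns = ["Name1", "Name2", "Name3", "Name4"]

print(out)
```

Rows with only three fields end up with a missing value in Name4.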
Upvotes: 1