Reputation: 545
I've seen several solutions to this, but for some reason I just can't get this working.
I have data that looks like this.
import pandas as pd

data = {'Col1': ['Apples', 'Bananna, Pear', 'Suns, Moons, Stars', 'Orange, Blue, Green']}
df = pd.DataFrame(data)
   Col1                 Col2
0  Apples               Apples
1  Bananna, Pear        Bananna, Pear
2  Suns, Moons, Stars   Suns, Moons, Stars
3  Orange, Blue, Green  Orange, Blue, Green
And I would like it to look like this.
EDIT: Explain logic of col2
The goal is that if Col1 does not have a comma in it (,) then I'd like to add that value to Col2. Eventually, I'll fill the None values down with what is in Col2.
goal_data = {'Col1': ['Apples', 'Bananna, Pear', 'Suns, Moons, Stars', 'Orange, Blue, Green'],
             'Col2': ['Apples', None, None, None]}
goal_df = pd.DataFrame(goal_data)
   Col1                 Col2
0  Apples               Apples
1  Bananna, Pear        None
2  Suns, Moons, Stars   None
3  Orange, Blue, Green  None
I've tried a few variations of this for loop
for i in df['Col1']:
    if ',' in i:
        None
    else:
        df['Col2'] = df['Col1']
But I keep running into issues where the entire Col1 is either copied, or everything in Col2 is set to None or Apples.
In my actual data, I have thousands of rows that repeat this pattern. When I run different versions of this, it will usually set Col2 to whatever the last Col1 value was that did not have a comma in it.
Upvotes: 0
Views: 40
Reputation: 1038
Let's try:
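# Return None when the value contains a comma, otherwise keep the value
f = lambda x: None if ',' in x else x
# Series.apply takes no axis argument; it calls f once per element
df['Col2'] = df['Col1'].apply(f)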
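If you'd rather avoid apply, a vectorized version might look something like the sketch below. It assumes every value in Col1 is a string (no missing values), and it leaves NaN rather than None in the rows that contain a comma. The Col2_filled column name is just a placeholder for illustrating the fill-down step mentioned in the question.

```python
import pandas as pd

df = pd.DataFrame({'Col1': ['Apples', 'Bananna, Pear',
                            'Suns, Moons, Stars', 'Orange, Blue, Green']})

# True for rows whose Col1 contains a literal comma
# (regex=False treats ',' as plain text, not a pattern)
has_comma = df['Col1'].str.contains(',', regex=False)

# Keep the Col1 value where there is no comma; other rows become missing (NaN)
df['Col2'] = df['Col1'].where(~has_comma)

# Optional fill-down step; 'Col2_filled' is a hypothetical column name
df['Col2_filled'] = df['Col2'].ffill()
```

This also shows why the loop in the question misbehaves: df['Col2'] = df['Col1'] assigns the whole column on every pass rather than a single row, so a row-wise mask is needed instead.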
Upvotes: 1