pkpto39

Reputation: 545

Add New column not containing value in previous column

I've seen several solutions to this, but for some reason I just can't get this working.

I have data that looks like this.

data = {'Col1': ['Apples', 'Bananna, Pear', 'Suns, Moons, Stars', 'Orange, Blue, Green']}
df = pd.DataFrame(data)

                  Col1                 Col2
0               Apples               Apples
1        Bananna, Pear        Bananna, Pear
2   Suns, Moons, Stars   Suns, Moons, Stars
3  Orange, Blue, Green  Orange, Blue, Green

And I would like it to look like this.

EDIT: Explain logic of col2

The goal is that if a value in Col1 does not contain a comma (,), I'd like to copy that value into Col2. Eventually, I'll fill the None values down with the last value present in Col2.

goal_data = {'Col1': ['Apples', 'Bananna, Pear', 'Suns, Moons, Stars', 'Orange, Blue, Green'],
             'Col2': ['Apples', None, None, None]}

goal_df = pd.DataFrame(goal_data)
                  Col1    Col2
0               Apples  Apples
1        Bananna, Pear    None
2   Suns, Moons, Stars    None
3  Orange, Blue, Green    None

I've tried a few variations of this for loop

for i in df['Col1']:
    if ',' in i:
        None
    else:
        df['Col2'] = df['Col1']

But I keep running into issues where the entire 'col1' is either copied, or everything in col2 is set to None or Apples.

In my actual data, I have thousands of rows that repeat this pattern. When I run different versions of this, it usually sets Col2 to whatever the last Col1 value without a comma was.

Upvotes: 0

Views: 40

Answers (1)

kelvt

Reputation: 1038

Let's try:

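# Return None for values that contain a comma, otherwise keep the value
f = lambda x: None if ',' in x else x

# Series.apply calls f once per element and takes no axis argument
df['Col2'] = df['Col1'].apply(f)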
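
If you'd rather avoid a per-row Python function, here is a rough vectorized sketch of the same idea using `str.contains` and `Series.where`. It assumes every value in Col1 is a string, and it leaves NaN (not None) in the rows that contain a comma. The `has_comma` variable and the `Col2_filled` column are names I made up for illustration; the last line is only a guess at the "fill down" step mentioned in the question.

```python
import pandas as pd

data = {'Col1': ['Apples', 'Bananna, Pear', 'Suns, Moons, Stars', 'Orange, Blue, Green']}
df = pd.DataFrame(data)

# True where Col1 contains a literal comma (regex=False treats ',' as plain text)
has_comma = df['Col1'].str.contains(',', regex=False)

# Keep the Col1 value where there is no comma; other rows become missing (NaN)
df['Col2'] = df['Col1'].where(~has_comma)

# Optional follow-up (hypothetical column name): fill missing values down
df['Col2_filled'] = df['Col2'].ffill()

print(df)
```

The original loop fails because `df['Col2'] = df['Col1']` assigns the whole column at once rather than a single row, so a row-wise mask like the one above is usually easier to reason about.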

Upvotes: 1
