AbdulazizAlghamdi
AbdulazizAlghamdi

Reputation: 3

how to find specific values in a column

I have this dataframe:

    yellow_col green_col  red_col blue_col 
    
    yellow      none       none    none
    
    none        green      none    none
    
    none        none       red     none

I want to find and place the values in a new column like this:

 

    yellow_col green_col  red_col blue_col  color
    
    yellow      none       none    none     yellow
    
    none        green      none    none     green
    
    none        none       red     none     red

I tried many ways and this is the last

    for i in range(len(df_arch_copy)):
        x=df_arch_copy.query('red_col=="red"or green_col=="green"or blue_col=="blue"or yellow_col=="yellow"').index
        df_arch_copy.color.insert(x,inplace=True)

Upvotes: 0

Views: 75

Answers (2)

Mahmoud Aly
Mahmoud Aly

Reputation: 751

Try this:

import pandas as pd
colors=[]
for row in df.values:
  for color in row:
    if color!='none':
      colors.append(color)
df['color']=colors
print(df)     
  yellow_col green_col red_col blue_col   color
0     yellow      none    none     none  yellow
1       none     green    none     none   green
2       none      none     red     none     red

Upvotes: 0

Gene Burinsky
Gene Burinsky

Reputation: 10223

Method 1: use the .fillna() method on the color column.

# make dataframe (copied the one in the question)
df = pd.read_clipboard()

# replace Nones with NAs
df.replace('none', np.nan,inplace=True)

# initiate color column
df.loc[:,'color'] = df.loc[:,'yellow_col']

# fill up the remaining values from the other columns
for col in ['red', 'green', 'blue']:
      df.color.fillna(df[col+'_col'],inplace=True)

Result:

  yellow_col green_col red_col  blue_col   color
0     yellow       NaN     NaN       NaN  yellow
1        NaN     green     NaN       NaN   green
2        NaN       NaN     red       NaN     red

Method 2: take advantage of pandas' ability to sum (i.e. concatenate) strings.

In this case, convert your columns to strings and simply sum across the rows as follows:

df.loc[:,'color'] = df.astype('str').sum(axis=1).replace('none','',regex=True)

Details: df.astype('str') converts the fields to strings, .sum(axis=1) adds the rows together, and .replace('none','',regex=True) takes out the 'none' from the concatenated strings.

Upvotes: 2

Related Questions