Felix
Felix

Reputation: 1579

Pandas New Column Calculation Based on Existing Columns Values

source table

Please see above my source table:

I am interesting to calculate new column "Group" based on list of interest lots:

List of interest

Value of column "Group" based on condition if lots in source table in column "Lot" exist in lots of interest. If this is not true, value in "Group"column will be copied from "LOT_VIRTUAL_LINE" cell

Desired output:

enter image description here

Upvotes: 0

Views: 1813

Answers (2)

Alexander
Alexander

Reputation: 109526

Because this question is tagged Pandas, I assume we are talking dataframes and series instead of plain lists. You can use loc to locate the rows and columns that match you criteria (e.g. whether each element in the LOT column isin the series of lots of interest).

df = pd.DataFrame({'LOT': ['A1111', 'A2222', 'A3333', 'B1111', 'B2222', 'B3333'], 
                   'LOT_VIRTUAL_LINE': ['AAA'] * 3 + ['BBB'] * 3})
s = pd.Series(['A1111', 'B2222'], name='Lots Of Interest')
# or... df2 = pd.read_csv('file_path/file_name.csv')

# Value of 'GROUP' defaults to 'LOT_VIRTUAL_LINE'.
df['GROUP'] = df.LOT_VIRTUAL_LINE

# But gets overwritten by 'LOT' if it is in the 'Lots of Interest' series.
mask = df.LOT.isin(s)
# or... mask = df.LOT.isin(df2['Lots of Interest'])  # Whatever the column name is.
df.loc[mask, 'GROUP'] = df.loc[mask, 'LOT']

# Confirm results.
>>> df
     LOT LOT_VIRTUAL_LINE  GROUP
0  A1111              AAA  A1111
1  A2222              AAA    AAA
2  A3333              AAA    AAA
3  B1111              BBB    BBB
4  B2222              BBB  B2222
5  B3333              BBB    BBB

Upvotes: 1

Stefan
Stefan

Reputation: 42875

Assuming we have a list named lots_of_interest, for instance as the result of read_csv(path).loc[:, 'lots_of_interest'].tolist():

df['Group'] = df.apply(lambda x: x['LOT'] if x['LOT'].isin(lots_of_interest) else x['LOT_VIRTUAL_LINE'], axis=1)

Upvotes: 1

Related Questions