Mark K

Reputation: 9348

Python, Pandas: adding columns from a calculation

I have a data frame like this, and I am adding some columns to it from a mapping and a calculation.

   code month of entry   name  reports
0    JJ       20171002  Jason       14
1    MM       20171206  Molly       24
2    TT       20171208   Tina       31
3    JJ       20171018   Jake       22
4    AA       20090506    Amy       34
5    DD       20171128  Daisy       16
6    RR       20101216  River       47
7    KK       20171230   Kate       32
8    DD       20171115  David       14
9    JJ       20171030   Jack       10
10   NN       20171216  Nancy       28

The code below selects some rows, looks up their values in a dictionary, and inserts a further column from a simple calculation. It works fine:

import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy', 'Daisy', 'River', 'Kate', 'David', 'Jack', 'Nancy'], 
    'code' : ['JJ', 'MM', 'TT', 'JJ', 'AA', 'DD', 'RR', 'KK', 'DD', 'JJ', 'NN'],
    'month of entry': ["20171002", "20171206", "20171208", "20171018", "20090506", "20171128", "20101216", "20171230", "20171115", "20171030", "20171216"],
    'reports': [14, 24, 31, 22, 34, 16, 47, 32, 14, 10, 28]}
df = pd.DataFrame(data)

dict_hour = {'JasonJJ' : 3, 'MollyMM' : 6, 'TinaTT' : 2, 'JakeJJ' : 3, 'AmyAA' : 8, 'DaisyDD' : 6, 'RiverRR' : 4, 'KateKK' : 8, 'DavidDD' : 5, 'JackJJ' : 5, 'NancyNN' : 2}

wanted = ['JasonJJ', 'TinaTT', 'AmyAA', 'DaisyDD', 'KateKK']

df['name_code'] = df['name'].astype(str) + df['code'].astype(str)

df1 = df[df['name_code'].isin(wanted)]

df1['hour'] = df1['name_code'].map(dict_hour).astype(float)

df1['coefficient'] = df1['reports'] / df1['hour'] - 1

But each of the last two lines raises the same warning:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

How can the code be improved to avoid it? Thank you.

Upvotes: 1

Views: 84

Answers (1)

jezrael

Reputation: 862601

You need to take an explicit copy with .copy():

df1 = df[df['name_code'].isin(wanted)].copy()

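Without .copy(), df1 is a filtered slice of df, and pandas cannot tell whether assigning to it was meant to change df as well, so it warns. With the copy, any values you modify in df1 later do not propagate back to the original data (df), and pandas no longer warns.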
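As a minimal sketch of the fix in context, the snippet below uses a shortened version of the question's data (five rows, an assumption made only to keep the example small) and adds both columns to an explicit copy. The second variant, built with DataFrame.assign, is an alternative not mentioned above: it returns a new DataFrame, so it should avoid the warning as well.

```python
import pandas as pd

# Shortened version of the question's data (illustrative subset).
data = {
    'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
    'code': ['JJ', 'MM', 'TT', 'JJ', 'AA'],
    'reports': [14, 24, 31, 22, 34],
}
df = pd.DataFrame(data)

dict_hour = {'JasonJJ': 3, 'MollyMM': 6, 'TinaTT': 2, 'JakeJJ': 3, 'AmyAA': 8}
wanted = ['JasonJJ', 'TinaTT', 'AmyAA']

df['name_code'] = df['name'] + df['code']
mask = df['name_code'].isin(wanted)

# Variant 1: explicit copy, then add columns to the copy.
df1 = df[mask].copy()
df1['hour'] = df1['name_code'].map(dict_hour).astype(float)
df1['coefficient'] = df1['reports'] / df1['hour'] - 1

# Variant 2: assign() builds a new DataFrame with the extra columns.
# Later keyword arguments may refer to columns created by earlier ones.
df2 = df[mask].assign(
    hour=lambda d: d['name_code'].map(dict_hour).astype(float),
    coefficient=lambda d: d['reports'] / d['hour'] - 1,
)

print(df1)
```

In both variants the original df is left without the hour and coefficient columns, which matches the point above that changes to the copy do not propagate back.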

Upvotes: 3
