Sai

Reputation: 87

Inserting a new column into an existing pandas DataFrame

I'm working on a machine learning assignment in which I go over a bug database, do a multi-class classification, and then insert a new column with the classified text. While debugging, when I run that particular cell again, pandas says the column already exists. Is there a way to get around this (other than the usual exception handling)?

The piece of code that I have written is as follows:

trigger_dict = {
    'Config-Change': ['change', 'changing', 'changed'],
    'Upgrade-Downgrade': ['Upgrade', 'Downgrade', 'ISSU'],
    'VPC-Related': ['MCT', 'MCEC', 'VPC'],
    'CLI-Related': ['CC', 'Consistency', 'Checker', 'Show', 'Debug', 'Clear'],
    'Interface-Flap': ['Flap', 'Shut'],
    'Reload-Related': ['reload', 'reboot', 'ASCII', 'Replay'],
    'Process-Related': ['Restart', 'Kill', 'Process'],
    'ACL-Related': ['RACL', 'PACL', 'IFACL'],
    'Config-Unconfig': ['config', 'remove', 'removal', 'Unconfig', 'reconfig'],
    'HA-Related': ['SSO', 'LC', 'Switchover'],
}


cat_1 = pd.Series([], dtype=object)
flag = 0

for index in range(df['Headline'].shape[0]):
    text = df['Headline'][index]
    for key, value in trigger_dict.items():
        for val in value:
            if re.search(val, text, re.I):
                if not flag:
                    cat_1[index] = key
                    flag = 1
    flag = 0
        
df.insert(len(df.columns),"Trigger_Type", cat_1)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-d23348f7bbac> in <module>
     12     flag = 0
     13 
---> 14 df.insert(len(df.columns),"Trigger_Type", cat_1)

~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   3220         value = self._sanitize_column(column, value, broadcast=False)
   3221         self._data.insert(loc, column, value,
-> 3222                           allow_duplicates=allow_duplicates)
   3223 
   3224     def assign(self, **kwargs):

~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/internals.py in insert(self, loc, item, value, allow_duplicates)
   4336         if not allow_duplicates and item in self.items:
   4337             # Should this be a different kind of error??
-> 4338             raise ValueError('cannot insert {}, already exists'.format(item))
   4339 
   4340         if not isinstance(loc, int):

ValueError: cannot insert Trigger_Type, already exists

Upvotes: 1

Views: 2510

Answers (2)

MD. SHIFULLAH

Reputation: 1769

Here I'm providing code with output:

Code:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
}

df = pd.DataFrame(data)

df['Bonus'] = [1000, 1500, 800, 2000, 1200]  # Add a new column 'Bonus' with example values (overwrites it if it already exists)
print("Updated DataFrame:")
print(df)

Output:

Updated DataFrame:
      Name  Age  Salary  Bonus
0    Alice   25   50000   1000
1      Bob   30   60000   1500
2  Charlie   22   45000    800
3    David   35   70000   2000
4    Emily   28   55000   1200
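
One detail that matters for the question's cat_1: when the right-hand side is a Series rather than a list, pandas aligns it on the index, so rows with no entry in the Series end up as NaN instead of raising a length error. A minimal sketch, assuming a made-up three-row frame (the data and the sparse Series below are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical stand-in for the bug database.
df = pd.DataFrame({"Headline": ["reload hang", "misc issue", "VPC down"]})

# Sparse Series: only rows 0 and 2 were classified.
cat_1 = pd.Series({0: "Reload-Related", 2: "VPC-Related"})

# Assignment aligns on the index; row 1 has no label and becomes NaN.
df["Trigger_Type"] = cat_1
print(df)
```

Headlines that match none of the keywords would therefore show up as missing values in the new column.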

Upvotes: 0

spo

Reputation: 369

It's not working because the DataFrame already has a column with that name (it was added the first time the cell ran). If you are OK with having duplicate columns, you can pass allow_duplicates=True.

df.insert(len(df.columns),"Trigger_Type", cat_1, allow_duplicates=True)

Otherwise, you will have to rename the column to something else.
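
To see what allow_duplicates=True leaves behind, here is a small sketch. The two-row frame and its values are hypothetical; only the column name and the insert call come from the question.

```python
import pandas as pd

# Hypothetical frame that already has the column from a first run of the cell.
df = pd.DataFrame({"Headline": ["a", "b"], "Trigger_Type": ["x", "y"]})
cat_1 = pd.Series(["x", "y"])

# Second run: the insert succeeds, but the label now appears twice.
df.insert(len(df.columns), "Trigger_Type", cat_1, allow_duplicates=True)

print(list(df.columns))          # ['Headline', 'Trigger_Type', 'Trigger_Type']
print(type(df["Trigger_Type"]))  # a DataFrame with both columns, not a Series
```

Every re-run adds one more copy, and selecting by name no longer returns a single column, so this is probably not what you want while debugging.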

If you want to completely replace the column, you can also use:

df['Trigger_Type'] = cat_1
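
As a rough sketch of why this is safe to re-run: plain assignment creates the column the first time and overwrites it afterwards. If the column position matters, one option is to drop the old column before calling insert. The sample frame below is hypothetical, and the loops just simulate running the cell twice.

```python
import pandas as pd

# Hypothetical data standing in for the bug database.
df = pd.DataFrame({"Headline": ["Reload after config change", "VPC flap"]})
cat_1 = pd.Series(["Reload-Related", "VPC-Related"])

# Option 1: assignment by name; the second run overwrites instead of raising.
for _ in range(2):
    df["Trigger_Type"] = cat_1

# Option 2: keep insert (e.g. to control the position) by dropping first.
for _ in range(2):
    if "Trigger_Type" in df.columns:
        df = df.drop(columns="Trigger_Type")
    df.insert(0, "Trigger_Type", cat_1)

print(list(df.columns))  # ['Trigger_Type', 'Headline']
```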

Upvotes: 1
