Remove special characters from column headers

Question

I have a dictionary (data_final) of dataframes (health, education, economy, ...). The dataframes contain data from one xlsx file. In one of the dataframes (economy), the column names have brackets and single quotes added to them.

data_final['economy'].columns = 
Index([                                ('Sr.No.',),
                                 ('DistrictName',),
                                  ('Agriculture',),
                            ('Forestry& Logging',),
                                      ('Fishing',),
                            ('Mining &Quarrying',),
                            ('ManufacturingMFG.',),
                               ('RegisteredMFG.',),
                                 ('Unregd. MFG.',),
                   ('Electricity,Gas & W.supply',),
                                 ('Construction',),
                    ('Trade,Hotels& Restaurants',),
                                     ('Railways',),
                      ('Transportby other means',),
                                      ('Storage',),
                                ('Communication',),
                           ('Banking &Insurance',),
       ('Real, Ownership of Dwel. B.Ser.& Legal',),
                         ('PublicAdministration',),
                                ('OtherServices',),
                                     ('TotalDDP',),
                           ('Population(In '00)',),
                        ('Per CapitaIncome(Rs.)',)],
      dtype='object')

I cannot reference any column using

data_final['economy']['('Construction',)']

gives error -

SyntaxError: invalid syntax

I tried to use replace to remove the brackets -

data_final['economy'].columns = pd.DataFrame(data_final['economy'].columns).replace("(","",regex=True)

But this does not fix the column names. How can I remove all these special characters from the column names?

Ed. · Accepted Answer

It looks as though your column names are being imported/created as one-element tuples rather than strings. What happens if you try to reference them without the brackets but with a trailing comma, like so

data_final['economy']['Construction',]

or even with the brackets

data_final['economy'][('Construction',)]
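
If the labels do turn out to be one-element tuples, one possible way to get plain string names is to unpack each tuple and reassign the columns. The following is only a sketch under that assumption: the small `economy` frame and its values are made up to stand in for data_final['economy'], and `tupleize_cols=False` is used only to reproduce an object Index of tuples like the one in the question.

```python
import pandas as pd

# Hypothetical stand-in for data_final['economy'] (values are invented).
economy = pd.DataFrame([[1, 2.5, 3.5]])

# Reproduce the situation from the question: every label is a 1-tuple.
# tupleize_cols=False keeps the tuples as objects instead of a MultiIndex.
economy.columns = pd.Index(
    [("Sr.No.",), ("Construction",), ("Railways",)],
    tupleize_cols=False,
)

# Unpack each 1-tuple into its single string; leave other labels untouched.
economy.columns = [
    col[0] if isinstance(col, tuple) else col for col in economy.columns
]

# The columns can now be referenced by their plain string names.
construction = economy["Construction"]
```

After this, the brackets and quotes seen in the printed Index should be gone, because they were never characters in the names, only the way Python displays a tuple. That would also explain why a string replace on "(" has nothing to remove.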
