Why does a column name in my DataFrame have symbols next to it?

Question

I am reading in a CSV, but when I take a closer look at the column names there is a weird symbol next to the first column name. Can anyone help me get rid of this symbol?

How the column names look now (not sure what the symbols next to 'year' mean):

['ï»¿year', 'sch', 'city', 'prop_id']

How I want the column names to look:

['year', 'sch', 'city', 'prop_id']

My code so far:

import pandas as pd

path = ('file_path')

cameron_county = pd.read_table(path + '/2016_GCC_prelim_appraisal_info_20160630.txt',
                             encoding = 'latin1',error_bad_lines = False)

print(cameron_county.head(1))
print(cameron_county.columns)

thank you in advance.

EdChum · Accepted Answer

This looks like a Unicode BOM (byte order mark). Try reading the file as UTF-8 instead of latin1:

cameron_county = pd.read_table(path + '/2016_GCC_prelim_appraisal_info_20160630.txt',
                             encoding = 'utf-8',error_bad_lines = False)
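
If you are not sure whether the file really starts with a BOM, you can check its first three bytes before choosing an encoding. The following is only a sketch: pick_encoding is a hypothetical helper name, and the 'latin1' fallback is an assumption taken from the question's code, not something the file itself guarantees.

```python
import codecs


def pick_encoding(file_path, fallback="latin1"):
    """Return 'utf-8-sig' if the file starts with the UTF-8 BOM, else `fallback`.

    Hypothetical helper for illustration; `fallback` is an assumption.
    """
    # Read only as many bytes as the BOM is long (EF BB BF, i.e. 3 bytes).
    with open(file_path, "rb") as fh:
        head = fh.read(len(codecs.BOM_UTF8))
    # 'utf-8-sig' is Python's UTF-8 codec that drops a leading BOM on decode.
    return "utf-8-sig" if head == codecs.BOM_UTF8 else fallback
```

The returned name can then be passed as the encoding argument of pd.read_table. With 'utf-8-sig' the BOM is dropped during decoding, so the first column name should come out as plain 'year'.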

See: https://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding

ï»¿ is how the three bytes of the UTF-8 BOM (hex EF BB BF) appear when they are decoded as CP1252 or latin1, which is what encoding = 'latin1' in your code does.
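
You can reproduce the effect without pandas. This is a small sketch using an in-memory header line made up to mirror the column names in the question, not the actual file:

```python
import codecs

# A made-up header line, prefixed with the UTF-8 BOM bytes EF BB BF.
raw = codecs.BOM_UTF8 + b"year,sch,city,prop_id\n"

# Decoding as latin1 (or cp1252) turns each BOM byte into a visible character.
first_latin1 = raw.decode("latin1").split(",")[0]
first_cp1252 = raw.decode("cp1252").split(",")[0]

# Plain 'utf-8' decodes the BOM to the invisible character U+FEFF;
# 'utf-8-sig' removes it entirely.
first_utf8 = raw.decode("utf-8").split(",")[0]
first_utf8_sig = raw.decode("utf-8-sig").split(",")[0]

print(repr(first_latin1))    # 'ï»¿year'
print(repr(first_utf8))      # '\ufeffyear'
print(repr(first_utf8_sig))  # 'year'
```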
