Reputation: 193
Some joker made a Lotus database/applet thingy for tracking engineering issues in our company. The joke is that the key piece of information was named with a special character... a number sign (hash tag, pound sign, \u0023).
abbreviated sample:
KA# Issue Date Current Position
27144 1/9/2014 Accounting
27194 12/20/2012 Engineering
32474 4/21/2008 Engineering
32623-HOLD 4/25/2016 Engineering
32745 11/13/2012 SEPE
32812 10/30/2013 Engineering
32817 12/7/2012 Purchasing
32839 1/8/2013 SEPE
I output this table (4K rows, 15 columns) to a csv file and process in python3 as a pandas dataframe.
I generate various outputs. If I use something like:
df.iloc[:,[0,3,1,8,9,10]]
I get appropriate output and the key column shows up as "KA#"
. (When I say "key column", I mean "most important"... NOT "index". I keep a serial index)
Unfortunately, people sometimes mess with the column order in Lotus between my exports to csv so I can not guarantee that "KA#"
will be any particular column number. I would like to use column names:
df.loc[:,["KA#","Issue Date","Current Position"]]
But the "KA#"
column is filled with NaN's.
Thanks for any help you can offer.
Finally, if I try to rename "KA#"
to simply "KA"
:
df['KA#'].name = 'KA'
throws a KeyError and
df = df.rename(columns={"KA#": "ka"})
is completely ignored. The column shows up as "KA#"
.
Can anyone think of a way to get rid of or handle that symbol? I'd even settle for a regex at this point.
Upvotes: 19
Views: 30920
Reputation: 27
Instead we can use lambda functions for removing special characters in the column like:
df2 = df1.rename(columns=lambda x: x.strip('*'))
Upvotes: 1
Reputation: 7838
use str.replace:
df.columns=df.columns.str.replace('#','')
You can check this in the documentation.
Upvotes: 29