Reputation: 342

how to replace blank space with "_" on every element's list - Python

I imported a .csv file with this command:

 mydata = pd.read_csv(file ,sep='\t' , engine='python' , dtype = {'Day' : np.datetime64 , 'Year' : np.int}  )

But i noticed than some of the column name has blank spaces like Account id instead of Account_id

Now i got the list of my columns name with this:

dwb_col= data.columns

And i'd like to replace blank spaces " " with "_" sign on every column name (i.e. every dwb_col element).

in order to rename the columns in this way: mydata.columns = [my_new_columns_list]

How i can do the find and replace part?
Is there any workaround/shortcut during the importing fase that let me collect the column name with "_"(underscore sign) over the " " (space) ?

Upvotes: 7

Answers (3)

Reputation: 23099

Another way would be using regex \s+ which will match 1 or more white spaces whilst ' ' will only match one

dwb_col = df.columns.str.replace('\s+', '_')

then just re-assign

df.columns = dwb_col

if you have trailing or leading white space you want to remove first you can add a str.strip

df.columns.str.strip().str.replace('\s+', '_')

regarding number 2 you can import your file and use the nrows argument to only collect the top n rows to gather the column names.

col_df = pd.read_csv(data,nrows=1)
cols = [col for col in col_df.columns.tolist() if '_' in col]

then read your data with usecols

df = pd.read_csv(data,usecols=cols)

Upvotes: 4

Reputation: 962

This will do, using str.replace:

df.columns = df.columns.str.replace(" ", "_")

Upvotes: 6

Reputation: 4130

Try this,assume your column names are like this

l = ["hello world","hello cat"]
cols = ['_'.join(i.split()) for i in l]
#outout
['hello_world', 'hello_cat']

Upvotes: 3