Reputation: 297

Edit column names that include duplicate special characters

I have some column names that contain two question marks at different positions, e.g. 'how old were you? when you started university?'. I need to identify which columns have two question marks in their names. Any tips welcome! Thanks.

data

import pandas as pd
df = pd.DataFrame(data={'id': [1, 2, 3, 4, 5], 'how old were you? when you started university?': [1,2,3,4,5], 'how old were you when you finished university?': [1,2,3,4,5], 'at what age? did you start your first job?': [1,2,3,4,5]})

expected output

df1 = pd.DataFrame(data={'id': [1, 2, 3, 4, 5], 'how old were you when you finished university?': [1,2,3,4,5]})

Upvotes: 1

Answers (4)

not_speshal

Reputation: 23146

If you want to get all columns that have more than one question mark, you can use the following:

[c for c in df.columns if c.count("?")>1]
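To check this against the sample frame from the question, here is a minimal sketch. The variable name `multi_q` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(data={
    'id': [1, 2, 3, 4, 5],
    'how old were you? when you started university?': [1, 2, 3, 4, 5],
    'how old were you when you finished university?': [1, 2, 3, 4, 5],
    'at what age? did you start your first job?': [1, 2, 3, 4, 5],
})

# Python's str.count matches the literal "?" (no regex), so no escaping is needed.
multi_q = [c for c in df.columns if c.count("?") > 1]
print(multi_q)
# ['how old were you? when you started university?', 'at what age? did you start your first job?']
```

Only the two names with more than one "?" come back; 'id' and the "finished university" column are left out.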

Edit: If you want to remove the extra "?" but keep the ending "?", use this:

df = df.rename(columns={c: c.replace("?", "") + "?" for c in df.columns if "?" in c})
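A rough sketch of what the rename does on the question's sample data. Two things here are my own assumptions rather than part of the answer: the result is stored in a new variable `renamed` (since `rename` returns a new DataFrame instead of modifying `df`), and the filter is written as `"?" in c` so that a name starting with "?" would also be picked up.

```python
import pandas as pd

df = pd.DataFrame(data={
    'id': [1, 2, 3, 4, 5],
    'how old were you? when you started university?': [1, 2, 3, 4, 5],
    'how old were you when you finished university?': [1, 2, 3, 4, 5],
    'at what age? did you start your first job?': [1, 2, 3, 4, 5],
})

# Strip every "?" from the name, then put a single "?" back at the end.
# Columns without any "?" (here: 'id') are not in the mapping and stay unchanged.
mapping = {c: c.replace("?", "") + "?" for c in df.columns if "?" in c}
renamed = df.rename(columns=mapping)  # returns a new frame; df itself is untouched

print(list(renamed.columns))
# ['id', 'how old were you when you started university?',
#  'how old were you when you finished university?',
#  'at what age did you start your first job?']
```

Note that this keeps all four columns and only cleans the names, which is different from the expected output in the question, where the double-"?" columns are dropped.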

Upvotes: 2

Andrej Kesely

Reputation: 195448

You can use boolean indexing on the columns (`.str.count` takes a regular expression, so the "?" has to be escaped):

x = df.loc[:, df.columns.str.count(r"\?") < 2]
print(x)

Prints:

   id  how old were you when you finished university?
0   1                                               1
1   2                                               2
2   3                                               3
3   4                                               4
4   5                                               5
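To see why this works, it can help to look at the intermediate values. This is a small sketch on the question's sample data; the names `counts` and `mask` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(data={
    'id': [1, 2, 3, 4, 5],
    'how old were you? when you started university?': [1, 2, 3, 4, 5],
    'how old were you when you finished university?': [1, 2, 3, 4, 5],
    'at what age? did you start your first job?': [1, 2, 3, 4, 5],
})

# Index.str.count interprets its argument as a regex, hence the escaped "\?".
counts = df.columns.str.count(r"\?")
mask = counts < 2  # one boolean per column

print(counts.tolist())  # [0, 2, 1, 2]
print(mask.tolist())    # [True, False, True, False]

# Keep all rows (":") and only the columns where the mask is True.
x = df.loc[:, mask]
print(list(x.columns))  # ['id', 'how old were you when you finished university?']
```

Flipping the comparison to `counts >= 2` would select the double-"?" columns instead of dropping them.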

Upvotes: 3

Anurag Dhadse

Reputation: 1873

df = df.drop([col for col in df.columns if col.count("?")>1], axis=1)

Upvotes: 1

jezrael

Reputation: 862761

One idea is to use a list comprehension:

df = df[[c for c in df.columns if c.count("?") < 2]]
print(df)
   id  how old were you when you finished university?
0   1                                               1
1   2                                               2
2   3                                               3
3   4                                               4
4   5                                               5

Upvotes: 4
