mqn

Reputation: 53

vaex column name change

Hi, I'm just getting started with Vaex in Python. I have a dataset with messy column names, and I'm trying to replace the spaces with '_'.

In pandas I can do df.columns = df.columns.str.replace(' ', '_')

but in Vaex

df_column = df.column_names.str.replace('\s', '_', regex=True)

I get the following error


AttributeError Traceback (most recent call last) in ----> 1 df_new = df.column_names.str.replace('\s', '_', regex=True) AttributeError: 'list' object has no attribute 'str'

Does anyone know what I may be doing wrong?

Thanks, Mike

Upvotes: 1

Views: 5354

Answers (1)

Joco

Reputation: 813

In Vaex the columns are in fact "Expressions". Expressions let you build a sort of computational graph behind the scenes while you do your regular dataframe operations. However, that requires the column names to be as "clean" as possible.

So column names like '2' or '2.5' are not allowed, since the expression system can interpret them as numbers rather than column names. Likewise, the expression system can interpret a column name like 'first-name' as df['first'] - df['name'].
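To see why such names are ambiguous, here is a small sketch that uses Python's own parser as a stand-in for the expression system. This is only an illustration of the ambiguity, not how Vaex works internally, and `parsed_kind` is a made-up helper name.

```python
import ast


def parsed_kind(text):
    """Return the name of the syntax node Python builds for `text`."""
    node = ast.parse(text, mode="eval").body
    return type(node).__name__


# 'first_name' parses as a single name, 'first-name' parses as a
# binary operation (first minus name), and '2.5' parses as a numeric
# literal rather than a name.
for candidate in ["first_name", "first-name", "2.5"]:
    print(candidate, "->", parsed_kind(candidate))
```

Only the first candidate is read as a plain identifier, which is why names like the other two need to be rewritten before they can be used in expressions.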

To avoid this, Vaex will automatically rename columns so that they can be used in the expression system. This is actually quite complicated behind the scenes. By the way, you can always access the original names via df.get_column_names(alias=True).

If you want to rename columns, you should use df.rename(name, new_name).
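For the case in the question (replacing whitespace with '_'), a minimal sketch could look like the following. As the error message shows, df.column_names is a plain Python list, so it has no .str accessor, and the new names have to be built with ordinary string tools. The helper name `underscore_column_names` is made up for this example, and I'm assuming df.rename changes the dataframe in place. Depending on your Vaex version, names with spaces may already have been rewritten when the data was loaded, in which case the loop simply finds nothing to rename.

```python
import re


def underscore_column_names(df):
    """Rename every column whose name contains whitespace.

    `df` is expected to be a Vaex dataframe: only
    df.get_column_names() and df.rename(name, new_name) are used.
    """
    # Copy the list first so renaming does not disturb the iteration.
    for old_name in list(df.get_column_names()):
        new_name = re.sub(r"\s", "_", old_name)
        if new_name != old_name:
            df.rename(old_name, new_name)
    return df


# Usage with Vaex (not executed here; requires vaex to be installed):
#   import vaex
#   df = vaex.open("my_file.hdf5")   # hypothetical file name
#   underscore_column_names(df)
#   print(df.get_column_names())
```

Renaming one column at a time goes through df.rename, so any existing expressions that refer to the old name are handled by Vaex rather than by you.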

I hope this helps!

Upvotes: 6
