I recently stumbled upon the .assign() DataFrame method and love how cleanly it expresses creating new columns. It's very intuitive for creating columns that are functions of other columns and objects. However, assigning a string scalar returns NaN for the entire column. This seems consistent with the documentation, which says the method takes keyword arguments with a callable or Series as values, but even when I use a lambda to wrap the string in a function, it returns a column of NaN values.
str_scalar = "Hello"
df = df.assign(str_scalar_col = str_scalar)
# column str_scalar_col is all NaN
df = df.assign(str_scalar_col = lambda x: str_scalar)
# column str_scalar_col is still all NaN
Maybe this has to do with the type of the column created by default?
Normally I would just assign the column in place, but I'm curious whether .assign() can assign a string scalar column.
df['str_scalar'] = "Hello"
# column str_scalar is all "Hello"
Upvotes: 0
Views: 55
Reputation: 260865
A zero-argument lambda such as
df = df.assign(str_scalar_col = lambda: str_scalar)
can't work, because pandas calls the function with the DataFrame as its argument. You should get a TypeError.
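A minimal sketch of that failure, assuming a small example DataFrame with a single column named col (the frame itself is made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col": range(5)})  # hypothetical example data
str_scalar = "Hello"

try:
    # pandas calls the lambda as func(df), but this lambda accepts no arguments
    df.assign(str_scalar_col=lambda: str_scalar)
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")
```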
The correct syntax to use a function/lambda would be:
df.assign(str_scalar_col = lambda x: str_scalar)
   col str_scalar_col
0    0          Hello
1    1          Hello
2    2          Hello
3    3          Hello
4    4          Hello
If you want the output to vary per row, you can access the DataFrame's columns inside the lambda:
df.assign(str_scalar_col = lambda x: str_scalar + x['col'].astype(str))
   col str_scalar_col
0    0         Hello0
1    1         Hello1
2    2         Hello2
3    3         Hello3
4    4         Hello4
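For reference, here is a self-contained sketch that puts the pieces together, again assuming a made-up single-column frame. It also passes the string directly: assign simply assigns non-callable values, so a plain scalar should be broadcast to every row, the same way df['str_scalar'] = "Hello" does.

```python
import pandas as pd

df = pd.DataFrame({"col": range(5)})  # hypothetical example data
str_scalar = "Hello"

# Non-callable value: the scalar is broadcast to every row
direct = df.assign(str_scalar_col=str_scalar)

# Callable value: pandas passes the DataFrame in as x
constant = df.assign(str_scalar_col=lambda x: str_scalar)

# Callable that combines the scalar with an existing column
variable = df.assign(str_scalar_col=lambda x: str_scalar + x["col"].astype(str))

print(direct)
print(constant)
print(variable)
```

Note that assign returns a new DataFrame each time, so df itself is left unchanged unless you rebind it.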
Upvotes: 0