Reputation: 759
I am trying to use transform
with an anonymous function (x -> uppercase.(x)
) and store the new column as "A"
by specifying a target column name (:A
).
If I don't specify a target column variable (first transformation below), the new variable is produced fine (i.e. a Vector with 5 elements). However, once I specify the target column (second transformation below), the function returns a Vector of Pairs under the "a_function"
name.
How can I produce the desired DataFrame with a new column "A"
containing a Vector with 5 elements ("A" to "E")? Why does the second transformation below return a Vector of Pairs with a name different from that specifyed?
using DataFrames
df_1 = DataFrame(a = ["a", "b", "c", "d", "e"])
df_2 = transform(df_1, :a => x -> uppercase.(x)) # first transformation
df_2
Row │ a a_function
│ String String
─────┼────────────────────
1 │ a A
2 │ b B
3 │ c C
4 │ d D
5 │ e E
df_3 = transform(df_1, :a => x -> uppercase.(x) => :A) # second transformation
df_3
5×2 DataFrame
Row │ a a_function
│ String Pair…
─────┼───────────────────────────────────────
1 │ a ["A", "B", "C", "D", "E"]=>:A
2 │ b ["A", "B", "C", "D", "E"]=>:A
3 │ c ["A", "B", "C", "D", "E"]=>:A
4 │ d ["A", "B", "C", "D", "E"]=>:A
5 │ e ["A", "B", "C", "D", "E"]=>:A
Desired outcome DataFrame:
DataFrame(a = ["a", "b", "c", "d", "e"],
A = ["A", "B", "C", "D", "E"])
Upvotes: 3
Views: 489
Reputation: 69819
The reason is operator precedence, if you write:
julia> :a => x -> uppercase.(x) => :A
:a => var"#7#8"()
you see that you have defined only one pair. The whole part uppercase.(x) => :A
became the body of your anonymous function.
Instead write (note I added (
and )
around the anonymous function):
julia> :a => (x -> uppercase.(x)) => :A
:a => (var"#9#10"() => :A)
to get what you wanted:
julia> df_3 = transform(df_1, :a => (x -> uppercase.(x)) => :A)
5×2 DataFrame
Row │ a A
│ String String
─────┼────────────────
1 │ a A
2 │ b B
3 │ c C
4 │ d D
5 │ e E
In this case a more standard way to write it would be:
julia> transform(df_1, :a => ByRow(uppercase) => :A)
5×2 DataFrame
Row │ a A
│ String String
─────┼────────────────
1 │ a A
2 │ b B
3 │ c C
4 │ d D
5 │ e E
or even:
julia> transform(df_1, :a => ByRow(uppercase) => uppercase)
5×2 DataFrame
Row │ a A
│ String String
─────┼────────────────
1 │ a A
2 │ b B
3 │ c C
4 │ d D
5 │ e E
The last form is new in DataFrames.jl 1.3, which allows you to pass a function as a destination column name specifier (in this case the transformation was to uppercase the source column name). Of course in this case it is longer, but it is sometimes useful if you define transformations programmatically.
Upvotes: 4