Reputation: 5669
These four ways of creating a data frame look pretty similar to me:
myData1 <- data.frame(a <- c(1,2), b <- c(3, 4))
myData2 <- data.frame(a = c(1,2), b = c(3,4))
myData3 <- data.frame(`<-`(a,c(1,2)), `<-`(b,c(3, 4)))
myData4 <- data.frame(`=`(a,c(1,2)), `=`(b,c(3,4)))
But when I print the column names, I only get the clean names I would hope for when I use the = operator. In all the other cases, the whole expression becomes the column name, with every non-alphanumeric character replaced by a period:
> colnames(myData1)
[1] "a....c.1..2." "b....c.3..4."
> colnames(myData2)
[1] "a" "b"
> colnames(myData3)
[1] "a....c.1..2." "b....c.3..4."
> colnames(myData4)
[1] "a...c.1..2." "b...c.3..4."
I've read about differences between <- and = when used in function calls in terms of variable scope, but as far as I can reason (possibly not very far), that doesn't explain this particular behavior.
= and <- ? = ?
Upvotes: 0
Views: 83
Reputation: 263332
When you offer a <- c(1,2) as an argument to data.frame, there will be a value for the first argument, but there will be no name for it in the argument list. The arguments of the call are processed with as.list. Both a and c(1,2) were passed to <-, which assigns the value to a and returns that value without a name, so no name reaches as.list. You can think of the symbol a as having already been processed and therefore "used up". The default names in that situation are the results of a deparse call, cleaned up by make.names.
> make.names(deparse( quote(a <- c(1,2) )) )
[1] "a....c.1..2."
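To make the naming step concrete, here is a minimal sketch of how unevaluated arguments can be captured and named. It is only loosely modeled on what data.frame does and is not its actual implementation; the helper show_arg_names is hypothetical and not part of base R.

```r
# Hypothetical helper (not part of base R): report the name each
# argument would end up with if unnamed arguments are deparsed.
show_arg_names <- function(...) {
  # Capture the arguments as unevaluated expressions.
  args <- as.list(substitute(list(...)))[-1L]
  nms <- names(args)
  # names() is NULL when no argument at all was named.
  if (is.null(nms)) nms <- rep("", length(args))
  for (i in seq_along(args)) {
    # An unnamed argument falls back to the text of its expression.
    if (!nzchar(nms[i])) nms[i] <- deparse(args[[i]])[1L]
  }
  # Turn the text into syntactically valid names.
  make.names(nms)
}

result <- show_arg_names(a = c(1, 2), b <- c(3, 4))
print(result)
```

The first argument carries the name a because = matched it to a name in the call; the second has no name, so its whole expression is deparsed and passed through make.names.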
Upvotes: 2
Reputation: 330083
When you call a function, including data.frame, = is not used as an assignment operator. It simply matches the value you pass to the named parameter. Setting aside data.frame(a = c(1,2), b = c(3,4)), in each of the other calls <- and = are interpreted as ordinary assignments and create the variables a and b in your environment.
> ls()
character(0)
> myData1 <- data.frame(a <- c(1,2), b <- c(3, 4))
> ls()
[1] "a" "b" "myData1"
> rm(list=ls())
> ls()
character(0)
> myData3 <- data.frame(`<-`(a,c(1,2)), `<-`(b,c(3, 4)))
> ls()
[1] "a" "b" "myData3"
> rm(list=ls())
> ls()
character(0)
> myData4 <- data.frame(`=`(a,c(1,2)), `=`(b,c(3,4)))
> ls()
[1] "a" "b" "myData4"
The data frame gets the expected values only because <- and = invisibly return the assigned value.
> foo <- `=`(a,c(1,2))
> foo
[1] 1 2
Because of that, your data.frame calls are equivalent, ignoring the variable-assignment side effect and the text used for the column names, to
> data.frame(c(1,2), c(3, 4))
  c.1..2. c.3..4.
1       1       3
2       2       4
hence the results you see.
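As a rough check of the contrast with the named-argument form, the sketch below evaluates both styles in a scratch environment. The names env, d1, d2, before and after are introduced here purely for illustration.

```r
# Scratch environment, so the side effects are easy to inspect.
env <- new.env()

# Named arguments: = only matches values to names; nothing is assigned in env.
d1 <- eval(quote(data.frame(a = c(1, 2), b = c(3, 4))), env)
before <- ls(env)

# Assignments as arguments: a and b are created in env as a side effect.
d2 <- eval(quote(data.frame(a <- c(1, 2), b <- c(3, 4))), env)
after <- ls(env)

print(before)
print(after)
print(names(d1))
print(names(d2))

# The column values agree; only the column names differ.
same_values <- identical(unname(as.list(d1)), unname(as.list(d2)))
print(same_values)
```

If the assignment side effect is not wanted, the named form data.frame(a = ..., b = ...) is the one that leaves the environment untouched.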
Upvotes: 2