Reputation: 1847
This is not strictly a problem, just something I ran into by accident, but I find it really intriguing.
I ran the following line in my console:
sc_matrix <- data.frame(sc_start<-rpois(n=15, 0.4), sc_end<-rpois(n=15, 0.3))
and I was really surprised that the output was
head(sc_matrix, n=5)
# sc_start....rpois.n...15..0.4. sc_end....rpois.n...15..0.3.
#1 0 1
#2 0 2
#3 0 0
#4 1 1
#5 0 0
First, I was surprised that the interpreter accepted this without even a warning. The data.frame was created even though I used the <- assignment inside the data.frame constructor.
Second, the column names seem to be created by the rule "change every non-alphanumeric character into . (dot) and use the result as the name".
After reading the discussion comparing the assignment operators, I guess my question is:
How does R handle that line of code? Since there is no = operator, does it evaluate each argument, e.g. sc_start<-rpois(n=15, 0.4), create a column name from it, and use the value of the right-hand side as the column?
It seems tricky, since I assumed the <- operator does not return any value, so I would have guessed that the created data.frame should contain something like NULL.
I will appreciate any comments on this.
Upvotes: 4
Views: 111
Reputation: 12937
In your example, sc_start <- rpois(n=15, 0.4) actually assigns the result of rpois(n=15, 0.4) to the variable sc_start. The same holds for sc_end <- rpois(n=15, 0.3).
After creating the data frame, you will notice that those variables are created and placed in your global environment.
What you do is basically the same as
data.frame(rpois(n=15, 0.4), rpois(n=15, 0.3))
in which the column names are not specified explicitly, so R creates them automatically unless fix.empty.names is set to FALSE. The only difference is that you also keep each column's values in a variable, namely sc_start and sc_end.
Check the result of
data.frame(x = sc_start <- rpois(n=15, 0.4), y = sc_end <- rpois(n=15, 0.3))
You will notice that the column names are x and y due to the = operator, while sc_start and sc_end are in your global environment due to the <- operator.
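A minimal sketch of that check, assuming it is run at the top level of a fresh session; the seed and the name df are arbitrary choices for illustration and are not part of the original example:

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

# '=' names the columns; '<-' assigns in the calling environment as a side effect
df <- data.frame(x = sc_start <- rpois(n = 15, 0.4),
                 y = sc_end   <- rpois(n = 15, 0.3))

names(df)                    # column names come from the '=' side
exists("sc_start")           # the variable was created by '<-'
identical(df$x, sc_start)    # the column and the variable hold the same values
identical(df$y, sc_end)
```

Each argument is evaluated only once, so the column and the variable share the same random draw.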
Upvotes: 2
Reputation: 132676
sc_matrix <- data.frame(sc_start<-rpois(n=15, 0.4), sc_end<-rpois(n=15, 0.3))
To understand what happens here, you need to know that, like almost everything in R (except data objects), <- is actually a function. You can even do things like `<-`(a, 1). This function has an invisible return value, which is the RHS of the assignment (see help("<-")), i.e., your assumption that <- returns nothing is wrong.
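A short sketch of how one might confirm this in the console, using base R's withVisible() to expose the invisible return value; the variable names a, b, u, v and res are just placeholders:

```r
# Calling the assignment operator as an ordinary function: same as a <- 1
`<-`(a, 1)

# withVisible() returns both the value of an expression and its visibility flag
res <- withVisible(b <- 2)
res$value     # the RHS of the assignment
res$visible   # FALSE: the value is returned, just not auto-printed

# Chained assignment only works because the inner call returns its RHS
u <- v <- 3
```

This is why the data.frame columns contain the Poisson draws rather than NULL: each argument evaluates to the value that was assigned.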
If you don't pass column names to data.frame (as the LHS of =), it uses substitute to capture the unevaluated argument expressions and deparses them to create names. These names are sanitized with make.names if check.names = TRUE, the default. What you observe is essentially the same as if you do something like data.frame(1), which yields a column named X1.
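The naming step can be approximated outside data.frame. This is a simplified sketch of the idea, not the actual internals; expr, raw_name and clean_name are illustrative names:

```r
# Capture the argument as an unevaluated expression, as substitute() would
expr <- quote(sc_start <- rpois(n = 15, 0.4))

# Turn the expression back into text; this is the raw (unsanitized) name
raw_name <- deparse(expr)
raw_name

# make.names() replaces every character that is invalid in a name with "."
clean_name <- make.names(raw_name)
clean_name

# With check.names = FALSE the deparsed text is kept as the column name
names(data.frame(sc_start <- 1:3, check.names = FALSE))

# A bare constant gets the same treatment: "1" is not a valid name
names(data.frame(1))
```

The sanitized string matches the header shown in the question, dots included.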
Upvotes: 4