Youssef Prince

Reputation: 1

Output does not include the original variable name

When I assign values to a variable using the assign function and then retrieve them with the get function, the output does not include the original variable name. Here are the details of the problem:

Consider

z=data.frame(x=c(1,2,3))

Save the column names of the data frame:

names=names(z)

The data frame contains only one name:

names[1]

Now I assign the values of the first column of the data frame to a variable with that name:

assign(names[1],z[,1])

The problem is that when I pass the result of the get function to a test, the output does not include the original name from the data frame:

shapiro.test(get(names[1]))

and what I get as output is:

    Shapiro-Wilk normality test

data: get(names[1])

W = 1, p-value = 1

Upvotes: 0

Views: 275

Answers (1)

G. Grothendieck

Reputation: 269431

data.frame

This will create a one-column data frame for each column in z:

for(nm in names(z)) assign(nm, z[nm])

vector

Either of these will create a vector for each column in z:

for(nm in names(z)) assign(nm, z[[nm]])

list2env(z, .GlobalEnv)
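As a hedged variation on the above, the same vectors can be created in a separate environment instead of the global one, which avoids cluttering the workspace. This is only a minimal sketch; the environment name e is an assumption made for illustration and is not part of the original answer.

```r
z <- data.frame(x = c(1, 2, 3))

# 'e' is a hypothetical scratch environment used only for this example
e <- new.env()

# a data frame is a list of columns, so list2env creates one variable per column
list2env(z, envir = e)

ls(e)                   # names of the variables created in e
get("x", envir = e)     # the values of column x as a plain vector
```

The variables can then be looked up with get(nm, envir = e) without touching the global environment.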

attach

This will create an entry on the search list that contains a vector for each column in z, without putting them in the global environment, so you can then refer to just x.

attach(z)

You can examine the search list using:

search()

and detach z when finished using:

detach("z")

Subscripting

Note that none of the above is normally considered good programming practice; it is better to just refer to z$x, z[["x"]] or z["x"], depending on what you want.

with, within, transform

Another thing you can do is use with, within or transform. The first one returns a vector equal to 2 * x and the next two each return a data frame in which x has been doubled.

with(z, 2 * x)

within(z, x <- 2 * x)
transform(z, x = 2 * x)
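To make the difference in return types concrete, here is a small sketch using the same z as in the question. The result names v, d1 and d2 are assumptions introduced only for this example.

```r
z <- data.frame(x = c(1, 2, 3))

v  <- with(z, 2 * x)            # evaluates the expression using z's columns
d1 <- within(z, x <- 2 * x)     # returns a modified copy of z
d2 <- transform(z, x = 2 * x)   # also returns a modified copy of z

is.numeric(v)        # v is a plain vector
is.data.frame(d1)    # d1 is a data frame
is.data.frame(d2)    # d2 is a data frame
z$x                  # z itself is left unchanged
```

None of the three modifies z in place, so the result has to be assigned if it is needed later.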

shapiro.test

This will run shapiro.test on each column of z, although the data: line in the output won't look so nice (it shows X[[i]] rather than the column name):

lapply(z, shapiro.test)

Alternatively, this can be used to force a nicer-looking data: line (if you knew there were only one column in z, it could be written my.shapiro.test(names(z), z)):

my.shapiro.test <- function(nm, z) with(z, do.call("shapiro.test", list(as.name(nm))))
lapply(names(z), my.shapiro.test, z)

Or one can use the first approach and then fix up the data names afterwards, resulting in shorter code but with the disadvantage that it mucks with the internal structure that shapiro.test returns. (If you knew that z had only one column, it could be written as replace(shapiro.test(z[[1]]), "data.name", names(z)).)

L <- lapply(z, shapiro.test)
for(nm in names(L)) L[[nm]]$data.name <- nm

The extra complexities associated with shapiro.test are due to the fact that it determines the name using an unevaluated version of the argument. Most functions in R don't do that (although the tidyverse packages are a notable exception) so you won't have such complexities.
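The mechanism can be sketched with a small helper that labels its argument the same general way, by deparsing the unevaluated expression. The function arg_label is hypothetical and written only for illustration; it is not part of shapiro.test or of the original answer.

```r
# Hypothetical helper: returns the text of the expression passed as x,
# without ever evaluating it
arg_label <- function(x) deparse(substitute(x))

z <- data.frame(x = c(1, 2, 3))
x <- z$x

arg_label(x)           # the label is the symbol that was typed
arg_label(get("x"))    # the label is the whole get(...) call, not "x"
arg_label(z[[1]])      # the label is the subscripting expression

# Passing a symbol through do.call builds the call shapiro.test(x),
# so the test sees a plain variable name
res <- do.call("shapiro.test", list(as.name("x")))
res$data.name
```

This is why shapiro.test(get(names[1])) in the question reports get(names[1]): the label comes from the text of the argument, not from the value it evaluates to.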

Upvotes: 4
