MT32

Reputation: 677

Writing a while loop over two sets of data in R

This is probably simple, but I'm new to R and it doesn't work like GrADS, so I've been searching high and low for examples, to no avail.

I have two sets of data. Data A (1997) and Data B (2000)
Data A has 35 headings (apples, orange, grape etc). 200 observations.
Data B has 35 headings (apples, orange, grape, etc). 200 observations.

The only difference between the two datasets is the year.

So I would like to correlate the two datasets column by column, i.e. the 200 values under Apples (1997) vs. the 200 values under Apples (2000). Each heading should give me exactly one value.

I've converted all the header names to V1,V2,V3...

So now I need to do this:

x<-1

while(x<35) { 

   new(x)=cor(1997$V(x),2000$V(x))

   print(new(x))

}

and then I get this error:

Error in pptn26$V(x) : attempt to apply non-function.

Any advice is highly appreciated!

Upvotes: 0

Views: 57

Answers (1)

De Novo

Reputation: 7620

Your error comes directly from using parentheses where R isn't expecting them. You'll get the same type of error if you do 1(x). 1 is not a function, so putting parentheses right after it means you're attempting to apply a non-function. The same thing happens with pptn26$V(x): R reads it as a call to whatever pptn26$V returns, and that is not a function.
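If you do want to build a column name such as V1 from a counter, one way is to paste the name together as a string and look it up with [[ ]]. This is only a minimal sketch: A and B here are small made-up data frames standing in for your 1997 and 2000 data, and col and r are names chosen for illustration.

```r
# Hypothetical stand-ins for the 1997 and 2000 data frames
set.seed(1)
A <- data.frame(V1 = rnorm(5), V2 = rnorm(5))
B <- data.frame(V1 = rnorm(5), V2 = rnorm(5))

x <- 1
col <- paste0("V", x)   # builds the string "V1"

# A$V(x) would be parsed as a function call on A$V, which is not a function.
# [[ ]] instead looks the column up by the name stored in col.
r <- cor(A[[col]], B[[col]])
print(r)
```

The [[ ]] operator accepts a character string (or a column position), which is what makes it usable with a loop counter where $ is not.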

I'm also a bit surprised at how you are managing to get all the way to that error, before running into several others, but I suppose that has something to do with when R evaluates what...

Here's how to get the behavior you're looking for:

mapply(cor, A, B)
# provided A is the name of your 1997 data frame and B the 2000

Here's an example with simulated data:

set.seed(123)
A <- data.frame(x = 1:10, y = sample(10), z = rnorm(10))
B <- data.frame(x = 4:13, y = sample(10), z = rnorm(10))
mapply(cor, A, B)
#         x          y          z 
# 1.0000000  0.1393939 -0.2402058 

In its typical usage, mapply takes an n-ary function and n objects that provide the n arguments for that function. Here the n-ary function is cor, and the objects are A and B, each a data frame. A data frame is structured as a list of vectors, one per column. So mapply will loop along your columns for you, making 35 calls to cor, each time with the next column of both A and B.
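To make the looping explicit, here is a rough sketch of an equivalent for loop. The data frames below are small made-up examples (not the simulated data above), and res is just an illustrative name for the result vector.

```r
# Small hypothetical data frames with matching column names
A <- data.frame(x = 1:10, y = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3))
B <- data.frame(x = 4:13, y = c(2, 7, 1, 8, 2, 8, 1, 8, 2, 8))

# Pre-allocate one slot per column and carry the column names over
res <- numeric(ncol(A))
names(res) <- names(A)

# seq_along() on a data frame runs over its columns
for (i in seq_along(A)) {
  res[i] <- cor(A[[i]], B[[i]])   # column i of A against column i of B
}
print(res)
```

This should give the same named vector as mapply(cor, A, B); the mapply version just hides the bookkeeping.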

If you have managed to figure out how to name your data frames 1997 and 2000, kudos. It's not easy to do that. It's also going to cause you headaches. You'll want to have a syntactically valid name for your data frame(s). That means they should start with a letter (or a dot, but really a letter). See the R FAQ for the details.
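As a small illustration of the naming point, base R's make.names() shows what a syntactically valid version of a name looks like, and backticks are what you would need to use a name like 1997 directly. The variable names valid and n below are only for this sketch.

```r
# make.names() converts arbitrary strings into syntactically valid names
valid <- make.names(c("1997", "2000"))
print(valid)

# A non-syntactic name only works when wrapped in backticks every time it is used
`1997` <- data.frame(V1 = 1:3)
n <- nrow(`1997`)
```

Naming the data frames something like d1997 and d2000 avoids the backticks entirely.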

Upvotes: 1
