Reputation: 3
I am new to R and to programming in general and am looking for feedback on how to approach what is probably a fairly simple problem in R.
I have the following dataset:
df <- data.frame(county = rep(c("QU","AN","GY"), 3),
park = (c("Downtown","Queens", "Oakville","Squirreltown",
"Pinhurst", "GarbagePile","LottaTrees","BigHill",
"Jaynestown")),
hectares = c(12,42,6,18,92,6,4,52,12))
df<-transform(df, parkrank = ave(hectares, county,
FUN = function(x) rank(x, ties.method = "first")))
Which returns a dataframe looking like this:
county park hectares parkrank
1 QU Downtown 12 2
2 AN Queens 42 1
3 GY Oakville 6 1
4 QU Squirreltown 18 3
5 AN Pinhurst 92 3
6 GY GarbagePile 6 2
7 QU LottaTrees 4 1
8 AN BigHill 52 2
9 GY Jaynestown 12 3
I want to use this to create a two-column data frame that lists each county and the park name corresponding to a specific rank (e.g. if when I call my function I add "2" as a variable, shows the second biggest park in each county).
I am very new to R and programming and have spent hours looking over the built in R help files and similar questions here on stack overflow but I am clearly missing something. Can anyone give a simple example of where to begin? It seems like I should be using split then lapply or maybe tapply, but everything I try leaves me very confused :(
Thanks.
Upvotes: 0
Views: 995
Reputation: 1503
Try,
df2 <- function(A,x) {
# A is the name of the data.frame() and x is the rank No
df <- A[A[,4]==x,]
return(df)
}
> df2(df,2)
county park hectares parkrank
1 QU Downtown 12 2
6 GY GarbagePile 6 2
8 AN BigHill 52 2
Upvotes: 1