Harpal

Reputation: 12587

For loop behaviour in R

I'm trying to use a for loop and an if statement in my R code, as below.

It produces a density plot, and I am then trying to add a vertical line to the density which is coloured by species. 

When I print(organism) within the for loop, it prints the Species column 4 times. Why does it not print it just once? Is it cycling through my data 4 times?

The code manages to add some red lines to the density plot, but why is it not adding the remaining coloured lines for the other species?

dat <- structure(list(pdb = structure(1:13, .Label = c("1akk.pdb", "1fi7.pdb", 
"1fi9.pdb", "1giw.pdb", "1hrc.pdb", "1i5t.pdb", "1j3s0.10.pdb", 
"1j3s0.11.pdb", "1j3s0.12.pdb", "1j3s0.13.pdb", "1j3s0.14.pdb", 
"2aiu.pdb", "2b4z.pdb"), class = "factor"), PA = c(1128, 1143, 
1119, 1130, 1055, 1112, 1120, 1121, 1135, 1102, 1121, 1037, 1179
), EHSS = c(1424, 1439, 1404, 1423, 1318, 1403, 1412, 1415, 1432, 
1391, 1413, 1299, 1441), Species = structure(c(2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 1L), .Label = c("BOSTAURUS", 
"EQUUSCABALLUS", "HOMOSAPIENS", "MUSMUSCULUS"), class = "factor")), .Names = c("pdb", 
"PA", "EHSS", "Species"), class = "data.frame", row.names = c(NA, 
-13L))

den.PA <- density(dat$PA)
plot(den.PA)

for (i in 1:length(dat)){
    lineat = dat$PA[i]
    organism = dat$Species[i]
    lineheight <- den.PA$y[which.min(abs(den.PA$x - lineat))]
    print (organism)
    if (organism == 'EQUUSCABALLUS'){
        col = 'red'
    }
    if (organism == 'HOMOSAPIENS'){
        col = 'blue'
    }
    if (organism == 'MUSMUSCULUS'){
        col = 'green'
    }
    if (organism == 'BOSTAURUS'){
        col = 'purple'
    }
    lines(c(lineat, lineat), c(0, lineheight), col = col)
}

Upvotes: 1

Views: 217

Answers (3)

Carl Witthoft

Reputation: 21532

Side answer: you'll get smoother code if you replace all the if statements with a single switch construction.

From the help page:

 centre <- function(x, type) {
     switch(type,
            mean = mean(x),
            median = median(x),
            trimmed = mean(x, trim = .1))
 }

One of the nice things here is that you can put as much as you want on the right-hand side of any case. e.g. in the example I posted, you could do:
mean = {cat("this is the mean"); y = mean(x)*sin(z); plot(z, y)}

as a silly example.
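
Applied to the question's colour selection, the idea might look like the sketch below. The helper name species_colour and the "black" fallback are assumptions for illustration, not part of the original code.

```r
# Hypothetical helper: map a species name to a plotting colour.
species_colour <- function(organism) {
  # as.character() matters here: given a factor, switch() would use the
  # integer code (with a warning) instead of matching the label.
  switch(as.character(organism),
         EQUUSCABALLUS = "red",
         HOMOSAPIENS   = "blue",
         MUSMUSCULUS   = "green",
         BOSTAURUS     = "purple",
         "black")  # assumed fallback for any species not listed
}

species_colour("HOMOSAPIENS")
species_colour(factor("BOSTAURUS"))
```

Inside the loop this would replace the four if blocks with a single col <- species_colour(organism).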

Upvotes: 1

Aaron - mostly inactive

Reputation: 37824

For a more "R-ish" solution, try using match, approx, and points with the histogram type.

den.PA <- density(dat$PA)
cols <- data.frame(Species=c('EQUUSCABALLUS', 'HOMOSAPIENS', 'MUSMUSCULUS', 'BOSTAURUS'),
                   col=c('red', 'blue', 'green', 'purple'), stringsAsFactors=FALSE)
plot(den.PA)
points(approx(den.PA$x, den.PA$y, dat$PA), type="h", 
       col=cols$col[match(dat$Species, cols$Species)])
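
To see what the match() lookup is doing on its own, here is a minimal sketch without any plotting. The species vector is made up for illustration and stands in for dat$Species.

```r
# Lookup table, as in the answer above
cols <- data.frame(Species = c("EQUUSCABALLUS", "HOMOSAPIENS", "MUSMUSCULUS", "BOSTAURUS"),
                   col = c("red", "blue", "green", "purple"),
                   stringsAsFactors = FALSE)

# Made-up species vector standing in for dat$Species
species <- factor(c("HOMOSAPIENS", "BOSTAURUS", "EQUUSCABALLUS", "HOMOSAPIENS"))

# For each element of `species`, the row of `cols` holding that species
idx <- match(species, cols$Species)

# One colour per element, in the same order as `species`
line_cols <- cols$col[idx]
print(line_cols)
```

Because match() is vectorised, the whole colour vector comes out in one step and no loop or if chain is needed.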

Upvotes: 2

tim riffe

Reputation: 5691

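Taking your dat from dput(), you can change your code to iterate over the rows with nrow(dat) instead of length(dat). The length() of a data frame is its number of columns (4 here), so your loop only ever visited the first 4 rows, which are all EQUUSCABALLUS, hence only red lines. Also convert Species to class character before extracting the element. That way you'll get 13 lines: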

    den.PA <- density(dat$PA)
    plot(den.PA)

    for (i in 1:nrow(dat)){
        lineat <- dat$PA[i]
        organism <- as.character(dat$Species)[i]
        lineheight <- den.PA$y[which.min(abs(den.PA$x - lineat))]
        print (organism)
        if (organism == 'EQUUSCABALLUS'){
            col <- 'red'
        }
        if (organism == 'HOMOSAPIENS'){
            col <- 'blue'
        }
        if (organism == 'MUSMUSCULUS'){
            col <- 'green'
        }
        if (organism == 'BOSTAURUS'){
            col <- 'purple'
        }
        segments(lineat,0,lineat,lineheight,col=col)
    }


Also, I changed lines() to segments(), since each line has just 2 points; that doesn't make much difference, however. I also changed the assignments using = to <-, which hurts the eyes of most R users less.
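
The difference between length() and nrow() on a data frame can be checked directly. This is a minimal sketch on a made-up data frame (toy) with the same shape as dat: 13 rows and 4 columns.

```r
# Stand-in data frame with the same shape as `dat`: 13 rows, 4 columns
toy <- data.frame(pdb = letters[1:13], PA = 1:13, EHSS = 1:13,
                  Species = rep("X", 13))

length(toy)   # number of columns: 4
nrow(toy)     # number of rows: 13

# seq_len(nrow(toy)) is a safe row index; unlike 1:nrow(toy) it yields
# an empty sequence when the data frame has zero rows
for (i in seq_len(nrow(toy))) {
  # one iteration per row
}
```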

Upvotes: 1
