hubsonline

Reputation: 35

R - Looping over Columns then Rows

I was wondering if anyone could help me with a problem I'm having in R. It involves looping over columns and rows. The example below should be clear hopefully. I have a 5x5 table below. Using row 1 as an example, I would like to count the number of times V2:V5 are lower than the value in V1, and express that as a decimal.

set.seed(1)
data=as.data.frame(replicate(5, rnorm(5)))

      V1         V2         V3          V4          V5
 1 -0.6264538 -0.8204684  1.5117812 -0.04493361  0.91897737
 2  0.1836433  0.4874291  0.3898432 -0.01619026  0.78213630
 3 -0.8356286  0.7383247 -0.6212406  0.94383621  0.07456498
 4  1.5952808  0.5757814 -2.2146999  0.82122120 -1.98935170
 5  0.3295078 -0.3053884  1.1249309  0.59390132  0.61982575


test=lapply(2:5,function(a){
ifelse(data[1,1]<=data[1,a],1,0)})
testtable=(as.data.frame(table(unlist(test)))[1,2])/4
testtable
[1] 0.25

This means that in row 1, only 1 of the 4 values in V2:V5 is lower than V1. I'd like to add a second loop so that this is done for each row separately. I tried:

test2=lapply(2:5,function(a){
lapply(1:5,function(b){
ifelse(data[b,1]<=data[a,b],1,0)
(as.data.frame(table(unlist(test)))[1,2])/4})})

Resulting in

[[1]]
[[1]][[1]]
[1] 0.25

[[1]][[2]]
[1] 0.25

[[1]][[3]]
[1] 0.25

[[1]][[4]]
[1] 0.25

[[1]][[5]]
[1] 0.25


[[2]]
[[2]][[1]]
[1] 0.25

And continues like that, just printing out 0.25 as the result for the remainder of the loops. It should produce, ignoring the words in brackets:

(for row 1) 0.25  
(for row 2) 0.25
(for row 3) 0
(for row 4) 1
(for row 5) 0.25

I had a trawl through the archives but couldn't find anything. My actual data has 300+ rows and 10000 columns, but the output I'm trying to achieve is exactly the same. If anyone has any suggestions, that would be very much appreciated. Thanks.

Upvotes: 0

Views: 806

Answers (3)

Ananta

Reputation: 3711

Does this work?

vec<-rowSums(data<data$V1)/4

> vec
[1] 0.25 0.25 0.00 1.00 0.25
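If it helps, here is a small sketch of the same idea that avoids hard-coding the 4, assuming the reference column is always the first one. Dropping V1 before the comparison and taking rowMeans gives the proportion directly. The name prop_lower is only for illustration.

```r
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# Compare every column except the first against V1 (V1 is recycled down
# each column), then average the TRUE/FALSE values row by row.
prop_lower <- rowMeans(data[-1] < data$V1)
unname(prop_lower)
# 0.25 0.25 0.00 1.00 0.25
```

Because the divisor comes from the number of compared columns, the same line should carry over to a wider table without changes.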

Upvotes: 2

nograpes

Reputation: 18323

Very similar to @BrodieG, but perhaps a little clearer:

# Find when each column is less than the first column.
lower.than.first<-sapply(data[2:5],function(x) x<data[,1])
# Count, per row, how many columns are lower than the first.
num.true<-rowSums(lower.than.first) # TRUE is 1, and FALSE is 0, when summing.
# Get the proportion.
props<-num.true/ncol(lower.than.first)
# [1] 0.25 0.25 0.00 1.00 0.25
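For the larger data mentioned in the question (300+ rows, 10000 columns), one option is to work on a matrix instead of a data frame. The sketch below uses made-up data (big is a hypothetical stand-in, and I have not timed it against the data frame version) and cross-checks the vectorised result against an explicit row-by-row loop.

```r
# Hypothetical stand-in for the real data: 300 rows, 1000 columns.
set.seed(1)
big <- matrix(rnorm(300 * 1000), nrow = 300)

# A vector of length nrow(big) is recycled down each column of the matrix,
# so every remaining column is compared with column 1, row by row.
props_big <- rowMeans(big[, -1] < big[, 1])

# Cross-check against an explicit loop over the rows.
props_loop <- apply(big, 1, function(r) mean(r[-1] < r[1]))
all.equal(props_big, props_loop)
```

If the real data is a data frame of numeric columns, as.matrix() on it should give a matrix that can be used the same way.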

Upvotes: 0

BrodieG

Reputation: 52687

You don't need loops. You can take advantage of vectorization:

cat(paste("(for row", 1:nrow(data), ")", 
  rowSums(data[, 1] > data[, 2:5]) / 4),    # this is where it all happens
  sep="\n"
)

Produces:

(for row 1 ) 0.25
(for row 2 ) 0.25
(for row 3 ) 0
(for row 4 ) 1
(for row 5 ) 0.25

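Here we take advantage of how > handles a vector against a data frame: the comparison is done column by column, with the vector recycled down each column, and the result comes back as a logical matrix that rowSums can total.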
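To see what the comparison produces before it is summed, here is a short sketch using the question's data. The name is_lower is only for illustration.

```r
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# TRUE wherever V1 is greater than the value in that column, for the same row.
is_lower <- data[, 1] > data[, 2:5]
is.matrix(is_lower)   # the result is a logical matrix, not a data frame
dim(is_lower)         # 5 rows, 4 compared columns

# Summing the TRUEs per row and dividing by the number of compared columns
# gives the proportion without hard-coding 4.
rowSums(is_lower) / ncol(is_lower)
```

Printing is_lower on its own is a quick way to check that each row is being compared against its own V1 value.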

Upvotes: 3
