goingdeep

Reputation: 121

creating a matrix/dataframe with two for loops in R

This is my first post on SO, so be kind!

My question is vaguely related to this one: Double for loop in R creating a matrix

I want to create a matrix/dataframe, and the approach I came up with is to nest two for loops: the inner one creates a single row and the outer one repeats it for each row I need.

I could successfully create the first loop, but I can't seem to iterate it for the number of rows I need.

I'm sure there is a better way to do this. Anyway, this is the for loop that gives the result I need for the first row:

x <- character(0)
for(j in 1:18){
    x <- c(x, sum(it_mat[1, 2:26] == j))
}

it_mat is a matrix of 147 rows and 26 columns, where the first column is a string vector with various names and the subsequent columns are randomly generated numbers from 1 to 18.

Here's the first row:

[1,] "Charlie" "14" "3"  "9"  "14" "3"  "9"  "11" "11" "18"  "17"  "16"  "5"   "18"  "6"   "10"  "3"   "9"   "9"   "3"   "18"  "12"  "8"   "5"   "5"  "4"

I want to create a matrix/df that counts, for each name, how many times each number appeared.

The for loop I created above gives me the result I want for the first row:

x
[1] "0" "0" "4" "1" "3" "1" "0" "1" "4" "1" "2" "1" "0" "2" "0" "1" "1" "3"

I can't get it to iterate over the subsequent rows with another for loop. There must be something very mundane that I'm doing wrong.

This is my best attempt:

tr_mat <- matrix(, nrow = 147, ncol = 18)
for(i in 1:147){
    x <- character()
    for(j in 1:18){
        x <- c(x, sum(it_mat[i, 2:26] == j))
    }
    tr_mat <- rbind(tr_mat, x)
}

I worked on it all afternoon and now I give up and reach out to you. Before you give me the correct way to do it, please explain what I'm doing wrong in the nested for loops attempt, so that I might learn something.

I hope I explained myself, sorry if I've been too verbose. Thanks for your time.

Upvotes: 3

Views: 472

Answers (3)

Rui Barradas

Reputation: 76402

Another way, using base R. Note that *apply functions are loops in disguise.

tr_mat2 <- sapply(1:18, function(j) sapply(1:147, function(i) sum(it_mat[i, 2:26] == j)))

Note that this code will produce a matrix of numbers, while your tr_mat is of mode character:

all.equal(tr_mat, tr_mat2)
#[1] "Modes: character, numeric"

DATA.
This is the dataset generation code that I have used to test the code above.

set.seed(7966)    # make the results reproducible
it_mat <- t(replicate(147, c(sample(letters, 1), sample(18, 25, TRUE))))
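A possible alternative to the nested sapply calls, offered only as a sketch: count each row in one pass with tabulate. It assumes the same test data as above and that every value in columns 2 to 26 is an integer from 1 to 18 stored as a string. The name tr_mat3 is made up for this illustration.

```r
set.seed(7966)    # same test data as above
it_mat <- t(replicate(147, c(sample(letters, 1), sample(18, 25, TRUE))))

# For each row, convert the 25 strings to integers and count how often
# each value 1..18 occurs. apply() returns one column per input row,
# so transpose to get one row per name.
tr_mat3 <- t(apply(it_mat[, 2:26], 1, function(r) tabulate(as.integer(r), nbins = 18)))

dim(tr_mat3)
```

Like tr_mat2, this is a matrix of numbers rather than of strings.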

EDIT.
Following MKR's suggestion in the comments, here is the OP's code with the correction I proposed in my comment on the question: assign each row by index instead of appending it with rbind.

tr_mat <- matrix(, nrow = 147, ncol = 18)
for(i in 1:147){
    x <- character()
    for(j in 1:18){
        x <- c(x, sum(it_mat[i, 2:26] == j))
    }
    tr_mat[i, ] <- x
}

This is the code that I used to produce the matrix tr_mat referred to in the all.equal test above.
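If numeric counts are wanted straight from the loop, one option (a sketch, not part of the original code) is to preallocate an integer matrix and assign each cell directly, so no character vector is ever built. The names counts and n_values are hypothetical, and the seed is arbitrary.

```r
set.seed(1)    # arbitrary seed, for illustration only
it_mat <- t(replicate(147, c(sample(letters, 1), sample(18, 25, TRUE))))

n_values <- 18
# Preallocate with integer zeros so the result stays numeric
counts <- matrix(0L, nrow = nrow(it_mat), ncol = n_values)

for (i in seq_len(nrow(it_mat))) {
  for (j in seq_len(n_values)) {
    # Number of times value j appears in row i (columns 2 to 26)
    counts[i, j] <- sum(it_mat[i, 2:26] == j)
  }
}

str(counts)
```

Each row of counts sums to 25, the number of value columns.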

Upvotes: 2

MKR

Reputation: 20085

@RuiBarradas has pinpointed the actual problem in the OP's last attempt. There is another way to fix the OP's code that keeps rbind: start from a matrix with zero rows.

# Do not create rows at this place. Let the rows be added with rbind
tr_mat <- matrix(nrow = 0, ncol = 18)   # was: matrix(, nrow = 147, ncol = 18)
for(i in 1:147){
  x <- character()
  for(j in 1:18){
    x <- c(x, sum(it_mat[i, 2:26] == j))
  }
  tr_mat <- rbind(tr_mat, x)
}

tr_mat      # This will display correct result too
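To see why the OP's version misbehaves, here is a small reproduction on generated data (the seed and the data are assumptions, not the OP's real matrix). The preallocated matrix already holds 147 rows of NA, and rbind appends below them instead of filling them.

```r
set.seed(1)    # arbitrary seed, for illustration only
it_mat <- t(replicate(147, c(sample(letters, 1), sample(18, 25, TRUE))))

tr_mat <- matrix(, nrow = 147, ncol = 18)   # 147 rows of NA exist from the start
for (i in 1:147) {
  x <- character()
  for (j in 1:18) {
    x <- c(x, sum(it_mat[i, 2:26] == j))
  }
  tr_mat <- rbind(tr_mat, x)                # appends row 148, 149, ...
}

dim(tr_mat)                   # 294 18
all(is.na(tr_mat[1:147, ]))   # TRUE: the preallocated rows were never filled
```

So the counts are all there, but only in rows 148 to 294. Either start from zero rows and keep rbind, or keep the preallocation and assign with tr_mat[i, ] <- x.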

Upvotes: 3

m.ziembinski

Reputation: 144

Do you really need 2 loops? Here is a solution that does the counting without any loop, using data.table and a combination of the melt/dcast functions:

library(data.table)

# dataset ----------------------------
set.seed(2018)

it_mat<-data.frame(c1=c('Charlie','John','Peter'))

for(i in 2:26){
  it_mat[,paste0('c',i)]<-sample(1:18,3)
}

# calculation ----------------------------

it_mat<-data.table(it_mat)
it_mat<-melt(it_mat,id.vars='c1')
it_mat[,.N,by=.(c1,value)][order(c1,value)]

dcast(it_mat[,.N,by=.(c1,value)][order(c1,value)],c1~value)
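For readers without data.table, roughly the same wide count table can be sketched in base R with table(). This is an alternative technique, not what the answer above uses. The names it_df and counts are made up for the illustration, and fixing the factor levels to 1:18 is an assumption so that values that never occur still get a column.

```r
set.seed(2018)

# Same toy dataset as above, kept as a plain data.frame
it_df <- data.frame(c1 = c("Charlie", "John", "Peter"))
for (i in 2:26) {
  it_df[[paste0("c", i)]] <- sample(1:18, 3)
}

# unlist() stacks the value columns one after another, so the names
# must be repeated once per value column to stay aligned.
counts <- table(
  name  = rep(it_df$c1, times = ncol(it_df) - 1),
  value = factor(unlist(it_df[, -1]), levels = 1:18)
)

counts
```

The result has one row per name and one column per value from 1 to 18.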

Upvotes: 1
