Learner

Reputation: 757

How to extract data from a list consecutively

I have a matrix like this

df1 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df2 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df3 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df4 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df5 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df6 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df7 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df8 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df9 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df10 <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))
df <- rbind(df1,df2,df3,df4,df5,df6,df7,df8,df9,df10)

I have a vector like this

dft <- sample(seq(1,10), size=100, replace=TRUE, prob=c(.02,.01,.01,.01,.01,.01,.005,.005,.01,.01))

Then I perform my test on the data like this

t<- sapply(1:nrow(df), function(i) ks.test(as.vector(df[i,]), as.vector(dft)))

I have a list named t which holds the D values and p-values. I want to extract them and plot them when there are over 100 of them. Is there a way to do this instead of going through them one by one? The structure of the list, as shown by str(t), is below:

List of 50
 $ : Named num 0.09
  ..- attr(*, "names")= chr "D"
 $ : num 0.813
 $ : chr "two-sided"
 $ : chr "Two-sample Kolmogorov-Smirnov test"
 $ : chr "as.vector(df[i, ]) and as.vector(dft)"
 $ : Named num 0.11
  ..- attr(*, "names")= chr "D"
 $ : num 0.581
 $ : chr "two-sided"
 $ : chr "Two-sample Kolmogorov-Smirnov test"
 $ : chr "as.vector(df[i, ]) and as.vector(dft)"
 $ : Named num 0.09
  ..- attr(*, "names")= chr "D"

I can see that the length of my list is

length(t)
[1] 377930

I want to extract the first two elements of every group of five (the D value and the p-value) into a data frame and leave out the rest.

I currently do this manually:

c(t[[1]],t[[2]])
c(t[[6]],t[[7]])
c(t[[11]],t[[12]])
c(t[[21]],t[[22]])
c(t[[26]],t[[27]])
c(t[[31]],t[[32]])
c(t[[36]],t[[37]])

Is there a better way to extract the data from a list like above?

I also tried the following, without success:

result<- data.frame(matrix(NA, nrow = length(t), ncol = 1))
m <- seq(1,length(t),by=5)
for (i in seq_along(m)){
  result[[i]] = c(t[[i]]) 
  if ( i*2 > length(t) ){
    break
  }
}

Upvotes: 1

Views: 48

Answers (1)

divibisan

Reputation: 12155

The structure of t is a repeating pattern with a fixed length of 5, so it is much easier to work with if we turn it into a matrix:

t_matrix <- matrix(t, ncol=5, byrow=T)

t_matrix
      [,1] [,2]      [,3]        [,4]                                 [,5]                                   
 [1,] 0.11 0.5806178 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [2,] 0.08 0.9062064 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [3,] 0.11 0.5806178 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [4,] 0.08 0.9062064 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [5,] 0.04 0.9999982 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [6,] 0.05 0.9996333 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [7,] 0.15 0.2105516 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [8,] 0.08 0.9062064 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
 [9,] 0.08 0.9062064 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"
[10,] 0.1  0.6993742 "two-sided" "Two-sample Kolmogorov-Smirnov test" "as.vector(df[i, ]) and as.vector(dft)"

By specifying byrow=T, R fills the 5-column matrix row by row, rather than column by column as is the default. Now that you have a matrix, you can subset it as you would any other matrix or data frame:

t_matrix[,c(1,2)]
      [,1] [,2]     
 [1,] 0.11 0.5806178
 [2,] 0.08 0.9062064
 [3,] 0.11 0.5806178
 [4,] 0.08 0.9062064
 [5,] 0.04 0.9999982
 [6,] 0.05 0.9996333
 [7,] 0.15 0.2105516
 [8,] 0.08 0.9062064
 [9,] 0.08 0.9062064
[10,] 0.1  0.6993742
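
Note that the subset above is still a matrix of list elements rather than plain numbers. A minimal sketch of turning it into a numeric data frame follows. It assumes the five-field layout shown in the question's str(t) output. The small list `flat` and the names `m`, `result`, `idx` and `by_index` are made up for illustration and are not part of the original code:

```r
# Hypothetical flat list mimicking the repeating 5-element structure of t
# (values copied from the str(t) output in the question; "x and y" is a placeholder).
flat <- list(
  c(D = 0.09), 0.813, "two-sided", "Two-sample Kolmogorov-Smirnov test", "x and y",
  c(D = 0.11), 0.581, "two-sided", "Two-sample Kolmogorov-Smirnov test", "x and y"
)

# Reshape: one row per test, five fields per row.
m <- matrix(flat, ncol = 5, byrow = TRUE)

# Columns 1 and 2 are list columns; unlist() flattens them into numeric vectors.
result <- data.frame(
  D       = unlist(m[, 1], use.names = FALSE),
  p.value = unlist(m[, 2], use.names = FALSE)
)

# Index-based route without the matrix: every 5th element starting at
# position 1 (D) and at position 2 (p-value).
idx <- seq(1, length(flat), by = 5)
by_index <- data.frame(
  D       = unlist(flat[idx], use.names = FALSE),
  p.value = unlist(flat[idx + 1], use.names = FALSE)
)

print(result)
```

Both routes rely on every test contributing exactly five elements. If that is not guaranteed, another option is to return only the `statistic` and `p.value` components from each `ks.test` call inside the `sapply`, which avoids building the flat list in the first place.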

Upvotes: 2
