agatha

Reputation: 1513

Extract vectors from strsplit list without using a loop

Considering the following vector:

 [1] "1-1694429" "2-1546669" "3-928598"  "4-834486"  "5-802353"  "6-659439"  "7-552850"  "8-516804"  "9-364061"
[10] "10-354181" "11-335154" "12-257915" "13-251310" "14-232313" "15-217628" "16-216569"

I am trying to generate two vectors, each of them containing the values obtained by splitting each element of the vector by the delimiter "-".

I used:

f <- function(s) strsplit(s, "-")
cc<-sapply(names.reads, f)

> head(cc)
$`1-1694429`
[1] "1"       "1694429"

$`2-1546669`
[1] "2"       "1546669"

I know I can access them like:

> cc[[1]][1]
[1] "1"

> cc[[1]][2]
[1] "1694429"

I would like to have two vectors, one containing the values stored at cc[[i]][1] and the other the values stored at cc[[i]][2]. Can I do that without using a loop? (I have over 1 million elements.)

Upvotes: 21

Views: 39429

Answers (5)

pedrostrusso

Reputation: 388

Using sapply() (for completeness' sake):

y <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353", "6-659439", "7-552850", "8-516804", "9-364061", "10-354181", "11-335154", "12-257915", "13-251310", "14-232313", "15-217628", "16-216569")

As @Bird pointed out in the comments, the USE.NAMES parameter can be used to avoid names in the resulting vector.

x <- sapply(y, function(x) strsplit(x, "-")[[1]], USE.NAMES=FALSE)

a <- x[1,]

b <- x[2,]
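
For what it's worth, the sapply() call above returns a character matrix with one column per input element, which is why x[1,] and x[2,] pick out the two parts. A minimal sketch of the same idea with vapply(), which additionally raises an error if some element does not split into exactly two pieces (the shortened y is only for illustration):

```r
y <- c("1-1694429", "2-1546669", "3-928598")

# strsplit() returns an unnamed list, so the result carries no names.
# character(2) tells vapply() that every split must yield exactly two strings.
x <- vapply(strsplit(y, "-", fixed = TRUE),
            function(p) p,
            character(2))

a <- x[1, ]  # first part of every element
b <- x[2, ]  # second part of every element
```

Here fixed = TRUE just treats "-" as a literal string rather than a regular expression.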

Upvotes: 10

jtr13

Reputation: 1277

Or with the purrr package:

Part 1:

> map(strsplit(names.reads, "-"), ~.x[1]) %>% unlist()
[1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13"
[14] "14" "15" "16"

Part 2:

> map(strsplit(names.reads, "-"), ~.x[2]) %>% unlist()
[1] "1694429" "1546669" "928598"  "834486"  "802353"  "659439" 
[7] "552850"  "516804"  "364061"  "354181"  "335154"  "257915" 
[13] "251310"  "232313"  "217628"  "216569" 
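
A slightly shorter variant of the same idea, as a sketch: purrr's map_chr() accepts an integer position in place of a function and returns a character vector directly, so unlist() is not needed. The shortened names.reads and the intermediate name parts are only for illustration:

```r
library(purrr)

names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Split once and reuse the list for both parts
parts <- strsplit(names.reads, "-", fixed = TRUE)

part1 <- map_chr(parts, 1)  # element 1 of every split
part2 <- map_chr(parts, 2)  # element 2 of every split
```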

Upvotes: 5

MYaseen208

Reputation: 23898

Another approach:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

library(reshape2)
colsplit(string=names.reads, pattern="-", names=c("Part1", "Part2"))

   Part1   Part2
1      1 1694429
2      2 1546669
3      3  928598
4      4  834486
5      5  802353
6      6  659439
7      7  552850
8      8  516804
9      9  364061
10    10  354181
11    11  335154
12    12  257915
13    13  251310
14    14  232313
15    15  217628
16    16  216569
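
If installing reshape2 is not an option, a similar two-column data frame can be sketched in base R with read.table(), which is a different function from colsplit() and is shown here only as an assumption-laden alternative. Note that read.table() converts both columns to integers rather than keeping them as character strings (shortened input for illustration):

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Treat each element as one line of "-"-separated text
df <- read.table(text = names.reads, sep = "-",
                 col.names = c("Part1", "Part2"))

part1 <- df$Part1  # integer vector
part2 <- df$Part2  # integer vector
```

Use as.character() on the columns if the parts are needed as strings.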

Upvotes: 6

Anto

Reputation: 1229

I came across this post while trying to solve a similar problem, so I am adding my solution even though it is years later. (The example data is copied from Henry's answer.)

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
          "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
          "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
          "16-216569")

library(plyr)
# ldply() row-binds the split pieces into a data frame with columns V1 and V2
cc <- ldply(strsplit(names.reads, '-'))
cc$V1; cc$V2

That produces a data frame from which the vectors pertaining to the nth element of each item in the list can be extracted.
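
The same row-binding idea can be sketched in base R without plyr, assuming every element splits into the same number of pieces. This gives a character matrix rather than a data frame (shortened input for illustration):

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Stack the split pieces as rows: one row per element, one column per part
m <- do.call(rbind, strsplit(names.reads, "-", fixed = TRUE))

part1 <- m[, 1]  # first part of every element
part2 <- m[, 2]  # second part of every element
```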

Upvotes: 3

Henry

Reputation: 6784

Using mathematical.coffee's suggestion, the following code avoids both loops and sapply:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

cc    <- strsplit(names.reads, '-')
parts <- unlist(cc)                             # flatten to "1", "1694429", "2", ...
part1 <- parts[2 * seq_along(names.reads) - 1]  # odd positions
part2 <- parts[2 * seq_along(names.reads)]      # even positions

produces

> part1
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
[16] "16"
> part2
 [1] "1694429" "1546669" "928598"  "834486"  "802353"  "659439"  "552850" 
 [8] "516804"  "364061"  "354181"  "335154"  "257915"  "251310"  "232313" 
[15] "217628"  "216569"

though it does require each original value to be in the expected format, i.e. to split into exactly two parts.
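
That format requirement can be checked up front. A minimal sketch, assuming it is acceptable to stop with an error on malformed input; the name flat and the shortened names.reads are only for illustration:

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598")
cc <- strsplit(names.reads, "-", fixed = TRUE)

# Every element must split into exactly two pieces; otherwise the
# odd/even indexing into the flattened vector would silently misalign.
stopifnot(all(lengths(cc) == 2))

flat  <- unlist(cc)
part1 <- flat[c(TRUE, FALSE)]  # logical index is recycled: positions 1, 3, 5, ...
part2 <- flat[c(FALSE, TRUE)]  # positions 2, 4, 6, ...
```

The recycled logical index is just another way of writing the odd and even positions.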

Upvotes: 23
