Ferenc Kagan

Reputation: 11

Reformatting text into table in R

I would like to ask for the community's help reshaping a text file in R. Each line holds either a TRINITY identifier or a GO term, and the file looks like this:

TRINITY_GG_17866_c6_g1_i1
TRINITY_GG_17866_c3_g1_i1
TRINITY_GG_17866_c1_g1_i7
GO:0000226
GO:0006139
GO:0006259
TRINITY_GG_17866_c5_g1_i1
GO:0003674
GO:0005488

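What I would like to get in the end is a tab-separated table in which each GO term is paired with the identifier that precedes it, and identifiers without any GO terms are dropped: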

TRINITY_GG_17866_c1_g1_i7 GO:0000226
TRINITY_GG_17866_c1_g1_i7 GO:0006139
TRINITY_GG_17866_c1_g1_i7 GO:0006259
TRINITY_GG_17866_c5_g1_i1 GO:0003674
TRINITY_GG_17866_c5_g1_i1 GO:0005488

I have not been able to come up with a solution so far, and I would really appreciate any advice on this.

Best wishes, Ferenc

Upvotes: 1

Views: 26

Answers (1)

tmfmnk

Reputation: 39858

One dplyr option could be:

library(dplyr)

df %>%
 group_by(grp = cumsum(!startsWith(V1, "GO:"))) %>%
 filter(n() > 1) %>%
 mutate(V2 = lead(V1),
        V1 = first(V1)) %>%
 na.omit() %>%
 ungroup() %>%
 select(-grp)

  V1                        V2        
  <chr>                     <chr>     
1 TRINITY_GG_17866_c1_g1_i7 GO:0000226
2 TRINITY_GG_17866_c1_g1_i7 GO:0006139
3 TRINITY_GG_17866_c1_g1_i7 GO:0006259
4 TRINITY_GG_17866_c5_g1_i1 GO:0003674
5 TRINITY_GG_17866_c5_g1_i1 GO:0005488

Or as one column:

df %>%
 group_by(grp = cumsum(!startsWith(V1, "GO:"))) %>%
 filter(n() > 1) %>%
 mutate(V2 = lead(V1),
        V1 = first(V1)) %>%
 na.omit() %>%
 ungroup() %>%
 select(-grp) %>%
 transmute(V1 = paste(V1, V2))

  V1                                  
  <chr>                               
1 TRINITY_GG_17866_c1_g1_i7 GO:0000226
2 TRINITY_GG_17866_c1_g1_i7 GO:0006139
3 TRINITY_GG_17866_c1_g1_i7 GO:0006259
4 TRINITY_GG_17866_c5_g1_i1 GO:0003674
5 TRINITY_GG_17866_c5_g1_i1 GO:0005488
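
To see why the `cumsum(!startsWith(V1, "GO:"))` grouping works, here is a rough base R sketch of the same idea. It is not part of the pipeline above, just an illustration; the variable names (`lines`, `is_go`, `grp`, `ids`, `res`) are made up for this example, and it assumes the first line of the input is an identifier rather than a GO term.

```r
# Same sample data as in this answer: one identifier or GO term per line.
lines <- c("TRINITY_GG_17866_c6_g1_i1",
           "TRINITY_GG_17866_c3_g1_i1",
           "TRINITY_GG_17866_c1_g1_i7",
           "GO:0000226", "GO:0006139", "GO:0006259",
           "TRINITY_GG_17866_c5_g1_i1",
           "GO:0003674", "GO:0005488")

is_go <- startsWith(lines, "GO:")  # TRUE for GO terms, FALSE for identifiers
grp   <- cumsum(!is_go)            # counter goes up by one at every identifier line
ids   <- lines[!is_go]             # identifiers in order of appearance

# Each GO line carries the group number of the identifier that opened its group,
# so indexing ids by that number pairs the two. Identifiers with no GO lines
# never get looked up and therefore drop out.
res <- data.frame(V1 = ids[grp[is_go]],
                  V2 = lines[is_go],
                  stringsAsFactors = FALSE)
res
```

Here `grp` comes out as 1 2 3 3 3 3 4 4 4, so the three GO terms after the third identifier share group 3 and the last two share group 4.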

Sample data:

df <- read.table(text = "TRINITY_GG_17866_c6_g1_i1
TRINITY_GG_17866_c3_g1_i1
TRINITY_GG_17866_c1_g1_i7
GO:0000226
GO:0006139
GO:0006259
TRINITY_GG_17866_c5_g1_i1
GO:0003674
GO:0005488",
                 header = FALSE,
                 stringsAsFactors = FALSE)
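
Since the question asks for tab-separated output, one possible way to go from a file on disk to a tab-separated file is sketched below. The paths `in_file` and `out_file` are hypothetical (temporary files are used here only so the example runs on its own); with real data you would point `in_file` at your own text file and skip the `writeLines()` step.

```r
library(dplyr)

# Hypothetical input and output paths; tempfile() keeps the sketch self-contained.
in_file  <- tempfile(fileext = ".txt")
out_file <- tempfile(fileext = ".tsv")

# Stand-in for the real file: the sample data from the question.
writeLines(c("TRINITY_GG_17866_c6_g1_i1",
             "TRINITY_GG_17866_c3_g1_i1",
             "TRINITY_GG_17866_c1_g1_i7",
             "GO:0000226", "GO:0006139", "GO:0006259",
             "TRINITY_GG_17866_c5_g1_i1",
             "GO:0003674", "GO:0005488"),
           in_file)

# One value per line, read into a single character column V1.
df <- read.table(in_file, header = FALSE, stringsAsFactors = FALSE)

# Same pipeline as the two-column version in this answer.
res <- df %>%
 group_by(grp = cumsum(!startsWith(V1, "GO:"))) %>%
 filter(n() > 1) %>%
 mutate(V2 = lead(V1),
        V1 = first(V1)) %>%
 na.omit() %>%
 ungroup() %>%
 select(-grp)

# Write the two columns separated by a tab, without quotes or headers.
write.table(res, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

out <- readLines(out_file)
out
```

Writing the two-column result with `sep = "\t"` avoids having to paste the columns together by hand first.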

Upvotes: 1
