Derek van Tilborg

Reputation: 53

Move the first two characters of a string to after a specific character in the same string

I want to reformat some genome changes so I can use a certain tool. How can I move the first two characters of a string after a colon in the same string?

For example: g.chr17:7577121G>A must become chr17:g.7577121G>A
g.chr3:52712586T>C must become chr3:g.52712586T>C

There is probably a very straightforward way to do this with gsub and paste, but I can't figure it out.

Upvotes: 3

Views: 736

Answers (3)

akrun

Reputation: 887481

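Here is one without a regex substitution: split the string on the dot and the colon, then paste the pieces back together in the new order.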

v1 <- strsplit(input, "[.:]")[[1]]
paste0(v1[2], ":", v1[1], ".", v1[3])
#[1] "chr17:g.7577121G>A"

data

input <- "g.chr17:7577121G>A"
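The snippet above handles a single string because of the [[1]]. A minimal sketch of the same split-and-paste idea applied to a whole character vector, assuming every element contains exactly one dot before the colon and no other dots or colons (move_prefix is a hypothetical helper name, not part of the answer):

```r
# Hypothetical helper: split each element on "." or ":" and reassemble
# the three pieces in the order chromosome, prefix, remainder.
move_prefix <- function(x) {
  parts <- strsplit(x, "[.:]")
  vapply(
    parts,
    function(p) paste0(p[2], ":", p[1], ".", p[3]),
    character(1)
  )
}

x <- c("g.chr17:7577121G>A", "g.chr3:52712586T>C")
move_prefix(x)
#[1] "chr17:g.7577121G>A" "chr3:g.52712586T>C"
```

If the input may contain extra dots or colons, the sub-based answers are likely the safer choice, since the split would produce more than three pieces.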

Upvotes: 2

Ronak Shah

Reputation: 389105

We can use sub with 3 capture groups

sub("(^.{2})(.*:)(.*)", "\\2\\1\\3", x)
#[1] "chr17:g.7577121G>A" "chr3:g.52712586T>C"

^.{2} - The first capture group is the first two characters.

.*: - The second capture group is everything up to and including the colon.

.* - The third capture group is the rest of the string.

We then rearrange these groups in the order 2-1-3.

data

x <- c("g.chr17:7577121G>A", "g.chr3:52712586T>C")
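One thing to keep in mind: .* is greedy, so if a string ever contained more than one colon, the second group would run up to the last colon rather than the first. A small sketch of a variant that stops at the first colon by using [^:]* instead, assuming that is the behaviour you want (the input with a trailing ":extra" is made up purely for illustration):

```r
# Made-up input with a second colon, to show the difference
y <- "g.chr17:7577121G>A:extra"

# Greedy version: the second group extends to the LAST colon
sub("(^.{2})(.*:)(.*)", "\\2\\1\\3", y)
#[1] "chr17:7577121G>A:g.extra"

# Variant: [^:]* cannot cross a colon, so the group ends at the FIRST colon
first_colon <- sub("^(.{2})([^:]*:)(.*)", "\\2\\1\\3", y)
first_colon
#[1] "chr17:g.7577121G>A:extra"
```

For the strings in the question, which have a single colon, both patterns give the same result.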

Upvotes: 3

Tim Biegeleisen

Reputation: 521914

Try this option:

input <- "g.chr17:7577121G>A"
input <- sub("^([^.]+\\.)([^:]+:)", "\\2\\1", input)
input

[1] "chr17:g.7577121G>A"

The pattern might require some explanation:

^                from the beginning of the input
    ([^.]+\\.)   match and capture any non dot characters up to and including
                 the first dot
    ([^:]+:)     then match and capture any non colon characters up to and
                 including the first colon

Then, we replace with these two captured groups reversed. In this case, the first group is g., and the second group is chr17:. So, the replacement string would then start with chr17:g., followed by whatever was already there.
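Since sub is vectorized, the same call also works on a character vector. A short sketch, assuming some entries may already be in the target format (the third element is made up for illustration); those do not match the pattern and are returned unchanged:

```r
# Two strings from the question plus one made-up entry already in target form
x <- c("g.chr17:7577121G>A", "g.chr3:52712586T>C", "chr1:g.100A>T")

# Swap the "prefix." and "chromosome:" groups wherever the pattern matches
fixed <- sub("^([^.]+\\.)([^:]+:)", "\\2\\1", x)
fixed
#[1] "chr17:g.7577121G>A" "chr3:g.52712586T>C" "chr1:g.100A>T"
```

Because the pattern captures everything up to the first dot rather than exactly two characters, it should also cope with prefixes of other lengths, provided they contain neither a dot nor a colon.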

Upvotes: 3
