user2886545

Reputation: 713

R. Combinations of letters

I need to produce output like this:

AAAA
AAAG
AAAC
AAAT
AAGA
AAGG
...

My first thought was to do this with numbers, representing "A" as 1, "G" as 2, etc.:

1111
1112
...

and then convert 1 into "A" and so on. I found the function expand.grid, but it gives me a data frame with 4 variables (4 columns), one for each digit, rather than one string per combination.

Is there another way to do this?

Thanks in advance.

Upvotes: 3

Views: 4080

Answers (3)

IRTFM

Reputation: 263362

After realizing that you just wanted a full "deck" of 4-element combinations/permutations of AGCT (rather than a translation from numbers to letters), I think this will be quite a bit faster than an expand.grid approach:

levs <- paste0( gl(4, 4^3, 4^4, labels=c("A","G","C","T") ), 
                gl(4, 4^2, 4^4, labels=c("A","G","C","T")),
                gl(4, 4,   4^4, labels=c("A","G","C","T")), 
                gl(4, 1,   4^4, labels=c("A","G","C","T")) )

head(levs)
[1] "AAAA" "AAAG" "AAAC" "AAAT" "AAGA" "AAGG"
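
The four gl() calls above follow one pattern: the factor for position pos repeats each letter 4^(4 - pos) times over a total length of 4^4. As a rough sketch, that pattern could be wrapped in a helper for any alphabet and string length. The function name all_kmers and its arguments are made up for illustration and are not part of the original answer:

```r
# Hypothetical helper (not from the answer): all length-k strings over
# `alphabet`, with the last position varying fastest.
all_kmers <- function(alphabet = c("A", "G", "C", "T"), k = 4) {
  n <- length(alphabet)
  # One factor per position; position `pos` repeats each letter n^(k - pos) times.
  cols <- lapply(seq_len(k), function(pos) {
    gl(n, n^(k - pos), n^k, labels = alphabet)
  })
  # paste0() converts the factors to their labels and glues them row by row.
  do.call(paste0, cols)
}

head(all_kmers())
# [1] "AAAA" "AAAG" "AAAC" "AAAT" "AAGA" "AAGG"
```

With the defaults this should reproduce levs; a call such as all_kmers(c("A", "B"), 2) would give the four two-letter strings.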

Upvotes: 0

Matthew Plourde

Reputation: 44614

Edit: My original answer mistakenly assumed you had the vector of indexes already. To generate a vector of all possible combinations of these letters from scratch, try this:

x <- expand.grid(rep(list(c('A', 'G', 'T', 'C')), 4))
do.call(paste0, x)
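
One caveat about ordering: expand.grid varies its first column fastest, so the call above starts "AAAA", "GAAA", ... rather than "AAAA", "AAAG", ... as in the question. If the exact order matters, a possible tweak (a sketch, not part of the original answer) is to list the letters as A, G, C, T and reverse the columns before pasting. The variable names letters4, grid and combos are illustrative only:

```r
# Letters in the order the question uses: A, G, C, T.
letters4 <- c("A", "G", "C", "T")

# expand.grid() varies the FIRST column fastest ...
grid <- expand.grid(rep(list(letters4), 4))

# ... so reversing the columns makes the LAST character vary fastest.
combos <- do.call(paste0, rev(grid))

head(combos)
# [1] "AAAA" "AAAG" "AAAC" "AAAT" "AAGA" "AAGG"
```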

If you already have the vector of numeric codes, you can translate it with chartr:

x <- c(1111, 1112, 1113, 1114, 1121)
chartr('1234', 'AGCT', x)
# [1] "AAAA" "AAAG" "AAAC" "AAAT" "AAGA"

Upvotes: 9

James Elderfield

Reputation: 2507

If I understand you correctly, you can already get all the combinations, but the digits are split into different columns. Where do you want your output? If you want to write it to a file, could you not just do something like:

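sink(SOME_FILENAME)  # redirect console output to the file

for (i in seq_len(nrow(YOUR_DATAFRAME)))
{
    for (j in seq_len(ncol(YOUR_DATAFRAME)))
    {
        # cat() writes the bare value; print() would add "[1]" and a newline
        cat(as.character(YOUR_DATAFRAME[i, j]))
    }

    cat("\n")  # end the line after each row
}

sink()  # restore output to the console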
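
A vectorized alternative to the loop, offered only as a sketch: paste the columns together row by row and hand the result to writeLines. Here combos_df and out_file are hypothetical stand-ins for YOUR_DATAFRAME and SOME_FILENAME, and a temporary file is used so the example does not touch real files:

```r
# Stand-in for YOUR_DATAFRAME: one column per position.
combos_df <- expand.grid(rep(list(c("A", "G", "C", "T")), 4))

# Stand-in for SOME_FILENAME: a temporary file.
out_file <- tempfile(fileext = ".txt")

# Join the columns of each row into one string, then write one string per line.
writeLines(do.call(paste0, combos_df), out_file)

# Read the file back to check that there is one line per row.
length(readLines(out_file))
```

This avoids sink() entirely, so there is no redirected output to restore afterwards.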

Upvotes: 0
