Jennifer Collins

Reputation: 253

R: How to replace a character in a string after sampling and print out character instead of index?

I'd like to replace a character in a string with another character, which I first choose by weighted sampling. I can't get it to print the character instead of the index.

Example data, stored in a data frame named "try":

L   0.970223325 -   0.019851117 X   0.007444169
K   0.962779156 -   0.027295285 Q   0.004962779
P   0.972704715 -   0.027295285 NA  0
C   0.970223325 -   0.027295285 L   0.00248139
V   0.970223325 -   0.027295285 T   0.00248139

I'm trying to sample a character for a given row using weighted probabilities.

samp <- function(row) {
    sample(try[row, seq(1, length(try), 2)], 1, prob = try[row, seq(2, length(try), 2)])
}

Then, I want to use the selected character to replace a position in a given string.

subchar <- function(string, pos, new) {
    paste(substr(string, 1, pos - 1), new, substr(string, pos + 1, nchar(string)), sep = '')
}

My question: if I run, for example, the following, I get the index rather than the character:

> subchar("KLMN", 3, samp(4))
[1] "KL1N"

But I want it to read "KLCN". as.character(samp(4)) doesn't work either. How do I get it to print the character instead of the index?

Upvotes: 0

Views: 471

Answers (1)

James

Reputation: 66844

The problem arises because your letters are stored as factors rather than characters, and samp is returning a data.frame.

C is the first level of your factor, so it is stored internally as 1. as.character (which the paste call invokes) returns that internal code rather than the label when it is applied to the one-row data.frame:

samp(4)
  V1
4  C
as.character(samp(4))
[1] "1"
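
To see the difference between a factor's internal code and its label in isolation, here is a minimal sketch. The vector letters_col is a hypothetical stand-in for the first column of try, assuming that column was read in as a factor:

```r
# Hypothetical stand-in for the first column of "try", stored as a factor.
letters_col <- factor(c("L", "K", "P", "C", "V"))

# Factor levels are sorted, so "C" becomes the first level.
levels(letters_col)

# The 4th element is "C": its internal integer code versus its label.
code_of_c  <- as.integer(letters_col[4])
label_of_c <- as.character(letters_col[4])

print(code_of_c)
print(label_of_c)
```

Calling as.character directly on the factor vector gives the label; the code only leaks out when the factor is still wrapped in a data.frame.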

You can solve this in two ways: either drop the data.frame wrapper from the samp output in your call to subchar, or modify samp to do so:

subchar("KLMN", 3, samp(4)[,1])
[1] "KLCN"

samp2 <- function(row) {
    sample(try[row, seq(1, length(try), 2)], 1, prob = try[row, seq(2, length(try), 2)])[, 1]
}

subchar("KLMN",3,samp2(4))
[1] "KLCN"

You may also find it easier to sample within your subsetting, and you can drop the data.frame from there:

samp3 <- function(row) {
    try[row, sample(seq(1, length(try), 2), 1, prob = try[row, seq(2, length(try), 2)]), drop = TRUE]
}
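
Another option is to avoid factors altogether. The sketch below is not from your code: it assumes you can rebuild (or re-read) the data with stringsAsFactors = FALSE, which is the default from R 4.0.0 onward. The column names V1 to V6 and the function name samp4 are hypothetical, and the data is retyped from your example:

```r
# Assumed reconstruction of the example data; letters are kept as
# character vectors instead of factors.
try <- data.frame(
  V1 = c("L", "K", "P", "C", "V"),
  V2 = c(0.970223325, 0.962779156, 0.972704715, 0.970223325, 0.970223325),
  V3 = c("-", "-", "-", "-", "-"),
  V4 = c(0.019851117, 0.027295285, 0.027295285, 0.027295285, 0.027295285),
  V5 = c("X", "Q", NA, "L", "T"),
  V6 = c(0.007444169, 0.004962779, 0, 0.00248139, 0.00248139),
  stringsAsFactors = FALSE
)

# Hypothetical variant of samp: flatten the one-row slices to plain
# vectors before sampling, so a character vector of length 1 comes back.
samp4 <- function(row) {
  chars <- unlist(try[row, seq(1, length(try), 2)], use.names = FALSE)
  probs <- unlist(try[row, seq(2, length(try), 2)], use.names = FALSE)
  sample(chars, 1, prob = probs)
}

subchar <- function(string, pos, new) {
  paste(substr(string, 1, pos - 1), new, substr(string, pos + 1, nchar(string)), sep = "")
}

# The result is random: usually "KLCN", occasionally "KL-N" or "KLLN".
result <- subchar("KLMN", 3, samp4(4))
print(result)
```

Because samp4 returns a plain character vector, paste needs no special handling and no [,1] or drop = TRUE is required.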

Upvotes: 1
