Newbie

Reputation: 421

Append value in df rows

I have a column in a data frame (df) to which I want to prepend a value that is not constant but varies from row to row. An example will make this clearer:

> df
     geneID Sample.290
1         1  0.4018499
2        10  0.2694255
3       100  1.4441846
4      1000 13.7652753
5     10000  2.1552100
6 100008586  0.2358481

I want to prepend the string "ENSG" followed by as many zeros as needed so that the total length of each value is 15 characters (including "ENSG"). For example, the output should be:

         geneID           Sample.290
1        ENSG00000000001  0.4018499
2        ENSG00000000010  0.2694255
3        ENSG00000000100  1.4441846
4        ENSG00000001000 13.7652753
5        ENSG00000010000  2.1552100
6        ENSG00100008586  0.2358481

Upvotes: 3

Views: 191

Answers (5)

sorearm

Reputation: 409

I would go with Sotos's example using the str_pad function; it was the first thing I thought of when reading your post.

Upvotes: 0

989

Reputation: 12935

Or you could do (using base R functions):

# df
#      geneID Sample.290
# 1         1  0.4018499
# 2        10  0.2694255
# 3       100  1.4441846
# 4      1000 13.7652753
# 5     10000  2.1552100
# 6 100008586  0.2358481

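# Template of the target width: "ENSG" followed by 11 zeros (15 characters)
a <- "ENSG00000000000"
# Keep as much of the template as the ID leaves room for, then append the ID
df[, 'geneID'] <- sapply(seq_len(nrow(df)), function(i)
  paste0(substring(a, 1, 15 - nchar(df[i, 'geneID'])), df[i, 'geneID']))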

# > df
#            geneID Sample.290
# 1 ENSG00000000001  0.4018499
# 2 ENSG00000000010  0.2694255
# 3 ENSG00000000100  1.4441846
# 4 ENSG00000001000 13.7652753
# 5 ENSG00000010000  2.1552100
# 6 ENSG00100008586  0.2358481

Upvotes: 1

SatishR

Reputation: 230

Using base R functions:

df$geneID <- sapply(df$geneID,function(x) paste("ENSG",
                    paste(rep(0,(15-nchar(x)-nchar("ENSG"))),collapse = ""),x,sep=""))

Here 15 is the total length of the resulting ID, including the "ENSG" prefix.

Upvotes: 3

Sotos

Reputation: 51592

Using str_pad from stringr, which pads on the left by default:

library(stringr)
df$geneID <- paste0('ENSG', str_pad(df$geneID, width = 11, pad = '0'))
df
#           geneID Sample.290
#1 ENSG00000000001  0.4018499
#2 ENSG00000000010  0.2694255
#3 ENSG00000000100  1.4441846
#4 ENSG00000001000 13.7652753
#5 ENSG00000010000  2.1552100
#6 ENSG00100008586  0.2358481

Upvotes: 5

David_B

Reputation: 936

The stri_pad_left function in the stringi package will do what you want:

df$geneID <- paste0('ENSG', stringi::stri_pad_left(df[, 'geneID'], width = 11, pad = '0'))
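
For comparison, here is a minimal base R sketch of the same padding using sprintf, which none of the answers here uses. It assumes geneID is a numeric column of whole numbers. The data frame is rebuilt from the values in the question so the snippet runs on its own. One possible advantage is that the "%011.0f" format never switches to scientific notation, whereas a value such as 100000 becomes "1e+05" when R converts it to a string, which could affect the padding approaches that rely on that conversion.

```r
# Rebuild the example data from the question (assumed to be numeric IDs)
df <- data.frame(
  geneID     = c(1, 10, 100, 1000, 10000, 100008586),
  Sample.290 = c(0.4018499, 0.2694255, 1.4441846, 13.7652753, 2.1552100, 0.2358481)
)

# 15 characters in total minus the 4 of "ENSG" leaves 11 digits.
# "%011.0f" prints each value with no decimals, zero-padded to width 11.
df$geneID <- sprintf("ENSG%011.0f", df$geneID)

print(df)
```

If the column is stored as character instead, it would need to be converted with as.numeric first.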

Upvotes: 2
