Reputation: 797
Suppose I am interested in concatenating two variables. I start with a dataset like this:
#what I have
A <- rep(paste("125"),50)
B <- rep(paste("48593"),50)
C <- rep(paste("99"),50)
D <- rep(paste("1233"),50)
one <- append(A,C)
two <- append(B,D)
have <- data.frame(one,two); head(have)
one two
1 125 48593
2 125 48593
3 125 48593
4 125 48593
5 125 48593
6 125 48593
A straightforward paste command does the trick:
#half way there
half <- paste(one,two,sep="-");head(half)
[1] "125-48593" "125-48593" "125-48593" "125-48593" "125-48593" "125-48593"
But I actually want a dataset that looks like this:
#what I desire
E <- rep(paste("00125"),50)
F <- rep(paste("0048593"),50)
G <- rep(paste("00099"),50)
H <- rep(paste("0001233"),50)
three <- append(E,G)
four <- append(F,H)
desire <- data.frame(three,four); head(desire)
three four
1 00125 0048593
2 00125 0048593
3 00125 0048593
4 00125 0048593
5 00125 0048593
6 00125 0048593
so that the straightforward paste command produces this:
#but what I really want
there <- paste(three,four,sep="-");head(there)
[1] "00125-0048593" "00125-0048593" "00125-0048593" "00125-0048593"
[5] "00125-0048593" "00125-0048593"
That is, I want the concatenation to have five digits for the first part and seven digits for the second part, with leading zeros added where needed.
Should I first transform the dataset to add the leading zeros and then do the paste command? Or can I do it all within the same line of code? I added the data.table tag because I'm sure there is a very efficient solution there that I'm simply not aware of.
Test of the solution provided by @joran:
one <- sprintf("%05s",one)
two <- sprintf("%07s",two)
have <- data.frame(one,two); head(have)
one two
00125 0048593
00125 0048593
00125 0048593
00125 0048593
00125 0048593
00125 0048593
desire <- data.frame(three,four); head(desire)
three four
00125 0048593
00125 0048593
00125 0048593
00125 0048593
00125 0048593
00125 0048593
identical(have$one,desire$three)
[1] TRUE
identical(have$two,desire$four)
[1] TRUE
Upvotes: 1
Views: 1756
Reputation: 60000
Or use paste0 and paste. The paste* functions are vectorised, so you can do:
half <- paste(paste0("00",one), paste0("00",two) , sep = "-");head(half)
#[1] "00125-0048593" "00125-0048593" "00125-0048593" "00125-0048593"
#[5] "00125-0048593" "00125-0048593"
But you have different string widths. An alternative (sprintf did not give the same results on my system) would be to paste with more zeros than you know you will need and then trim to the desired length:
one <- paste0("0000000000000000",one)
two <- paste0("0000000000000000",two)
fst <- sapply( one , function(x) substring( x , first = nchar(x)-4 , last = nchar(x) ) )
snd <- sapply( two , function(x) substring( x , first = nchar(x)-6 , last = nchar(x) ) )
half <- paste( fst , snd , sep = "-");head(half)
But I agree this is not a particularly good way of doing things. I'd use sprintf if I could get that output with character class data! (It works with numeric class.)
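Since the values here are all digit strings, one portable route is to convert them to integers first, so that the %d conversion does the zero-padding and the whole thing fits in a single sprintf call. This is only a minimal sketch, assuming every value parses cleanly as an integer; the one and two vectors are rebuilt from the question, and padded is just an illustrative name:

```r
# Rebuild the question's input vectors (character strings of digits)
one <- rep(c("125", "99"), each = 50)
two <- rep(c("48593", "1233"), each = 50)

# Convert to integer so the 0 flag applies to a number;
# sprintf is vectorised over both arguments
padded <- sprintf("%05d-%07d", as.integer(one), as.integer(two))
head(padded)
```

This answers the "same line of code" part of the question, but it would not suit identifiers that contain letters or that exceed the integer range.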
Upvotes: 3
Reputation: 173677
Maybe you are looking for sprintf:
sprintf("%05d",125)
[1] "00125"
sprintf("%07d",125)
[1] "0000125"
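And if you are padding strings instead of integers, maybe the following (note that the 0 flag with %s zero-pads on some platforms and is ignored on others, so check the result on your system):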
sprintf("%07s","125")
[1] "0000125"
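If sprintf with %s does not zero-pad on your platform (another answer in this thread reports exactly that), the padding can be built by hand in base R. The following is a rough sketch, not a definitive implementation: pad_zero is a hypothetical helper name, and it assumes R 3.3.0 or later for strrep.

```r
# Hypothetical helper: left-pad character strings with zeros to a fixed width
pad_zero <- function(x, width) {
  x <- as.character(x)
  # Zeros needed per element; never negative, so strings that are
  # already long enough pass through unchanged
  n_missing <- pmax(0, width - nchar(x))
  paste0(strrep("0", n_missing), x)
}

# Works on whole vectors at once
pad_zero(c("125", "99"), 5)

# Pad both parts and concatenate in one expression
paste(pad_zero("125", 5), pad_zero("48593", 7), sep = "-")
```

Because it never calls sprintf on character data, the result should not depend on the platform's C library.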
Upvotes: 6