Reputation: 1223
I have a data set (say) test:
test <- data.frame(x = c(90, 801, 6457, 92727), y = rep("test", 4))
print(test)
      x    y
1    90 test
2   801 test
3  6457 test
4 92727 test
I want to create a variable test$z that mirrors test$x, except that test$z is always 10 characters long, padded on the left with zeros. So the resulting data frame would look like:
print(test)
      x    y          z
1    90 test 0000000090
2   801 test 0000000801
3  6457 test 0000006457
4 92727 test 0000092727
I thought that the function below would give me the result I'm looking for:
test$z <- paste0(as.character(rep(0, 10-nchar(as.character(test$x)))), as.character(test$x))
But it kicks back the following error in the rep function:
Error in rep(0, 10 - nchar(as.character(test$x))) :
invalid 'times' argument
Any ideas of what I could do differently with the rep function, or any other solutions to get test$z?
Upvotes: 3
Views: 1749
Reputation: 99351
In the comments, @Roland mentions sprintf(), which is a great idea, and @m0h3n explained the issue with rep() in his answer. Here's an alternative to both.
You could replace rep() with the new base function strrep(), which recycles its x argument to the length of times. It seems to work nicely for your case.
strrep(0, 10 - nchar(test$x))
# [1] "00000000" "0000000" "000000" "00000"
So we just paste that onto the front of test$x and we're done. No need for any as.character coercion since it's all done internally.
paste0(strrep(0, 10 - nchar(test$x)), test$x)
# [1] "0000000090" "0000000801" "0000006457" "0000092727"
Note: strrep() was introduced in R version 3.3.0.
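For comparison, the sprintf() route mentioned at the top of this answer might look like the sketch below. It assumes test$x holds whole numbers that fit in R's integer range; formatC() is included as another base R option that was not part of the original discussion.

```r
test <- data.frame(x = c(90, 801, 6457, 92727), y = rep("test", 4))

# "%010d": format as an integer, zero-padded to a total width of 10.
# Assumption: x contains whole numbers within R's integer range.
z_sprintf <- sprintf("%010d", as.integer(test$x))

# formatC() with flag = "0" zero-pads to the requested width as well.
z_formatC <- formatC(test$x, width = 10, format = "d", flag = "0")

test$z <- z_sprintf
print(test)
```

Both calls are vectorised over test$x, so no explicit loop or rep() is needed.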
Upvotes: 3
Reputation: 105
You have a couple of good answers so far.
For fun, here's an example of a 'quick-and-dirty' way to do it with functions you likely already know.
# Prepend ten zeros, then keep only the last 10 characters of each string.
test$z <- substr(paste0('0000000000', as.character(test$x)),
                 nchar(test$x) + 1,
                 10 + nchar(test$x))
Just paste at least as many zeroes as you could ever need (e.g., 10) in front of each entry, then use substr to keep the last 10 characters.
P.S. You could replace the string of zeroes in the above code with a string of length n by instead writing:
paste0(rep(0, n), collapse='')
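Putting the two pieces together, a small helper could look like the sketch below. The name pad_zeros and its width argument are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: left-pad each element of x with zeros to `width` characters.
pad_zeros <- function(x, width = 10) {
  s <- as.character(x)
  # Build a string of `width` zeros, as in the P.S. above.
  zeros <- paste0(rep(0, width), collapse = '')
  padded <- paste0(zeros, s)
  # Keep only the last `width` characters of each padded string.
  substr(padded, nchar(s) + 1, nchar(s) + width)
}

test <- data.frame(x = c(90, 801, 6457, 92727), y = rep("test", 4))
test$z <- pad_zeros(test$x)
print(test)
```

Note that an entry longer than width would lose its leading characters, so this sketch assumes every value fits in the requested width.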
Upvotes: 2
Reputation: 12935
The problem stems from rep(0, 10-nchar(as.character(test$x))), where the second argument (the times argument) is a vector. Basically, this throws an error:
rep(0, c(8, 7, 6, 5))
because x has length 1 while times has length 4. By contrast, this runs:
rep(c(0,0,0,0), c(8, 7, 6, 5))
because the two vectors have the same length. Note, though, that it returns a single vector of 26 zeros rather than one padding string per row.
?rep states that:
If times consists of a single integer, the result consists of the whole input repeated this many times. If times is a vector of the same length as x (after replication by each), the result consists of x[1] repeated times[1] times, x[2] repeated times[2] times and so on.
In our example, x is c(0,0,0,0) and times is c(8, 7, 6, 5).
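The three cases can be checked directly with a small sketch; the toy values below are made up for illustration and are not the question's data.

```r
# One value, one count: the whole input is repeated 3 times.
rep(0, 3)

# times has the same length as x: each element gets its own count.
rep(c("a", "b"), c(3, 2))

# x has length 1 but times has length 2: the failing case from the question.
res <- try(rep(0, c(3, 2)), silent = TRUE)
inherits(res, "try-error")
```

The last call is wrapped in try() only so that the script keeps running after the error.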
You could do:
test$z <- sapply(test$x, function(x) paste0(paste0(rep(0,10-nchar(x)),collapse = ""),x))
#       x    y          z
# 1    90 test 0000000090
# 2   801 test 0000000801
# 3  6457 test 0000006457
# 4 92727 test 0000092727
Upvotes: 4