bshelt141

Reputation: 1223

rep() Function Using a Variable for 'times' is Throwing an Error

I have a data set (say) test:

test <- data.frame(x = c(90, 801, 6457, 92727), y = rep("test", 4))
print(test)
      x    y
1    90 test
2   801 test
3  6457 test
4 92727 test

I want to create variable test$z that mirrors test$x, except that test$z is always 10 characters long, filling in the gaps with zeros. So the resulting data frame would look like:

print(test)
      x    y          z
1    90 test 0000000090
2   801 test 0000000801
3  6457 test 0000006457
4 92727 test 0000092727

I thought that the function below would give me the result I'm looking for:

test$z <- paste0(as.character(rep(0, 10-nchar(as.character(test$x)))), as.character(test$x))

But it kicks back the following error in the rep function:

Error in rep(0, 10 - nchar(as.character(test$x))) :
invalid 'times' argument

Any ideas of what I could do differently with the rep function or any other solutions to get test$z?

Upvotes: 3

Views: 1749

Answers (3)

Rich Scriven

Reputation: 99351

In the comments @Roland mentions sprintf(), which is a great idea. And @m0h3n explained the issue with rep() in his answer. Here's an alternative to both.

You could replace rep() with the new base function strrep(), which recycles its x argument to the length of times. It seems to work nicely for your case.

strrep(0, 10 - nchar(test$x))
# [1] "00000000" "0000000"  "000000"   "00000"   

So we just paste that onto the front of test$x and we're done. No need for any as.character coercion since it's all done internally.

paste0(strrep(0, 10 - nchar(test$x)), test$x)
# [1] "0000000090" "0000000801" "0000006457" "0000092727"

Note: strrep() was introduced in R version 3.3.0.
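
For comparison, the sprintf() idea mentioned at the top of this answer could look roughly like the sketch below. It assumes test$x holds whole numbers only, so the as.integer() coercion loses nothing; the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
test <- data.frame(x = c(90, 801, 6457, 92727), y = rep("test", 4))

# "%010d": integer format, total width 10, padded on the left with zeros.
# Assumption: x contains whole numbers, so as.integer() is lossless here.
test$z <- sprintf("%010d", as.integer(test$x))

test$z
# [1] "0000000090" "0000000801" "0000006457" "0000092727"
```

This also works on R versions that predate strrep().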

Upvotes: 3

i.Mik

Reputation: 105

You have a couple of good answers so far.

For fun, here's an example of a 'quick-and-dirty' way to do it with functions you likely already know.

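test$z <- substr(paste0('0000000000', as.character(test$x)),
                 nchar(test$x) + 1,
                 10 + nchar(test$x))

Just paste as many zeroes as the target width (10 here) onto the front of each entry, then use substr() to keep only the last 10 characters. Note that the start position is nchar(test$x) + 1; starting at nchar(test$x) would return 11 characters.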

P.S. You could replace the string of zeroes in the above code with a string of length n by instead writing:

paste0(rep(0, n), collapse='')
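
Putting the two pieces together, the pad-then-trim idea might be wrapped in a small function along these lines. This is only a sketch: pad_zeros and its width argument are hypothetical names chosen for illustration, not part of base R.

```r
# pad_zeros() is a hypothetical helper, not a base R function
pad_zeros <- function(x, width = 10) {
  s <- as.character(x)
  # Build a string of `width` zeros, as in the P.S. above
  zeros <- paste0(rep(0, width), collapse = "")
  padded <- paste0(zeros, s)
  # Keep only the last `width` characters of each padded string
  substr(padded, nchar(padded) - width + 1, nchar(padded))
}

pad_zeros(c(90, 801, 6457, 92727))
# [1] "0000000090" "0000000801" "0000006457" "0000092727"
```

One design caveat: an input longer than width is cut down to its last width characters rather than left intact, so this only suits data known to fit.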

Upvotes: 2

989

Reputation: 12935

The problem stems from rep(0, 10-nchar(as.character(test$x))), where the second argument (times) is a vector of length 4 while the first argument (x) has length 1. Basically, this throws an error:

rep(0, c(9, 8, 7, 4))

Instead, you should do:

rep(c(0,0,0,0), c(9, 8, 7, 4))

in which the two vectors have the same length.

?rep states that:

If times consists of a single integer, the result consists of the whole input repeated this many times. If times is a vector of the same length as x (after replication by each), the result consists of x[1] repeated times[1] times, x[2] repeated times[2] times and so on.

In our example, x is c(0,0,0,0) and times is c(9, 8, 7, 4).
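
A quick way to see both behaviours side by side is the sketch below. It wraps the failing call in tryCatch() so the script keeps running; the variable names (err, ok, letters_rep) are made up for illustration.

```r
# x has length 1 but times has length 4, so rep() signals an error;
# tryCatch() captures the error message instead of stopping the script
err <- tryCatch(rep(0, c(9, 8, 7, 4)),
                error = function(e) conditionMessage(e))
err

# x and times have the same length: x[i] is repeated times[i] times
ok <- rep(c(0, 0, 0, 0), c(9, 8, 7, 4))
length(ok)
# [1] 28

# The element-wise pairing is easier to see with distinct values
letters_rep <- rep(c("a", "b"), c(2, 3))
letters_rep
# [1] "a" "a" "b" "b" "b"
```

Note that even the working call returns one flat vector of 28 zeros rather than four separate strings, which is why the zeros still need to be collapsed per element.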

You could do:

test$z <- sapply(test$x, function(x) paste0(paste0(rep(0,10-nchar(x)),collapse = ""),x))

#      x    y          z
#1    90 test 0000000090
#2   801 test 0000000801
#3  6457 test 0000006457
#4 92727 test 0000092727

Upvotes: 4
