Inho Lee

Reputation: 127

How to replace numbers with random values in R?

I want to replace the numbers in a data frame column with random digits.

x = c("010-1234-5678",
      "John 010-8888-8888",
      "Phone: 010-1111-2222",
      "Peter 018.1111.3333",
      "Year(2007,2019,2020)",
      "Alice 01077776666")

df = data.frame(
  phoneNumber = x
)

For example, the output I want looks like John 836-3816-9361 (random digits), and the numbers in the other entries (Phone, Peter, and so on) should be replaced the same way.

So far I have only written random <- sample(1:9, 1), and I do not know what the next step is.

Upvotes: 1

Views: 755

Answers (3)

ekoam

Reputation: 8844

You can try this base R approach: extract the phone numbers, overwrite their digits with 11 random digits, and put them back in their original positions. Note that this method leaves the digits in "Year(2007,2019,2020)" unchanged, in case that is the behavior you want.

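rand_mask <- function(x) {
  # Locate the first phone-like match in each string: 3-4-4 digits,
  # with the same optional "-" or "." separator in both positions
  m1 <- regexpr("\\b\\d{3}([-.]?)\\d{4}\\1\\d{4}\\b", x)
  phones <- regmatches(x, m1)
  # Locate every digit inside the extracted phone numbers
  m2 <- gregexpr("\\d", phones)
  # Draw 11 random digits (0-9) for each phone number
  rand <- replicate(length(m2), sample.int(10L, 11L, replace = TRUE) - 1L, simplify = FALSE)
  # Write the digits into the phone numbers, then the phone numbers back into x
  regmatches(phones, m2) <- rand
  regmatches(x, m1) <- phones
  x
}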

Test

> set.seed(1234L)
> rand_mask(x)
[1] "954-8453-1659"        "John 537-3347-3723"   "Phone: 941-7326-8253" "Peter 791.4505.7250"  "Year(2007,2019,2020)"
[6] "Alice 08790795282"   
> rand_mask(x)
[1] "589-6578-2214"        "John 796-5386-2547"   "Phone: 036-2830-5595" "Peter 440.1591.4329"  "Year(2007,2019,2020)"
[6] "Alice 34955828570"   
> rand_mask(x)
[1] "815-7526-4562"        "John 268-7295-6144"   "Phone: 827-2728-3732" "Peter 960.8214.9580"  "Year(2007,2019,2020)"
[6] "Alice 28834025451"  
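
To make the extract-and-put-back step easier to follow, here is a minimal sketch on two strings. It uses a fixed placeholder instead of random digits so the result is predictable; the vector s and the placeholder "XXX-XXXX-XXXX" are illustrative assumptions, not part of the function above.

```r
# Two sample strings: one contains a phone number, one does not
s <- c("John 010-8888-8888", "Year(2007,2019,2020)")

# regexpr() records where the first match starts in each string (-1 if none)
m <- regexpr("\\b\\d{3}([-.]?)\\d{4}\\1\\d{4}\\b", s)

# regmatches() returns only the matched substrings
found <- regmatches(s, m)

# The replacement form writes new text back at the matched positions;
# strings without a match are left untouched
regmatches(s, m) <- "XXX-XXXX-XXXX"
print(s)
```

Only the first string changes, which is why the "Year(...)" entry survives rand_mask() unmodified.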

Update

This one replaces every digit with a random digit.

rand_mask2 <- function(x) {
  m <- gregexpr("\\d", x)
  regmatches(x, m) <- lapply(lengths(m), \(n) sample.int(10L, n, replace = TRUE) - 1L)
  x
}

Test

> set.seed(1234L)
> rand_mask2(x)
[1] "954-8453-1659"        "John 537-3347-3723"   "Phone: 941-7326-8253" "Peter 791.4505.7250"  "Year(0879,0795,2825)"
[6] "Alice 89657822147"   
> rand_mask2(x)
[1] "965-3862-5470"        "John 362-8305-5954"   "Phone: 401-5914-3293" "Peter 495.5828.5708"  "Year(1575,2645,6226)"
[6] "Alice 87295614482"   
> rand_mask2(x)
[1] "727-2837-3296"        "John 082-1495-8028"   "Phone: 834-0254-5121" "Peter 656.7306.9729"  "Year(1707,7594,3930)"
[6] "Alice 55233202082"  
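
Since the question is about a data frame, the sketch below shows one way the same masking could be applied to a column. The helper mask_digits is a restatement of rand_mask2 so that the snippet runs on its own, and the column name masked is an assumption chosen for illustration.

```r
x <- c("010-1234-5678", "John 010-8888-8888", "Phone: 010-1111-2222",
       "Peter 018.1111.3333", "Year(2007,2019,2020)", "Alice 01077776666")
df <- data.frame(phoneNumber = x, stringsAsFactors = FALSE)

# Same idea as rand_mask2: find every digit and overwrite it with a random one
mask_digits <- function(x) {
  m <- gregexpr("[0-9]", x)
  regmatches(x, m) <- lapply(lengths(m), function(n) sample.int(10L, n, replace = TRUE) - 1L)
  x
}

# Store the masked values in a new column so the original stays available
df$masked <- mask_digits(df$phoneNumber)
print(df)
```

The digits differ on every run unless set.seed() is called first, but the separators, names, and string lengths stay the same.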

Upvotes: 1

Donald Seinen

Reputation: 4419

If you want to replace every digit, the gsubfn package can be useful.

library(gsubfn)
gsubfn("[0-9]", \(x) sample(0:9, 1), df$phoneNumber)

This replaces every digit with a random digit. Naturally, this approach works better if the data is tidy. Here, for example, one could object to having names, years, and phone numbers mixed in one column called phoneNumber.
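
As a rough illustration of the tidy-data point, the base R sketch below pulls phone-like substrings into their own column. The pattern phone_pattern and the column names text and number are assumptions made for this example; real data may need a different pattern.

```r
x <- c("010-1234-5678", "John 010-8888-8888", "Phone: 010-1111-2222",
       "Peter 018.1111.3333", "Year(2007,2019,2020)", "Alice 01077776666")

# Assumed pattern: 3-4-4 digits with an optional "-" or "." between groups
phone_pattern <- "[0-9]{3}[-.]?[0-9]{4}[-.]?[0-9]{4}"
m <- regexpr(phone_pattern, x)

# Rows without a phone-like match get NA instead of a number
number <- rep(NA_character_, length(x))
number[m > 0] <- regmatches(x, m)

tidy <- data.frame(text = x, number = number, stringsAsFactors = FALSE)
print(tidy)
```

With the numbers in a dedicated column, a digit replacement applied to that column cannot touch years or other unrelated digits.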

Upvotes: 2

lroha

Reputation: 34441

You can use stringr::str_replace_all() which can apply functions to regex matches.

library(dplyr)
library(stringr)

set.seed(5)

df %>%
  mutate(res = str_replace_all(phoneNumber, "\\d+", \(x) str_pad(sample(10 ^ (nc <- nchar(x)), 1) - 1, nc, pad = "0")))

           phoneNumber                  res
1        010-1234-5678        888-6858-2254
2   John 010-8888-8888   John 221-3796-1832
3 Phone: 010-1111-2222 Phone: 402-1526-7238
4  Peter 018.1111.3333  Peter 825.3877.4482
5 Year(2007,2019,2020) Year(3599,9035,9012)
6    Alice 01077776666    Alice 90000945625

Upvotes: 3
