Reputation: 127
I want to replace the digits in a data frame column with random digits.
x = c("010-1234-5678",
"John 010-8888-8888",
"Phone: 010-1111-2222",
"Peter 018.1111.3333",
"Year(2007,2019,2020)",
"Alice 01077776666")
df = data.frame(
phoneNumber = x
)
For example, John 836-3816-9361 is the kind of output (random digits) that I want for the second row, and I want to change the numbers in the other rows (Phone:, Peter, and so on) in the same way.
So far I have only typed random <- sample(1:9,1), and I do not know the next step.
Upvotes: 1
Views: 755
Reputation: 8844
You can try this base R approach: extract the phone numbers, replace their 11 digits with random ones, and then put them back in their original positions. Note that this method leaves the digits in "Year(2007,2019,2020)" unchanged, in case that is the behavior you want.
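rand_mask <- function(x) {
  # locate the first phone-like match per string: 3-4-4 digits with a consistent "-", "." or no separator
  m1 <- regexpr("\\b\\d{3}([-.]?)\\d{4}\\1\\d{4}\\b", x)
  phones <- regmatches(x, m1)
  # locate every digit inside the extracted phone numbers
  m2 <- gregexpr("\\d", phones)
  # one vector of 11 random digits (0-9) per phone number
  rand <- replicate(length(m2), sample.int(10L, 11L, replace = TRUE) - 1L, simplify = FALSE)
  regmatches(phones, m2) <- rand
  # write the masked numbers back into the original strings
  regmatches(x, m1) <- phones
  x
}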
Test
> set.seed(1234L)
> rand_mask(x)
[1] "954-8453-1659" "John 537-3347-3723" "Phone: 941-7326-8253" "Peter 791.4505.7250" "Year(2007,2019,2020)"
[6] "Alice 08790795282"
> rand_mask(x)
[1] "589-6578-2214" "John 796-5386-2547" "Phone: 036-2830-5595" "Peter 440.1591.4329" "Year(2007,2019,2020)"
[6] "Alice 34955828570"
> rand_mask(x)
[1] "815-7526-4562" "John 268-7295-6144" "Phone: 827-2728-3732" "Peter 960.8214.9580" "Year(2007,2019,2020)"
[6] "Alice 28834025451"
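To see what the phone pattern in rand_mask accepts, here is a small check, offered as a sketch rather than as part of the answer above. It assumes perl = TRUE (my choice, so that the backreference \\1 is handled by PCRE) and adds one hypothetical string with mixed separators, "010-1234.5678", which is not in the question's data.

```r
# Sample strings from the question plus one hypothetical mixed-separator case
samples <- c("010-1234-5678",
             "Peter 018.1111.3333",
             "Year(2007,2019,2020)",
             "Alice 01077776666",
             "010-1234.5678")  # hypothetical: "-" followed by "."

# 3-4-4 digits; \\1 forces the second separator to equal the first (or both to be absent)
pattern <- "\\b\\d{3}([-.]?)\\d{4}\\1\\d{4}\\b"

# perl = TRUE is an assumption of this sketch, not something the answer uses
is_phone <- grepl(pattern, samples, perl = TRUE)
print(setNames(is_phone, samples))
```

The year list and the mixed-separator string are not matched, which is consistent with rand_mask leaving "Year(2007,2019,2020)" untouched.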
Update
This version replaces every digit, including those in "Year(2007,2019,2020)", with a random digit.
rand_mask2 <- function(x) {
  m <- gregexpr("\\d", x)
  regmatches(x, m) <- lapply(lengths(m), \(n) sample.int(10L, n, replace = TRUE) - 1L)
  x
}
Test
> set.seed(1234L)
> rand_mask2(x)
[1] "954-8453-1659" "John 537-3347-3723" "Phone: 941-7326-8253" "Peter 791.4505.7250" "Year(0879,0795,2825)"
[6] "Alice 89657822147"
> rand_mask2(x)
[1] "965-3862-5470" "John 362-8305-5954" "Phone: 401-5914-3293" "Peter 495.5828.5708" "Year(1575,2645,6226)"
[6] "Alice 87295614482"
> rand_mask2(x)
[1] "727-2837-3296" "John 082-1495-8028" "Phone: 834-0254-5121" "Peter 656.7306.9729" "Year(1707,7594,3930)"
[6] "Alice 55233202082"
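To apply the all-digits idea to the data frame from the question and confirm that only digits changed, something like the following could be used. mask_digits and layout are hypothetical helper names introduced for this sketch; mask_digits repeats the approach of rand_mask2 so that the block runs on its own, and it assumes every string contains at least one digit.

```r
# Hypothetical helper: same idea as rand_mask2, redefined so this block is self-contained.
# Assumes every element of x contains at least one digit.
mask_digits <- function(x) {
  m <- gregexpr("[0-9]", x)
  regmatches(x, m) <- lapply(lengths(m), function(n) {
    as.character(sample(0:9, n, replace = TRUE))
  })
  x
}

df <- data.frame(
  phoneNumber = c("010-1234-5678", "John 010-8888-8888", "Phone: 010-1111-2222",
                  "Peter 018.1111.3333", "Year(2007,2019,2020)", "Alice 01077776666"),
  stringsAsFactors = FALSE
)

set.seed(1)  # only for reproducibility of the sketch
df$masked <- mask_digits(df$phoneNumber)

# Hypothetical helper: reduce a string to its layout by turning every digit into "#"
layout <- function(s) gsub("[0-9]", "#", s)

# Names, separators and punctuation should be identical before and after masking
same_layout <- identical(layout(df$phoneNumber), layout(df$masked))
print(same_layout)
```

Comparing layouts rather than values keeps the check independent of the random seed.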
Upvotes: 1
Reputation: 4419
If you want to replace all digits, the gsubfn package can be useful.
library(gsubfn)
gsubfn("[0-9]", \(x) sample(0:9, 1), df$phoneNumber)
This replaces every digit with a random digit. Naturally, this approach works better if the data is tidy; here, for example, one could object to having names, years, and phone numbers mixed in one column called phoneNumber.
Upvotes: 2
Reputation: 34441
You can use stringr::str_replace_all(), which accepts a function to apply to regex matches.
library(dplyr)
library(stringr)
set.seed(5)
df %>%
  # use the phoneNumber column rather than the global vector x
  mutate(res = str_replace_all(phoneNumber, "\\d+", \(x) str_pad(sample(10 ^ (nc <- nchar(x)), 1) - 1, nc, pad = "0")))
           phoneNumber                  res
1        010-1234-5678        888-6858-2254
2   John 010-8888-8888   John 221-3796-1832
3 Phone: 010-1111-2222 Phone: 402-1526-7238
4  Peter 018.1111.3333  Peter 825.3877.4482
5 Year(2007,2019,2020) Year(3599,9035,9012)
6    Alice 01077776666    Alice 90000945625
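As a variation on the replacement function, and not the method used in this answer, the random string for each run of digits could be built one digit at a time, so that no large intermediate number or zero-padding is involved. random_digits is a hypothetical helper name; the sketch uses base R only and is written to accept a vector of lengths.

```r
# Hypothetical helper: for each length in n, return a string of that many random digits
random_digits <- function(n) {
  vapply(n, function(k) {
    paste(sample(0:9, k, replace = TRUE), collapse = "")
  }, character(1))
}

set.seed(5)  # only for reproducibility of the sketch
lens <- c(3L, 4L, 4L, 11L)
out <- random_digits(lens)
print(out)
```

Such a helper could then be called on nchar() of each match inside a replacement function.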
Upvotes: 3