I_AM_JARROD

Reputation: 685

Replace numbers in first column of a dataframe with NA

I have a dataframe where the first column is always of type chr. This can't change, as the column contains a mix of numbers and other text.

I need a way to identify the values in the first column that are numbers and replace them with NA.

Input dataframe:

    example_df_before <- data.frame(
      myNums = c("A","TEXT",1,2,3,4,5,6,9,8,4),
      myChars = c("Adesc","Bdesc","C","Ddes","Ec","F","G",99,12,11,"TEST2"),
      stringsAsFactors = FALSE
    ) 

   myNums myChars
1       A   Adesc
2    TEXT   Bdesc
3       1       C
4       2    Ddes
5       3      Ec
6       4       F
7       5       G
8       6      99
9       9      12
10      8      11
11      4   TEST2

Desired output dataframe:

    example_df_after <- data.frame(
      myNums = c("A","TEXT",NA,NA,NA,NA,NA,NA,NA,NA,NA),
      myChars = c("Adesc","Bdesc","C","Ddes","Ec","F","G",99,12,11,"TEST2"),
      stringsAsFactors = FALSE
    ) 

   myNums myChars
1       A   Adesc
2    TEXT   Bdesc
3    <NA>       C
4    <NA>    Ddes
5    <NA>      Ec
6    <NA>       F
7    <NA>       G
8    <NA>      99
9    <NA>      12
10   <NA>      11
11   <NA>   TEST2

Upvotes: 1

Views: 49

Answers (3)

thelatemail

Reputation: 93803

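You can also rely on as.numeric to identify numbers by trying to coerce the text to numeric: anything that cannot be parsed becomes NA (with a coercion warning), so whatever is not NA afterwards was a number. This has the side benefit of recognising values like "1e6", which represents 1 million.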

as.numeric("1e6")+1
#[1] 1000001

example_df_before$myNums[!is.na(as.numeric(example_df_before$myNums))] <- NA
example_df_before
#   myNums myChars
#1       A   Adesc
#2    TEXT   Bdesc
#3    <NA>       C
#4    <NA>    Ddes
#5    <NA>      Ec
#6    <NA>       F
#7    <NA>       G
#8    <NA>      99
#9    <NA>      12
#10   <NA>      11
#11   <NA>   TEST2
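
To see where the coercion approach and a digits-only regex disagree, here is a minimal sketch on a made-up vector (the values in `x` are my own illustration, not from the question). `suppressWarnings()` is only there to hide the "NAs introduced by coercion" warning that `as.numeric` raises for the non-numeric strings.

```r
# Hypothetical test vector, not taken from the question
x <- c("A", "TEXT", "12", "3.5", "-2", "1e6", "12abc")

# Coercion approach: anything as.numeric() can parse counts as a number.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
is_num_coerce <- !is.na(suppressWarnings(as.numeric(x)))

# Regex approach: only strings made up entirely of digits count.
is_num_regex <- grepl("^\\d+$", x)

# Side-by-side comparison of the two tests
data.frame(x, is_num_coerce, is_num_regex)
```

Both tests agree on "12", but only the coercion approach also flags "3.5", "-2" and "1e6". Which behaviour is right depends on what should count as a number in your data.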

Upvotes: 0

moodymudskipper

Reputation: 47300

In base R that would be:

example_df_before$myNums[grepl("^\\d+$",example_df_before$myNums)] <- NA
example_df_before
#    myNums myChars
# 1       A   Adesc
# 2    TEXT   Bdesc
# 3    <NA>       C
# 4    <NA>    Ddes
# 5    <NA>      Ec
# 6    <NA>       F
# 7    <NA>       G
# 8    <NA>      99
# 9    <NA>      12
# 10   <NA>      11
# 11   <NA>   TEST2
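
Note that "^\\d+$" only matches strings made up entirely of digits. If the column could also hold negative or decimal numbers (an assumption, the question only shows whole numbers), a slightly wider pattern might look like the sketch below. The vector `x` and the name `pattern` are hypothetical.

```r
# Hypothetical values, not from the question
x <- c("7", "-7", "7.5", "7.", "1e6", "A7")

# Optional leading minus, digits, then optionally a dot followed by digits
pattern <- "^-?\\d+(\\.\\d+)?$"

matches <- grepl(pattern, x)
x[matches] <- NA
x
```

Here "7", "-7" and "7.5" are replaced, while "7.", "1e6" and "A7" are left alone.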

Upvotes: 2

akrun

Reputation: 886938

We can replace the values in 'myNums' by detecting the strings that consist only of digits:

library(tidyverse)
example_df_before %>% 
  mutate(myNums = replace(myNums, str_detect(myNums, "^\\d+$"), NA)) 
#   myNums myChars
#1       A   Adesc
#2    TEXT   Bdesc
#3    <NA>       C
#4    <NA>    Ddes
#5    <NA>      Ec
#6    <NA>       F
#7    <NA>       G
#8    <NA>      99
#9    <NA>      12
#10   <NA>      11
#11   <NA>   TEST2

Or, using base R:

is.na(example_df_before$myNums) <- grepl("^\\d+$", example_df_before$myNums)
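
The replacement form `is.na(x) <- value` sets the elements of `x` to NA wherever `value` is TRUE. A minimal sketch on a small made-up vector (not the question's data):

```r
# Hypothetical vector for illustration
x <- c("A", "1", "B", "22")

# grepl() returns TRUE for the all-digit strings;
# `is.na<-` then sets exactly those elements to NA
is.na(x) <- grepl("^\\d+$", x)

# x is now c("A", NA, "B", NA)
x
```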

Upvotes: 0
