Reputation: 87
I have a list of emails that I want to clean. If an email does not contain the '@' character, I want to remove it, so that an input like 'mywebsite.com' is removed.
My code is as follows:
email_clean <- function(email, invalid = NA){
email <- trimws(email) # Removes whitespace
email[(nchar(email) %in% c(1,2)) ] <- invalid # Removes emails with 1 or 2 character length
bad_email <- c("\\@no.com", "\\@na.com","\\@none.com","\\@email.com", # List of bad emails - modify to the
"\\@noemail.com", "\\@test.com") # specifications of the request
pattern = paste0("(?i)\\b",paste0(bad_email,collapse="\\b|\\b"),"\\b") # Deletes names matching bad email
email <-gsub(pattern, invalid, sapply(email,as.character))
unname(email)
}
## Define vector of cleaned emails from the original csv column
Cleaned_Email <- email_clean(my_data$Email)
## Binds cleaned emails to the csv data
my_data<-cbind(my_data,Cleaned_Email)
Thanks!!
Upvotes: 0
Views: 998
Reputation: 28441
email_clean <- function(email, invalid = NA){
email <- trimws(email) # Removes whitespace
email[(nchar(email) %in% c(1,2)) ] <- invalid # Removes emails with 1 or 2 character length
email[!grepl("@", email)] <- invalid # <------------------ New line added here ------------
bad_email <- c("\\@no.com", "\\@na.com","\\@none.com","\\@email.com", # List of bad emails - modify to the
"\\@noemail.com", "\\@test.com") # specifications of the request
pattern = paste0("(?i)\\b",paste0(bad_email,collapse="\\b|\\b"),"\\b") # Deletes names matching bad email
email <-gsub(pattern, invalid, sapply(email,as.character))
unname(email)
}
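To see what the added line does on its own, here is a minimal sketch on a made-up vector (the sample addresses are hypothetical, not taken from the question's data). grepl() returns one TRUE/FALSE per element, so negating it selects the entries without an '@'. The fixed = TRUE argument is an optional addition, not part of the line above; it treats '@' as a literal string rather than a regular expression.

```r
# Hypothetical sample input, for illustration only
email <- c("user@example.org", "mywebsite.com", " someone@test.com ", "ab")
email <- trimws(email)

# TRUE where the entry contains "@"; fixed = TRUE matches it literally
has_at <- grepl("@", email, fixed = TRUE)

# Replace entries without "@" by NA, keeping the vector length unchanged
email[!has_at] <- NA
print(email)
```

Because the entries are replaced rather than dropped, the result still lines up row by row with the original column, which is what the cbind() step in the question relies on.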
Upvotes: 3
Reputation: 10483
Try this to exclude any rows in my_data that don't have '@' sign in the Email column:
my_data <- my_data[grep('@', my_data$Email), ]
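A variant of the same idea, sketched on a hypothetical stand-in for my_data, uses grepl() with logical indexing instead of grep(). The drop = FALSE argument is an assumption about what you want: without it, subsetting a data frame that has only one column returns a plain vector instead of a data frame.

```r
# Hypothetical data frame standing in for my_data
my_data <- data.frame(
  Email = c("user@example.org", "mywebsite.com", NA, "someone@test.com"),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE per row; NA inputs yield FALSE, so those rows are dropped too
keep <- grepl("@", my_data$Email, fixed = TRUE)

# Keep only the rows whose Email contains "@", preserving the data frame shape
my_data <- my_data[keep, , drop = FALSE]
print(my_data)
```

Note that this removes whole rows, whereas the function in the other answer keeps every row and marks the invalid emails as NA.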
Upvotes: 0