Frank B.

Reputation: 1873

Using Characters in String to Create New Variables

Data I'm scraping off the web uses the * character to denote one thing and + to denote another.

Here's an example of what it looks like:

# Original Data
original_df <- data.frame(c("Randy Watson*+", "Cleo McDowell*", "Darryl Jenks"))
names(original_df) <- 'nameinfo'

original_df

I want to transform the data to look like this output:

# What I want the Data to look like
name <- c("Randy Watson", "Cleo McDowell", "Darryl Jenks")
this_thing <- c("1", "1", "0")
that_thing <- c("1", "0", "0")
desired_df <- data.frame(name, this_thing, that_thing)

desired_df

I basically want to use the presence of * to set one flag variable and + to set another, then strip the * and + from the nameinfo field and use the result as the new name variable.

Thanks.

Upvotes: 0

Views: 161

Answers (2)

lawyeR

Reputation: 7654

Here is a different approach, using the character class [:punct:] and a single gsub call to strip the markers:

original_df <- data.frame(c("Randy Watson*+", "Cleo McDowell*", "Darryl Jenks"))
names(original_df) <- 'nameinfo'
original_df$this_thing <- c("1", "1", "0")
original_df$that_thing <- c("1", "0", "0")
original_df$nameinfo <- gsub("[[:punct:]]", "", original_df$nameinfo)
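The flag columns above are typed in by hand, which only works for this three-row example. A minimal sketch of deriving them from the data instead, assuming the flags have to be computed before the markers are stripped and that character "1"/"0" values are wanted as in the question:

```r
# Sample data from the question
original_df <- data.frame(
  nameinfo = c("Randy Watson*+", "Cleo McDowell*", "Darryl Jenks"),
  stringsAsFactors = FALSE
)

# Derive the flags first, while * and + are still present.
# fixed = TRUE matches the characters literally, so no escaping is needed.
has_star <- grepl("*", original_df$nameinfo, fixed = TRUE)
has_plus <- grepl("+", original_df$nameinfo, fixed = TRUE)
original_df$this_thing <- as.character(as.integer(has_star))
original_df$that_thing <- as.character(as.integer(has_plus))

# Only now strip the punctuation from the names
original_df$nameinfo <- gsub("[[:punct:]]", "", original_df$nameinfo)
```

Note that [[:punct:]] removes every punctuation character, so an apostrophe or hyphen inside a name would be stripped as well.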

Upvotes: 0

Tyler Rinker

Reputation: 109844

grepl will work well here:

original_df$this_thing <- grepl("\\*", original_df$nameinfo)
original_df$that_thing <- grepl("\\+", original_df$nameinfo)
original_df$nameinfo <- gsub("\\*|\\+", "", original_df$nameinfo)
original_df

##        nameinfo this_thing that_thing
## 1  Randy Watson       TRUE       TRUE
## 2 Cleo McDowell       TRUE      FALSE
## 3  Darryl Jenks      FALSE      FALSE
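The result above holds logical flags in a column still called nameinfo, while the question asks for "1"/"0" strings and a name column. A possible follow-up sketch, assuming that exact layout is wanted (the helper vector flag_cols is introduced here only for illustration):

```r
# Rebuild the example so this snippet runs on its own
df <- data.frame(
  nameinfo = c("Randy Watson*+", "Cleo McDowell*", "Darryl Jenks"),
  stringsAsFactors = FALSE
)
df$this_thing <- grepl("\\*", df$nameinfo)
df$that_thing <- grepl("\\+", df$nameinfo)

# Inside a bracket expression * and + are literal, so no escaping is needed
df$nameinfo <- gsub("[*+]", "", df$nameinfo)

# Convert TRUE/FALSE to "1"/"0" to match the desired output
flag_cols <- c("this_thing", "that_thing")
df[flag_cols] <- lapply(df[flag_cols], function(x) as.character(as.integer(x)))

# Rename the cleaned column
names(df)[names(df) == "nameinfo"] <- "name"
```

Keeping the flags as logicals is usually more convenient for filtering; the conversion is only needed if downstream code expects the character form.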

Upvotes: 2
