Reputation: 11
RULES
{Denny Frying Pan} => {Denny C-Size Batteries}
{Denny Scented Tissue} => {Denny Paper Plates}
{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}
{Denny Plastic Forks} => {Golden Frozen Peas}
{Denny Frying Pan} => {Denny D-Size Batteries}
{Denny Plastic Forks} => {Faux Products Apricot Shampoo}
{Golden Frozen Peas} => {Denny Plastic Forks}
{Faux Products Apricot Shampoo} => {Denny Plastic Forks}
{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}
{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}
{Denny D-Size Batteries} => {Denny Frying Pan}
I have a data frame with a single column, RULES, as shown above. I want to split each rule into an LHS and an RHS.
LHS should contain the characters enclosed in the {} before the =>, and RHS should contain the characters enclosed in the {} after the =>.
How can this be done in R?
Upvotes: 0
Views: 107
Reputation: 109844
Here's an approach with qdapRegex, a package that I maintain:
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
"{Denny Scented Tissue} => {Denny Paper Plates}",
"{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}",
"{Denny Plastic Forks} => {Golden Frozen Peas}",
"{Denny Frying Pan} => {Denny D-Size Batteries}",
"{Denny Plastic Forks} => {Faux Products Apricot Shampoo}",
"{Golden Frozen Peas} => {Denny Plastic Forks}",
"{Faux Products Apricot Shampoo} => {Denny Plastic Forks}",
"{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}",
"{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}",
"{Denny D-Size Batteries} => {Denny Frying Pan}")
library(qdapRegex)
setNames(do.call(rbind.data.frame, rm_curly(RULES, extract=TRUE)), c("LHS", "RHS"))
##                                LHS                             RHS
## 1                 Denny Frying Pan          Denny C-Size Batteries
## 2             Denny Scented Tissue              Denny Paper Plates
## 3    Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
## 4              Denny Plastic Forks              Golden Frozen Peas
## 5                 Denny Frying Pan          Denny D-Size Batteries
## 6              Denny Plastic Forks   Faux Products Apricot Shampoo
## 7               Golden Frozen Peas             Denny Plastic Forks
## 8    Faux Products Apricot Shampoo             Denny Plastic Forks
## 9  Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
## 10  Blue Label Canned String Beans  Faux Products Buffered Aspirin
## 11          Denny D-Size Batteries                Denny Frying Pan
We extract the text between the curly braces and then use do.call + rbind.data.frame to coerce the result to a data.frame; setNames supplies the LHS and RHS column names.
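For comparison, the same extract-then-bind idea can be sketched in base R without qdapRegex. This is only an illustration under assumptions: the two-rule RULES vector is a shortened copy of the data above, the name split_rules is made up for the example, and the regular expression assumes that braces never nest.

```r
# Shortened copy of the rules, for illustration only
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
           "{Golden Frozen Peas} => {Denny Plastic Forks}")

# gregexpr() locates every {...} group; regmatches() returns the matched
# text as a list with one character vector per rule.
groups <- regmatches(RULES, gregexpr("\\{[^{}]*\\}", RULES, perl = TRUE))

# Drop the enclosing braces from each match.
groups <- lapply(groups, function(x) gsub("[{}]", "", x))

# Row-bind the per-rule vectors and name the two columns.
split_rules <- setNames(
  as.data.frame(do.call(rbind, groups), stringsAsFactors = FALSE),
  c("LHS", "RHS")
)
split_rules
```

A rule with more or fewer than two brace groups would not fit the two-column shape, so it may be worth checking lengths(groups) before binding.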
Upvotes: 0
Reputation: 193517
You can try one of the following. Both assume that you're starting with a character vector named "rules". If "rules" is already a column in your data.frame, you would need some slight modification.
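library(splitstackshape)
library(data.table)  # data.table() is called directly below
library(dplyr)       # provides %>%
# Replace the arrow with a tab, drop the braces, then split on the tab
data.table(rules = gsub("[{}]", "", gsub("=>", "\t", rules))) %>%
  cSplit("rules", "\t")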
#                             rules_1                         rules_2
#  1:                Denny Frying Pan          Denny C-Size Batteries
#  2:            Denny Scented Tissue              Denny Paper Plates
#  3:   Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
#  4:             Denny Plastic Forks              Golden Frozen Peas
#  5:                Denny Frying Pan          Denny D-Size Batteries
#  6:             Denny Plastic Forks   Faux Products Apricot Shampoo
#  7:              Golden Frozen Peas             Denny Plastic Forks
#  8:   Faux Products Apricot Shampoo             Denny Plastic Forks
#  9: Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
# 10:  Blue Label Canned String Beans  Faux Products Buffered Aspirin
# 11:          Denny D-Size Batteries                Denny Frying Pan
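# Alternative with dplyr and tidyr
library(dplyr)
library(tidyr)
data.frame(rules) %>%
  mutate(rules = gsub("\\s+=>\\s+", "=>", rules)) %>%  # trim spaces around the arrow
  mutate(rules = gsub("[{}]", "", rules)) %>%          # drop the braces
  separate(rules, into = c("V1", "V2"), sep = "=>")    # split into two columns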
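If the rules already sit in a data.frame column, as in the question, one possible modification is to add the two new columns in place. The sketch below uses base R only and rests on assumptions: the data frame df and its two sample rows are made up for illustration, the column is taken to be named RULES, and every rule is taken to follow the exact "{...} => {...}" layout.

```r
# Hypothetical starting point: rules stored in a data.frame column named RULES
df <- data.frame(
  RULES = c("{Denny Frying Pan} => {Denny C-Size Batteries}",
            "{Golden Frozen Peas} => {Denny Plastic Forks}"),
  stringsAsFactors = FALSE
)

# Capture the text inside the first and the second pair of braces.
pattern <- "^\\{(.*)\\} => \\{(.*)\\}$"
df$LHS <- sub(pattern, "\\1", df$RULES, perl = TRUE)
df$RHS <- sub(pattern, "\\2", df$RULES, perl = TRUE)

df[, c("LHS", "RHS")]
```

Rows that do not match the pattern are returned unchanged by sub(), so a mismatch shows up as the full rule text in both new columns.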
Upvotes: 0
Reputation: 19867
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
"{Denny Scented Tissue} => {Denny Paper Plates}",
"{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}",
"{Denny Plastic Forks} => {Golden Frozen Peas}",
"{Denny Frying Pan} => {Denny D-Size Batteries}",
"{Denny Plastic Forks} => {Faux Products Apricot Shampoo}",
"{Golden Frozen Peas} => {Denny Plastic Forks}",
"{Faux Products Apricot Shampoo} => {Denny Plastic Forks}",
"{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}",
"{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}",
"{Denny D-Size Batteries} => {Denny Frying Pan}")
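# Split each rule on the literal separator between the two brace groups
df <- as.data.frame(do.call(rbind, strsplit(RULES, "} => {", fixed = TRUE)))
# Remove the leading "{" from the LHS and the trailing "}" from the RHS
df[, 1] <- gsub("{", "", df[, 1], fixed = TRUE)
df[, 2] <- gsub("}", "", df[, 2], fixed = TRUE)
df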
                                V1                              V2
1                 Denny Frying Pan          Denny C-Size Batteries
2             Denny Scented Tissue              Denny Paper Plates
3    Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
4              Denny Plastic Forks              Golden Frozen Peas
5                 Denny Frying Pan          Denny D-Size Batteries
6              Denny Plastic Forks   Faux Products Apricot Shampoo
7               Golden Frozen Peas             Denny Plastic Forks
8    Faux Products Apricot Shampoo             Denny Plastic Forks
9  Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
10  Blue Label Canned String Beans  Faux Products Buffered Aspirin
11          Denny D-Size Batteries                Denny Frying Pan
Upvotes: 1