Reputation: 21
I have a data frame (df1) scraped as a single column of data.
1
2 Amazon Pantry
3 Best Sellerin Soaps & Hand Wash
4
5 Palmolive Hygiene-Plus Sensitive Liquid Hand Wash, 300ml
6 Palmolive Hygiene-Plus Sensitive Liquid Hand Wash, 300ml
7 £0.90
8 ?
9
10 Palmolive Naturals Nourishing Liquid Hand Wash, 300ml
11 Palmolive Naturals Nourishing Liquid Hand Wash, 300ml
12 £0.90
13 ?
14
15 L'Oreal Men Expert Carbon Protect Deodorant 250ml
16 L'Oreal Men Expert Carbon Protect Deodorant 250ml
17 £1.50
To clean the data, I tried the commands below to get the product and pricing information into two separate columns. Can someone let me know if there is an alternative way of doing it?
install.packages("splitstackshape")
library(splitstackshape)  # cSplit() is not available until the package is attached
newdf <- cSplit(df1, "Amazon_Normal_Text2", direction = "long")
Upvotes: 0
Views: 97
Reputation: 937
This is just a thought process:
Look for "ml" in the string, extract the text ending at "ml" going backward until there is a space, and store that in a volume variable (substr).
Look for "£", take everything from there to the end of the string, and store that in a price variable (grep, regex, nchar).
Look into substr, nchar, grep, and regex.
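The idea above could be sketched in base R roughly as follows. This is only an illustration under a few assumptions: the column name Amazon_Normal_Text2 is taken from the cSplit call in the question, the sample rows are retyped from the question, and each price line is assumed to sit directly below its product line. The names x, price_idx, product, price, volume, and result are made up for the example.

```r
# Sample data retyped from the question (assumed column name).
df1 <- data.frame(
  Amazon_Normal_Text2 = c(
    "", "Amazon Pantry", "Best Sellerin Soaps & Hand Wash", "",
    "Palmolive Hygiene-Plus Sensitive Liquid Hand Wash, 300ml",
    "Palmolive Hygiene-Plus Sensitive Liquid Hand Wash, 300ml",
    "£0.90", "?", "",
    "Palmolive Naturals Nourishing Liquid Hand Wash, 300ml",
    "Palmolive Naturals Nourishing Liquid Hand Wash, 300ml",
    "£0.90", "?", "",
    "L'Oreal Men Expert Carbon Protect Deodorant 250ml",
    "L'Oreal Men Expert Carbon Protect Deodorant 250ml",
    "£1.50"
  ),
  stringsAsFactors = FALSE
)

x <- trimws(df1$Amazon_Normal_Text2)

# Price rows are the ones that start with the pound sign.
price_idx <- which(startsWith(x, "£"))

# Assumption: the product name is the line directly above each price.
product <- x[price_idx - 1]

# Drop the pound sign and convert the rest to a number.
price <- as.numeric(sub("£", "", x[price_idx], fixed = TRUE))

# Volume: digits followed by "ml" at the end of the product name.
m <- regexpr("[0-9]+ ?ml$", product)
volume <- rep(NA_character_, length(product))
volume[m > 0] <- regmatches(product, m)

result <- data.frame(
  Product = product,
  Price = price,
  Volume = volume,
  stringsAsFactors = FALSE
)
print(result)
```

If the scraped page does not always place the price right after the product name, the price_idx - 1 step would need to be replaced with a more careful lookup.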
Upvotes: 0