great s33ker

Reputation: 1

Extracting specific string from an object

I am currently working on a project where I have a list and I need to extract only a specific string from the entire object.

As an example, the data for one of the objects is as follows:

"list(content = \"cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant\", meta = list(author = character(0), datetimestamp = list(sec = 42.583606004715, min = 39, hour = 11, mday = 15, mon = 7, year = 121, wday = 0, yday = 226, isdst = 0), description = character(0), heading = character(0), id = character(0), language = character(0), origin = character(0)))"

The aim is to be able to extract only the following string:

cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant

After reading some documentation and some trial and error, str_subset or str_sub looked like the best approach, but neither returns the required result.

For example, I tried str_subset(string, pattern) with a pattern that matches text starting with " and ending with ", but I get a message that NAs were introduced by coercion.

If anyone has suggestions on the best approach for extracting specific data with a pattern that begins with " and ends with ", that would be great.

Thanks

Upvotes: 0

Views: 85

Answers (1)

r2evans

Reputation: 160637

Here's the literal answer to your question:

string <- "list(content = \"cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant\", meta = list(author = character(0), datetimestamp = list(sec = 42.583606004715, min = 39, hour = 11, mday = 15, mon = 7, year = 121, wday = 0, yday = 226, isdst = 0), description = character(0), heading = character(0), id = character(0), language = character(0), origin = character(0)))"
regmatches(string, gregexpr('(?<=")[^"]*(?=")', string, perl = TRUE))
# [[1]]
# [1] "cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant"
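Since the question mentions stringr, here is a minimal sketch of the same extraction with that package, assuming stringr is installed. It also shows why str_subset did not help: it keeps or drops whole elements of a character vector and never trims them, whereas str_extract returns only the matched text. The string is shortened here for readability, and the names whole and extracted are just illustrative.

```r
library(stringr)

# Shortened version of the string from the question (meta trimmed for brevity).
string <- "list(content = \"cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant\", meta = list(author = character(0)))"

# str_subset() filters: it returns every element that contains a match,
# unchanged, so here the whole string comes back.
whole <- str_subset(string, '"[^"]*"')

# str_extract() returns only the first matched portion. The lookbehind and
# lookahead keep the surrounding double quotes out of the result.
extracted <- str_extract(string, '(?<=")[^"]*(?=")')
extracted
```

If the string could contain more than one quoted value, str_extract_all would return all of them, much like gregexpr does above.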

Though I'll state again: this fixes the symptom, not what I believe is the underlying problem. Somehow you (or somebody) had a legitimate R list object that some mistaken code converted into its string representation, and you are now trying to salvage data from that string.

A different way to extract from this string (since it is an R expression) is to convert it back into an R object:

obj <- eval(parse(text=string))
obj$content
# [1] "cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant"
str(obj)
# List of 2
#  $ content: chr "cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant"
#  $ meta   :List of 7
#   ..$ author       : chr(0) 
#   ..$ datetimestamp:List of 9
#   .. ..$ sec  : num 42.6
#   .. ..$ min  : num 39
#   .. ..$ hour : num 11
#   .. ..$ mday : num 15
#   .. ..$ mon  : num 7
#   .. ..$ year : num 121
#   .. ..$ wday : num 0
#   .. ..$ yday : num 226
#   .. ..$ isdst: num 0
#   ..$ description  : chr(0) 
#   ..$ heading      : chr(0) 
#   ..$ id           : chr(0) 
#   ..$ language     : chr(0) 
#   ..$ origin       : chr(0) 

However, eval(parse(..)) should be used with caution: it runs whatever code the string contains, so it can cause as many problems as it solves.
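One way to reduce that risk is to evaluate the string in an environment that exposes only the functions the expression is expected to call. This is a rough sketch, not a complete sandbox: safe_env and blocked are names made up for the illustration, the string is shortened, and the allow-list (list and character) is an assumption based on the sample data.

```r
# Shortened version of the string from the question.
string <- "list(content = \"cskarelli haunting photo ritsopi panayiota reacts fire reach home village gouves evia konstant\", meta = list(author = character(0), datetimestamp = list(sec = 42.583606004715, min = 39)))"

# An environment whose parent is the empty environment: nothing is visible
# inside it except what is assigned explicitly.
safe_env <- new.env(parent = emptyenv())
assign("list", base::list, envir = safe_env)
assign("character", base::character, envir = safe_env)

# parse() only builds the expression; eval() runs it with safe_env as the
# place where function names are looked up.
obj <- eval(parse(text = string), envir = safe_env)
obj$content

# Any other call fails because the function cannot be found in safe_env.
blocked <- tryCatch(
  eval(parse(text = "system(\"echo hi\")"), envir = safe_env),
  error = function(e) conditionMessage(e)
)
blocked
```

This only helps when the strings have a known, narrow shape. Fixing the code that turned the list into a string in the first place remains the better option.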

Upvotes: 1
