Reputation: 427
I have data as below:
[{'cast_id': 17, 'character': 'Albert Einstein', 'credit_id': '52fe43039251416c75000093', 'gender': 2, 'id': 61797, 'name': 'Yahoo Serious', 'order': 0, 'profile_path': '/pe2eKvE4PpdMhmNCIylDjoQhg2o.jpg'}, {'cast_id': 18, 'character': 'Marie Curie', 'credit_id': '52fe43039251416c75000097', 'gender': 0, 'id': 61808, 'name': 'Odile Le Clezio', 'order': 1, 'profile_path': None}, {'cast_id': 19, 'character': 'Mr. Einstein', 'credit_id': '52fe43039251416c7500009b', 'gender': 0, 'id': 61809, 'name': 'Peewee Wilson', 'order': 2, 'profile_path': None}, {'cast_id': 20, 'character': 'Mrs. Einstein', 'credit_id': '52fe43039251416c7500009f', 'gender': 0, 'id': 61810, 'name': 'Su Cruickshank', 'order': 3, 'profile_path': None}, {'cast_id': 21, 'character': 'Preston Preston', 'credit_id': '52fe43039251416c750000a3', 'gender': 2, 'id': 102603, 'name': 'John Howard', 'order': 4, 'profile_path': '/id5ucdglU6oPIibTzrLrtWBxTbw.jpg'}, {'cast_id': 22, 'character': "Darwin's Bodyguard", 'credit_id': '5525ab4692514172760024e2', 'gender': 2, 'id': 1451740, 'name': 'Christian Manon', 'order': 5, 'profile_path': '/800kuPsHOsFpCdHNaiV31xTQcQJ.jpg'}]
I need to extract everything to the left of (and including) the 2nd occurrence of the character '}' and append a ']' at the end, something like this:
[{'cast_id': 17, 'character': 'Albert Einstein', 'credit_id': '52fe43039251416c75000093', 'gender': 2, 'id': 61797, 'name': 'Yahoo Serious', 'order': 0, 'profile_path': '/pe2eKvE4PpdMhmNCIylDjoQhg2o.jpg'}, {'cast_id': 18, 'character': 'Marie Curie', 'credit_id': '52fe43039251416c75000097', 'gender': 0, 'id': 61808, 'name': 'Odile Le Clezio', 'order': 1, 'profile_path': None}]
I tried a few options using the stringr package but couldn't find anything that returns the position of the nth occurrence of a particular string so that I can extract the data to its left. Any advice would be greatly appreciated.
Upvotes: 0
Views: 3588
Reputation: 1169
You can use gregexpr to find the location of the n-th occurrence of a pattern:
txt <- "[{'cast_id': 17, 'character': 'Albert Einstein'} {'profile_path': None} {etc}]"
loc <- gregexpr("\\}", txt)[[1]][2] # 2 -> second occurrence
paste0(substr(txt, 1, loc), "]") # add "]"
# "[{'cast_id': 17, 'character': 'Albert Einstein'} {'profile_path': None}]"
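The same result can also be sketched as a single regular-expression substitution instead of locating the position first. This is an alternative technique, not the gregexpr approach above: the capture group matches two repetitions of "anything up to and including a }", and everything after it is dropped. The names txt and truncated are just illustrative, and the pattern assumes the string contains no line breaks.

```r
txt <- "[{'cast_id': 17, 'character': 'Albert Einstein'} {'profile_path': None} {etc}]"

# (?:[^}]*\\}){2} matches text up to and including the 2nd "}".
# \\1 puts that captured prefix back, followed by the closing "]".
# Strings with fewer than two "}" do not match and are left unchanged.
truncated <- sub("^((?:[^}]*\\}){2}).*$", "\\1]", txt, perl = TRUE)
truncated
# "[{'cast_id': 17, 'character': 'Albert Einstein'} {'profile_path': None}]"
```

Changing the {2} quantifier selects a different occurrence.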
If you have a vector of strings and some may have fewer than two "}" (for those, loc is NA and the string is kept as is), you can use something like
loc <- sapply(gregexpr("\\}", txt), '[', 2)
ifelse(is.na(loc), txt, paste0(substr(txt, 1, loc), "]"))
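If this is needed for other occurrence counts or delimiters, the two lines above could be wrapped in a small function. The following is only a sketch: truncate_after_nth and its arguments are hypothetical names, not part of base R or stringr, and it assumes the pattern should be matched literally (fixed = TRUE) rather than as a regular expression.

```r
# Hypothetical helper: keep everything up to the n-th occurrence of `pattern`
# in each element of `x`, then append `suffix`. Strings with fewer than n
# occurrences are returned unchanged.
truncate_after_nth <- function(x, pattern = "}", n = 2, suffix = "]") {
  # fixed = TRUE treats `pattern` as a literal string, so no escaping is needed
  matches <- gregexpr(pattern, x, fixed = TRUE)
  # n-th match position per string; gregexpr reports "no match" as -1,
  # so drop non-positive values first and let indexing yield NA
  loc <- vapply(matches, function(m) m[m > 0][n], integer(1))
  # End of the n-th match (the pattern may be longer than one character)
  end <- loc + nchar(pattern) - 1L
  ifelse(is.na(loc), x, paste0(substr(x, 1, end), suffix))
}

txt <- c("[{'a': 1}, {'b': 2}, {'c': 3}]", "[{'a': 1}]")
truncate_after_nth(txt)
# "[{'a': 1}, {'b': 2}]" "[{'a': 1}]"
```

Filtering out the -1 that gregexpr returns for "no match" keeps n = 1 safe as well, which the plain '[' indexing would not handle.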
Upvotes: 2