Reputation: 345
I am facing issues with conversion of JSON to dataframe. I tried using libraries: jsonlite
, RJSONIO
,rjson
.
I keep getting 'invalid character in the string' or unclosed string.
I am getting this data from a standard API so should be able to parse this json. Also, JSON editors can parse this data just fine.
My question is:
Is there a standard way using which I can make sure that my dataframe gets created and ignore above errors?
My best guess was to convert this data to JSON format using toJSON
function from either of the libraries but if I use
newdata <- fromJSON(toJSON(data))
it somehow never gets converted to dataframe. Why is that?
If I instead use
newdata <- fromJSON(data)
I get a valid dataframe but sometimes because of above errors, it doesn't work which is what I am trying to know. How do you deal with this?
I have tried using this too freshDeskTicketsToDF <-
jsonlite::fromJSON(paste(readLines(textConnection(freshDeskTickets)), collapse=""))
Are there better ways to deal with this in R?
Also, why is it that using toJSON on data passed to fromJSON never gets converted to a dataframe?
If I decide to take off html tags from the values assisgned to keys in JSON data. How does that work? Can I do that?
Edit: It looks like I get this error when I have <html tags>
in my "string data" but I have them all across my JSON data and I don't get it every time.
How to deal with problems like this?
Note: this issue not specific to the data that I have. What I am looking for is ways to deal with problems like these and not one specific solution to a single problem.
Sample data:
[
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205968,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Order number-100403891",
"to_emails": [
"[email protected]"
],
"product_id": null,
"id": 174093,
"type": "Order Status query",
"due_by": "2016-09-02T08:57:30Z",
"fr_due_by": "2016-09-02T02:57:30Z",
"is_escalated": true,
"description": "<div dir=\"ltr\">Hi Team,<div><br></div>\n<div>I have ordered an item from your website, order number-100403891. I had called on August 30 2016 to postpone the delivery date. The guy i spoke from your end had confirmed that he will hold and push the delivery date to September 5 or 6 or 7 2016. And he confirmed the same.</div>\n<div>However, the guy I spoke to<b> did not do it</b>. </div>\n<div>I got to know it from ABHINAV from your customer care team who I spoke to on August 1st at 13:10. Hence I have put a request again and he said he will talk to some guys and give me the desired dates for delivery which is 5,6,7 of September 2016. </div>\n<div>Please let me know the concern on this and hope for a quick turn around.</div>\n<div><br></div>\n<div>Thank you,</div>\n<div>Hari,</div>\n<div>+91-9538199699.</div>\n</div>\n",
"description_text": "Hi Team,\r\n\r\nI have ordered an item from your website, order number-100403891. I had\r\ncalled on August 30 2016 to postpone the delivery date. The guy i spoke\r\nfrom your end had confirmed that he will hold and push the delivery date to\r\nSeptember 5 or 6 or 7 2016. And he confirmed the same.\r\nHowever, the guy I spoke to* did not do it*.\r\nI got to know it from ABHINAV from your customer care team who I spoke to\r\non August 1st at 13:10. Hence I have put a request again and he said he\r\nwill talk to some guys and give me the desired dates for delivery which is\r\n5,6,7 of September 2016.\r\nPlease let me know the concern on this and hope for a quick turn around.\r\n\r\nThank you,\r\nHari,\r\n+91-9538199699.\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:18Z",
"updated_at": "2016-09-11T11:00:33Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022148025,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Defect in d piece",
"to_emails": [
"[email protected]"
],
"product_id": null,
"id": 174092,
"type": "Return",
"due_by": "2016-09-01T15:51:00Z",
"fr_due_by": "2016-09-01T09:51:00Z",
"is_escalated": false,
"description": "<div><br></div>\n<div><br></div>\n<div><br></div>\n<div><div style=\"font-size:75%;color:#575757\">Sent from Samsung Mobile</div></div>",
"description_text": "\n\n\nSent from Samsung Mobile",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:00Z",
"updated_at": "2016-09-06T09:00:14Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205895,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Re: StalkBuyLove Return Request for order: 100404435",
"to_emails": [
"StalkBuyLove <[email protected]>"
],
"product_id": null,
"id": 174088,
"type": "Refund query",
"due_by": "2016-09-01T15:43:56Z",
"fr_due_by": "2016-09-01T09:43:56Z",
"is_escalated": true,
"description": "<div>Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. <br><br>Sent from my iPhone</div>\n<div>\n<br>On Sep 1, 2016, at 12:38 PM, StalkBuyLove <<a href=\"mailto:[email protected]\" rel=\"noreferrer\">[email protected]</a>> wrote:<br><br>\n</div>\n<blockquote><div>\n<div><img title=\"StalkBuyLove\" alt=\"Stalkbuylove\" src=\"http://www.stalkbuylove.com/launcher_icons/Newlogo_Stalkbuylove_240x50.png\"></div>\n<div>Hello <b>Anamica Aggarwal</b>,</div>\n<div>We have initiated a return request for order: <b>100404435</b> with the following products:</div>\n<table style=\"width:80%\">\r\n <tbody>\n<tr style=\"background-color:#B0C4DE\">\r\n <th>Item Name</th>\r\n <th>Sku</th>\r\n </tr>\n<tr>\r\n <td style=\"text-align:center\">Articuno Top</td>\r\n <td style=\"text-align:center\">IN1627MTOTOPPCH-198-18</td>\r\n </tr>\n</tbody>\n</table>\n<div>Lots of love,</div>\n<div>Team SBL</div>\n<img src=\"http://mandrillapp.com/track/open.php?u=30069003&id=bff0a5daee4a47fe9c6b04d2680c3c39\" height=\"1\" width=\"1\">\r\n</div></blockquote>",
"description_text": "Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. \n\nSent from my iPhone\n\n> On Sep 1, 2016, at 12:38 PM, StalkBuyLove <[email protected]> wrote:\n> \n> \n> Hello Anamica Aggarwal,\n> \n> We have initiated a return request for order: 100404435 with the following products:\n> \n> Item Name\tSku\n> Articuno Top\tIN1627MTOTOPPCH-198-18\n> Lots of love,\n> \n> Team SBL\n> \n",
"custom_fields": {
},
"created_at": "2016-09-01T07:43:56Z",
"updated_at": "2016-09-11T11:00:32Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205881,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Details for order",
"to_emails": [
"[email protected]"
],
"product_id": null,
"id": 174086,
"type": "Order Status query",
"due_by": "2016-09-01T15:42:50Z",
"fr_due_by": "2016-09-01T09:42:50Z",
"is_escalated": false,
"description": "<div><span></span></div>\n<div>\n<span>Hey can i get details of my order </span><br><span>How much more time will it take to get delivered? </span><br><span>Order no-</span><h2 style=\"font-weight: normal; margin: 0px;\"><font><span style=\"background-color: rgba(255, 255, 255, 0);\">100403837</span></font></h2>\n<span></span><br><span>Sent from my iPhone</span><br>\n</div>",
"description_text": "Hey can i get details of my order \r\nHow much more time will it take to get delivered? \r\nOrder no-\r\n100403837\r\n\r\nSent from my iPhone\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:42:50Z",
"updated_at": "2016-09-06T09:00:13Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": true,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022204690,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Refund",
"to_emails": [
"[email protected]"
],
"product_id": null,
"id": 174080,
"type": "Refund query",
"due_by": "2016-09-01T15:36:26Z",
"fr_due_by": "2016-09-01T09:36:26Z",
"is_escalated": true,
"description": "<div>\r<br>Bank statement as asked for refund! Please intiate the proccedings asap!<br>\n</div>",
"description_text": "\r\nBank statement as asked for refund! Please intiate the proccedings asap!\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:36:26Z",
"updated_at": "2016-09-07T08:00:19Z"
}
]
Upvotes: 4
Views: 895
Reputation: 3696
library(jsonlite)
df <- stream_in(file("~/data/sample.json"))
This stream_in
function directly convert into datafram
Upvotes: 2