Fnguyen
Fnguyen

Reputation: 1177

How write a CSV in R that is correctly read by Google Spreadsheets

Problem

I am trying to write a dataframe to a CSV-file that will be read-in correctly by Google Spreadsheets however I am running into an error with a particular column format.

There is one column called 'details' that has values that look like this {\"campaign_id\":1,\"line_item_id\":1234}. This column format is correctly read by R from an original dataframe supplied from Google Spreadsheets but when written into a CSV, the column is separated along the , into two columns pushing overwriting the values of the following column (which is empty by default).

Data

The dataframe in R looks like this:

data <- structure(
  list(
    `Line Item Id` = c(1234, 4567),
    Details = c(
      "{\"campaign_id\":1,\"line_item_id\":1234}",
      "{\"campaign_id\":1,\"line_item_id\":4567}"
    ),
    `Bid Strategy Type` = c("",""),
    `TrueView Video Ad Formats` = c("In-stream / Video Discovery",
                                    "In-stream / Video Discovery"),
    `TrueView Bid Strategy Type` = c("Manual CPV",
                                     "Manual CPV")
  ),
  row.names = 1:2,
  class = "data.frame"
)

Current approach

I have tried writing the relevant column in a quote:

library(tidyverse)

data %>%
  mutate(Details = dQuote(Details,q = )) %>%
  write.csv("test.csv", fileEncoding = "UTF-8",na = "",row.names = FALSE,quote = FALSE)

But this does not seem to work and neither does omitting the dQuote.

My output csv is this:

test.csv generated by above code

More Details

The data being wrangled here is an SDF generated by DV360 a Google platform to manage YouTube ad campaigns. In my process I download an SDF from DV360 change some values in R and upload it back. However re-uploading does not work at the moment due to the described problem. I have tested it to confirm that the column problem described above is causing the issue and if manually corrected uploading works.

Expected output

I have added the expected output and the output I am getting.

What I have at the moment:

Line Item Id,Details,TrueView Video Ad Formats,TrueView Bid Strategy Type
14596716402,“{"campaign_id":283,"line_item_id":99588}”,In-stream / Video Discovery,
14596725552,“{"campaign_id":283,"line_item_id":99585}”,In-stream / Video Discovery,

What I need:

Line Item Id,Details,TrueView Video Ad Formats,TrueView Bid Strategy Type
1234,"{""campaign_id"":1,""line_item_id"":1234}",,In-stream / Video Discovery
4567,"{""campaign_id"":1,""line_item_id"":4567}",,In-stream / Video Discovery

And quite interestingly, what I get when I fiex the problem by hand in googlesheets and then download the file:

Line Item Id,Details,TrueView Video Ad Formats,TrueView Bid Strategy Type
1234,"""{""""campaign_id"""":1,""""line_item_id"""":1234}""",,In-stream / Video Discovery
4567,"""{""""campaign_id"""":1,""""line_item_id"""":4567}""",,In-stream / Video Discovery

Upvotes: 1

Views: 210

Answers (1)

Fnguyen
Fnguyen

Reputation: 1177

After getting valuable input from @Greg and @MrFlick I was finally able to solve it.

For Google ecosystem (Spreadsheets and Dv360) to correctly read the column it needs to have this format:

"{""campaign_id"":1,""line_item_id"":1234}"

Using dQuote() will put the necessary quotes around the column but due to my system settings, the wrong quote type was supplied. So we need to put off useFancyQuotes.

Additionally the already occuring quotes around campaign_id and line_item_id need to be double-quoted.

Maybe there is a faster way but the following code will work:

library(dplyr) # only needed for pipe, not part of solution

options(useFancyQuotes = FALSE)  

data %>%
  mutate(Details = dQuote(gsub('"','""',Details))) %>%
  write.csv("test3.csv", fileEncoding = "UTF-8",na = "",row.names = FALSE,quote = FALSE)

So we need to first convert all quotes to double Quotes, which I did with gsub() and then use dQuote() to put final quotes around the column making sure not to use fancy, directional quotes.

Upvotes: 1

Related Questions