Reputation: 133
I am developing a prototype on Google Cloud Platform, for which I am using Cloud Storage, App Engine and BigQuery.
One of the tasks is to load a file daily from Google Cloud Storage into BigQuery, for which I am using a cron task on App Engine.
The problem is that BigQuery expects the data in NDJSON (newline-delimited JSON) format, whereas my source file is in regular JSON format.
Currently, I download the file to my laptop, convert it to NDJSON and then upload it to BigQuery, but how do I do this programmatically on Google Cloud Platform? I am hoping there is something available that I can use, as I do not want to write it from scratch.
Upvotes: 3
Views: 4147
Reputation: 133
This might be useful to others. This is how I did it, but let me know if there is a better or easier way. You need to download the Cloud Storage Java API and its dependencies (the HTTP client API and the OAuth API): https://developers.google.com/api-client-library/java/apis/
You also need a JSON parser such as Jackson.
Steps:
1> Read the JSON file as an InputStream using the Java Cloud Storage API
Storage.Objects.Get getObject = client.objects().get("shiladityabucket", "abc.json");
InputStream input = getObject.executeMediaAsInputStream();
2> Convert it into an array of Java objects (the JSON file in my case has multiple records; BillingInfo is the POJO for one record). If the file holds a single record, there is no need for the array.
ObjectMapper mapper = new ObjectMapper();
BillingInfo[] infoArr = mapper.readValue(input, BillingInfo[].class);
3> Create a StorageObject to upload to Cloud Storage
StorageObject objectMetadata = new StorageObject()
// Set the destination object name
.setName("abc.json")
// Set the access control list to publicly read-only
.setAcl(Arrays.asList(
new ObjectAccessControl().setEntity("allUsers").setRole("READER")));
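4> Iterate over the objects in the array and convert each one to a JSON string, appending a newline after each to get NDJSON.
// StringBuilder avoids re-copying the whole string on every append
StringBuilder sb = new StringBuilder();
for (BillingInfo info : infoArr) {
sb.append(mapper.writeValueAsString(info)).append('\n');
}
String jSonString = sb.toString();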
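5> Create an InputStream to insert using the Cloud Storage Java API
// Encode explicitly as UTF-8 (java.nio.charset.StandardCharsets) instead of relying on the platform default charset
InputStream is = new ByteArrayInputStream(jSonString.getBytes(StandardCharsets.UTF_8));
InputStreamContent contentStream = new InputStreamContent(null, is);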
6> Upload the file
Storage.Objects.Insert insertRequest = client.objects().insert(
"shiladitya001", objectMetadata, contentStream);
insertRequest.execute();
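If pulling in Jackson and a POJO class is more than you need, the conversion in steps 2 and 4 can probably also be done with the standard library alone. Below is a minimal, dependency-free sketch, not the Jackson approach above. It assumes the input is a single top-level JSON array that is already valid JSON (it does no validation). The class name JsonArrayToNdjson and its methods are hypothetical names made up for illustration. It only tracks nesting depth and string state, drops whitespace outside strings, and starts a new line after each top-level element.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper: rewrites a top-level JSON array as NDJSON.
final class JsonArrayToNdjson {

    /** Copies each element of the top-level array in {@code in} to {@code out}, one per line. */
    static void convert(Reader in, Writer out) throws IOException {
        int depth = 0;            // current nesting depth of [] and {}
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true if the previous char in a string was a backslash
        boolean lineOpen = false; // true if the current output line has content
        int c;
        while ((c = in.read()) != -1) {
            char ch = (char) c;
            if (inString) {
                // Copy string contents verbatim, watching for the closing quote.
                out.write(ch);
                if (escaped) {
                    escaped = false;
                } else if (ch == '\\') {
                    escaped = true;
                } else if (ch == '"') {
                    inString = false;
                }
                continue;
            }
            if (Character.isWhitespace(ch)) {
                continue; // whitespace outside strings is insignificant in JSON
            }
            if (depth == 0) {
                // Assumption: the document is one top-level array.
                if (ch != '[') {
                    throw new IllegalArgumentException("expected a top-level JSON array");
                }
                depth = 1;
                continue;
            }
            if (ch == '"') {
                inString = true;
                out.write(ch);
                lineOpen = true;
            } else if (ch == '[' || ch == '{') {
                out.write(ch);
                lineOpen = true;
                depth++;
            } else if (ch == ']' || ch == '}') {
                depth--;
                if (depth > 0) {
                    out.write(ch);
                } else if (lineOpen) {
                    out.write('\n'); // end of the array: terminate the last record
                    lineOpen = false;
                }
            } else if (ch == ',' && depth == 1) {
                out.write('\n');     // separator between top-level elements
                lineOpen = false;
            } else {
                out.write(ch);
                lineOpen = true;
            }
        }
    }

    /** Convenience wrapper for in-memory strings. */
    static String toNdjson(String jsonArray) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(jsonArray), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return out.toString();
    }
}
```

In the flow above, the Reader could wrap the stream from step 1, for example new InputStreamReader(input, StandardCharsets.UTF_8), and the resulting string could be uploaded as in steps 5 and 6. The Jackson version is still the safer choice if the input may be malformed, since this sketch does not check the JSON at all.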
Upvotes: 3