Reputation: 4619
I want to export Firestore data to Google Cloud Storage automatically on a regular schedule (so that I can then import it into BigQuery for analysis).
The Schedule data exports guide does outline a way to export data from Firestore on a schedule, but it uses JavaScript running on Node.js. I want to avoid that and would prefer to stick to an all-Java solution on the server side.
The Export and import data guide offers another way to export Firestore data to GCS: the gcloud command-line utility. However, I don't want to schedule the script on my laptop and then have to ensure that the laptop is switched on at the right time with an active Internet connection. I am looking for an entirely App Engine (Standard)-based solution that can be run as a cron job.
At the time of writing there doesn't seem to be a programmatic way to do this using the Firebase Admin SDK (version 6.6.0) for Java.
Upvotes: 0
Views: 988
Reputation: 4619
The answer lies in utilizing the Firestore REST API directly.
In the code below I have used Google's HTTP Client Library for Java (which should be your default choice on App Engine (Standard) anyway) to make the necessary network calls.
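// Imports needed by this snippet. AccessToken is presumably
// com.google.auth.oauth2.AccessToken, since getTokenValue() is called on it.
import com.google.api.client.http.GenericUrl;
import com.google.api.client.http.HttpContent;
import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpResponse;
import com.google.api.client.http.HttpResponseException;
import com.google.api.client.http.json.JsonHttpContent;
import com.google.auth.oauth2.AccessToken;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// PROJECT_ID, Operation, tokenFor(), serviceAc(), jacksonFactory() and
// requestFactory() are helpers defined elsewhere (not shown here).
public static final String DEF_GCS_BUCKET_NAME = PROJECT_ID + ".appspot.com";
public static final String FIRESTORE_API_V1BETA2 =
        "https://firestore.googleapis.com/v1beta2";
public static final String FIRESTORE_DB = "/projects/" + PROJECT_ID
        + "/databases/(default)";
public static final String FIRESTORE_EXPORT_GCS_LOC = "gs://"
        + DEF_GCS_BUCKET_NAME + "/firestore-export/";
public static final String FIRESTORE_EXPORT_GCS_ROOT = "firestore-export/";
private static final String FUNC_EXPORT_DOCUMENTS = ":exportDocuments";

@javax.annotation.CheckForNull
public static Operation exportCollectionToGcs(@lombok.NonNull String collection)
        throws IOException {
    AccessToken token = tokenFor(serviceAc());

    // Request body: which collection to export and where to write it in GCS
    Map<String, Object> payload = new HashMap<>();
    payload.put("collectionIds", Arrays.asList(collection));
    payload.put("outputUriPrefix", FIRESTORE_EXPORT_GCS_LOC + collection);

    // POST .../v1beta2/projects/<PROJECT_ID>/databases/(default):exportDocuments
    GenericUrl url = new GenericUrl(FIRESTORE_API_V1BETA2 + FIRESTORE_DB
            + FUNC_EXPORT_DOCUMENTS);
    HttpContent content = new JsonHttpContent(jacksonFactory(), payload);
    HttpRequest req = requestFactory().buildPostRequest(url, content);
    req.getHeaders().setAuthorization("Bearer " + token.getTokenValue());

    Operation op = null;
    try {
        HttpResponse res = req.execute();
        // Parse the response JSON to populate an Operation POJO
    } catch (HttpResponseException e) {
        // Handle the error
    }
    return op;
}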
This starts a Firestore Operation to export the specified collection to GCS. You can then get the status of the Operation if you want to do something when it finishes (or just to send/prepare a report).
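As a rough illustration of checking the status, here is a minimal sketch that fetches the Operation resource by name. It makes several assumptions that are not part of the code above: the class and method names (`OperationStatus`, `statusUrl`, `fetchStatusJson`) are hypothetical, it targets the v1 endpoint (where the long-running operations `get` method is documented) rather than v1beta2, and it uses the JDK's `HttpURLConnection` instead of Google's HTTP Client Library only to stay self-contained. The operation name is expected to be the `name` field returned by the export call, and the access token is obtained the same way as for the export request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class OperationStatus {
    // Assumption: the v1 API is used for polling, not v1beta2.
    static final String FIRESTORE_API_V1 = "https://firestore.googleapis.com/v1";

    // Builds the GET URL for an operation name such as
    // "projects/<project>/databases/(default)/operations/<id>".
    static String statusUrl(String operationName) {
        String name = operationName.startsWith("/")
                ? operationName.substring(1) : operationName;
        return FIRESTORE_API_V1 + "/" + name;
    }

    // Returns the raw JSON of the Operation resource. The caller inspects the
    // "done" field (and "error" or "response") to decide what to do next.
    // getInputStream() throws an IOException on a non-2xx response.
    static String fetchStatusJson(String operationName, String accessToken)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(statusUrl(operationName)).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Bearer " + accessToken);
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, StandardCharsets.UTF_8.name())
                     .useDelimiter("\\A")) {
            // "\\A" makes the scanner return the whole stream as one token
            return s.hasNext() ? s.next() : "";
        } finally {
            conn.disconnect();
        }
    }
}
```

A cron-triggered handler could call `fetchStatusJson` some time after starting the export and treat the operation as finished once the returned JSON has `"done": true`. In a real App Engine application you would more likely reuse the same request factory and JSON parser as the export code.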
Ensure that the service account you use has the requisite permissions (described in Schedule data exports).
Upvotes: 1