Reputation: 5601
I note the new DocsList Token and get*ForPaging() options available now, but I am still struggling with an algorithm to process "all files and folders" for arbitrarily large file/folder trees.
Assume a Google Drive based web file system with n files and folders. Getting through it with a Google Apps Script will take multiple runs of 6 minutes each. Nightly I need to process all files older than 30 days in trees of subfolders beneath a starting folder. Ideally each file is processed only once, but my functions are idempotent, so I don't mind if some files are processed again.
I have my recursive algorithm working, but what I am missing is a placeholder so that I don't have to start at the top of the folder tree each time I invoke the script. In six minutes I get through only a few hundred folders and a few thousand files.
My question is what index can I store and how do I start where I left off the next time through?
I have thought about storing Tokens or the last completed folder path "/mytop/sub4/subsub47/", but how would that help me on another invocation? If I started there, the script would wrongly just work down the tree from that folder and miss its siblings and ancestor folders.
I have thought about the "find" methods and using a "before:2012/10..." style search, but there's no way to limit that to files in my tree (only to a single folder).
I am not pasting my code as it's just standard recursive getFolders/getFiles and not actually relevant to the core of the question.
Upvotes: 2
Views: 2302
Reputation: 17792
I'd create an array of the folders that still have to be worked on and save it for a future run. Since you said it's no problem to work on some files/folders repeatedly, you don't even need to add an artificial stop to your function. You can let it time out every time.
Something like this:
var folders = null;

// Call this to start the process (or set the 'folders' property manually).
function start() {
  folders = ['id-of-the-starting-folder'];
  work();
}

// Set this to run on the trigger.
function work() {
  if (folders == null) {
    // Resume from the queue saved by the previous run.
    // A missing or empty property means there is nothing left to do.
    var saved = ScriptProperties.getProperty('folders');
    folders = saved ? saved.split(',') : [];
  }
  while (folders.length > 0) {
    workOnFolder(folders[0]);
    folders.shift(); // remove the 1st element only once it is fully processed
    ScriptProperties.setProperty('folders', folders.join());
  }
  // Remove the trigger here.
}

// Queue a subfolder so that it is processed in this run or a later one.
function doFolderLater(folder) {
  folders.push(folder.getId());
}

function workOnFolder(id) {
  var folder = DocsList.getFolderById(id);
  folder.getFolders().forEach(doFolderLater);
  folder.getFiles().forEach(workOnFile);
}

function workOnFile(file) {
  // Do your thing.
}
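To see why the saved queue is a sufficient placeholder, here is a minimal sketch of the same idea that runs outside Apps Script. Everything in it is a stand-in I made up for illustration: `tree` replaces the Drive folder tree, `store` replaces ScriptProperties, and `maxFolders` plays the role of the 6-minute limit by cutting each run short after a fixed number of folders.

```javascript
// Hypothetical in-memory stand-in for a Drive tree: id -> { folders: [ids], files: [names] }.
var tree = {
  top: { folders: ['a', 'b'], files: ['t1'] },
  a:   { folders: ['a1'],     files: ['a-1', 'a-2'] },
  a1:  { folders: [],         files: ['a1-1'] },
  b:   { folders: [],         files: ['b-1'] }
};

// Hypothetical stand-in for ScriptProperties: a plain string store.
var store = {};

// File name -> number of times it was processed.
var processed = {};

// Seed the saved queue with the starting folder.
function start(rootId) {
  store.folders = rootId;
}

// One "execution": handles at most maxFolders folders, then stops.
// Returns true once the saved queue is empty, i.e. the whole tree is done.
function run(maxFolders) {
  var queue = store.folders ? store.folders.split(',') : [];
  var done = 0;
  while (queue.length > 0 && done < maxFolders) {
    var node = tree[queue[0]];
    node.files.forEach(function (name) {
      processed[name] = (processed[name] || 0) + 1;
    });
    // Drop the finished folder, append its subfolders, and persist the queue.
    queue = queue.slice(1).concat(node.folders);
    store.folders = queue.join(',');
    done++;
  }
  return queue.length === 0;
}

start('top');
var runs = 0;
do {
  runs++;
} while (!run(2));
```

Because the queue always holds every folder that has been discovered but not yet finished, no sibling or ancestor subtree can be missed, whichever folder a run happens to stop at. In the real script a run ends by timing out rather than by counting folders, so the folder at the head of the queue may be processed twice, which is fine as long as the per-file work is idempotent.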
Upvotes: 3