Reputation: 11623
I've made a script that every few hours adds a new row to a Google Apps spreadsheet.
This is the function I've made to find the first empty row:
function getFirstEmptyRow() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var cell = spr.getRange('a1');
var ct = 0;
while ( cell.offset(ct, 0).getValue() != "" ) {
ct++;
}
return (ct);
}
It works fine, but when reaching about 100 rows, it gets really slow, even ten seconds. I'm worried that when reaching thousands of rows, it will be too slow, maybe going in timeout or worse. Is there a better way?
Upvotes: 67
Views: 137028
Reputation: 2773
combo of DON and Ghoti.
function getLastRowNumber(sheet, columnLabel) {
var columnLabel = sheet.getRange(`${columnLabel}:${columnLabel}`);
var values = columnLabel.getValues(); // get all data in one call
var ct = 0;
for (; values.length > ct && values[ct][0] != ""; ct += 100);
// Step backwards by tens
for ( ; ct > 0 && values[ct][0] == ""; ct -= 10);
// Step forwards by ones
for ( ; values.length > ct && values[ct][0] != ""; ct ++);
return ct;
}
Upvotes: 1
Reputation: 1
this is my very first post on stackOverflow, I hope to meet all your netiquette needs, so please be nice to me.
I think the fastest way to find the first blank cell in a column (I couldn't run the performance checks, anyway) is to let the Google engine do sequential tasks itself; it is simply much more efficient. From a programmer's point of view, this translates into NOT using any kind of iteration/loops, i.e. FOR, WHILE, etc. (By the way, this is the same programming approach on database engines - any activity should NOT use loops to find information.)
The following function does this in just one command (neglecting the var
declarations, here just to improve readability):
function lastCell() {
var workSheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var lastRow = workSheet.getLastRow();
var columnToSearch = 1; //index of the column to search. 1 is 'A'.
workSheet.getRange(lastRow, columnToSearch).activateAsCurrentCell().
getNextDataCell(SpreadsheetApp.Direction.UP).activate();
workSheet.getCurrentCell().offset(1, 0).activate(); // shift one cell down to find a free cell
}
Upvotes: 0
Reputation: 61
I have gone through way too many of these implementations of last-row for a specific column. Many solutions work but are slow for large or multiple datasets. One of my use cases requires me to check the last row in specific columns across multiple spreadsheets. What I have found is that taking the whole column as a range and then iterating through it is too slow, and adding a few of these together makes the script sluggish.
My "hack" has been this formula:
=ROW(index(sheet!A2:A,max(row(sheet!A2:A)*(sheet!A2:A<>""))))-1
Example: Add this to Cell A1, to find the last row in column A. Can be added anywhere, just make sure to manage the "-1" at the end depending on which row the formula is placed. You can also place this is another col, rather than the one you're trying to count, and you don't need to manage the -1. You could also count FROM a starting Row, like "C16:C" - will count values C16 onwards
This formula is reliably giving me the last row, including blanks in the middle of the dataset
To use this value in my GS code, I am simply reading the cell value from A1. I understand that Google is clear that spreadsheet functions like read/write are heavy (time-consuming), but this is much faster than column count last-row methods in my experience (for large datasets)
To make this efficient, I am getting the last row in a col once, then saving it as a global variable and incrementing in my code to track which rows I should be updating. Reading the cell every-time your loop needs to make an update will be too inefficient. Read once, iterate the value, and the A1 cell formula (above) is "storing" the updated value for the next time your function runs
This also works if the data has filters turned on. Actual last row is maintained
Please let me know if this was helpful to you! If I encounter any issues I will comment on this answer.
Upvotes: 2
Reputation: 31300
Here is a list of what the code should do:
This solution uses the array method some
which will stop iterating the loop when the condition is true. This avoids wasting time spent looping through every element of the array, and yet uses an array method rather than a for
or while
loop.
The some
method only returns true or false, but there is a way to capture the index number because the some method halts looping when the condition is true.
The index number is assigned to a variable in the scope outside of the array function. This does not slow down the processing.
Code:
function getFirstEmptyCellIn_A_Column(po) {
var foundEmptyCell,rng,sh,ss,values,x;
/*
po.sheetTabName - The name of the sheet tab to get
po.ssID - the file ID of the spreadsheet
po.getActive - boolean - true - get the active spreadsheet -
*/
/* Ive tested the code for speed using many different ways to do this and using array.some
is the fastest way - when array.some finds the first true statement it stops iterating -
*/
if (po.getActive || ! po.ssID) {
ss = SpreadsheetApp.getActiveSpreadsheet();
} else {
ss = SpreadsheetApp.openById(po.ssID);
}
sh = ss.getSheetByName(po.sheetTabName);
rng = sh.getRange('A:A');//This is the fastest - Its faster than getting the last row and getting a
//specific range that goes only to the last row
values = rng.getValues(); // get all the data in the column - This is a 2D array
x = 0;//Set counter to zero - this is outside of the scope of the array function but still accessible to it
foundEmptyCell = values.some(function(e,i){
//Logger.log(i)
//Logger.log(e[0])
//Logger.log(e[0] == "")
x = i;//Set the value every time - its faster than first testing for a reason to set the value
return e[0] == "";//The first time that this is true it stops looping
});
//Logger.log('x + 1: ' + (x + 1))//x is the index of the value in the array - which is one less than the row number
//Logger.log('foundEmptyCell: ' + foundEmptyCell)
return foundEmptyCell ? x + 1 : false;
}
function testMycode() {
getFirstEmptyCellIn_A_Column({"sheetTabName":"Put Sheet tab name here","ssID":"Put your ss file ID here"})
}
Upvotes: 1
Reputation: 50426
Using indexOf is one of the ways to achieve this:
function firstEmptyRow() { var ss = SpreadsheetApp.getActiveSpreadsheet(); var sh = ss.getActiveSheet(); var rangevalues = sh.getRange(1,1,sh.getLastRow(),1).getValues(); // Column A:A is taken var dat = rangevalues.reduce(function (a,b){ return a.concat(b)},[]); // 2D array is reduced to 1D// // Array.prototype.push.apply might be faster, but unable to get it to work// var fner = 1+dat.indexOf('');//Get indexOf First empty row return(fner); }
Upvotes: 1
Reputation: 1021
Just my two cents, but I do this all the time. I just write the data to the TOP of the sheet. It's date reversed (latest on top), but I can still get it to do what I want. The code below has been storing data it scrapes from a realtor's site for the past three years.
var theSheet = SpreadsheetApp.openById(zSheetId).getSheetByName('Sheet1');
theSheet.insertRowBefore(1).getRange("A2:L2").setValues( [ zPriceData ] );
This chunk of the scraper function inserts a row above #2 and writes the data there. The first row is the header, so I don't touch that. I haven't timed it, but the only time I have an issue is when the site changes.
Upvotes: 2
Reputation: 10849
Finally I got a single line solution for it.
var sheet = SpreadsheetApp.getActiveSpreadsheet();
var lastEmptyOnColumnB = sheet.getRange("B1:B"+sheet.getLastRow()).getValues().join(",").replace(/,,/g, '').split(",").length;
It works fine for me.
Upvotes: 0
Reputation: 23
I tweaked the code ghoti supplied so that it searched for an empty cell. Comparing values did not work on a column with text (or I could not figure out how) instead I used isBlank(). Notice the value is negated with ! (in front of the variable r) when looking forward since you want i to increase until a blank is found. Working up the sheet by ten you want to stop decreasing i when you find a cell that is not blank (! removed). Then, back down the sheet by one to the first blank.
function findRow_() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
ss.setActiveSheet(ss.getSheetByName("DAT Tracking"));
var r = ss.getRange('C:C');
// Step forwards by hundreds
for (var i = 2; !r.getCell(i,1).isBlank(); i += 100) { }
// Step backwards by tens
for ( ; r.getCell(i,1).isBlank(); i -= 10) { }
// Step forwards by ones
for ( ; !r.getCell(i,1).isBlank(); i++) { }
return i;
Upvotes: 2
Reputation: 815
I know this is an old thread and there have been some very clever approaches here.
I use the script
var firstEmptyRow = SpreadsheetApp.getActiveSpreadsheet().getLastRow() + 1;
if I need the first completely empty row.
If I need the first empty cell in a column I do the following.
My 2nd row is a hidden row and each cell has the formula
=COUNTA(A3:A)
Where A
is replaced with the column letter.
My script just reads this value. This updates pretty quickly compared to script approaches.
There is one time this does not work and that is when I allow empty cells to break up the column. I have not needed a fix for this yet, I suspect one may be derived from COUNTIF
, or a combined function or one of the many other inbuilt ones.
EDIT: COUNTA
does cope with blank cells within a range, so the concern about the "one time this does not work" is not really a concern. (This might be a new behavior with "new Sheets".)
Upvotes: 8
Reputation: 46794
Seeing this old post with 5k views I first checked the 'best answer' and was quite surprised by its content... this was a very slow process indeed ! then I felt better when I saw Don Kirkby's answer, the array approach is indeed much more efficient !
But how much more efficient ?
So I wrote this little test code on a spreadsheet with 1000 rows and here are the results : (not bad !... no need to tell which one is which...)
and here is the code I used :
function onOpen() {
var menuEntries = [ {name: "test method 1", functionName: "getFirstEmptyRow"},
{name: "test method 2 (array)", functionName: "getFirstEmptyRowUsingArray"}
];
var sh = SpreadsheetApp.getActiveSpreadsheet();
sh.addMenu("run tests",menuEntries);
}
function getFirstEmptyRow() {
var time = new Date().getTime();
var spr = SpreadsheetApp.getActiveSpreadsheet();
var ran = spr.getRange('A:A');
for (var i= ran.getLastRow(); i>0; i--){
if(ran.getCell(i,1).getValue()){
break;
}
}
Browser.msgBox('lastRow = '+Number(i+1)+' duration = '+Number(new Date().getTime()-time)+' mS');
}
function getFirstEmptyRowUsingArray() {
var time = new Date().getTime();
var sh = SpreadsheetApp.getActiveSpreadsheet();
var ss = sh.getActiveSheet();
var data = ss.getDataRange().getValues();
for(var n =data.length ; n<0 ; n--){
if(data[n][0]!=''){n++;break}
}
Browser.msgBox('lastRow = '+n+' duration = '+Number(new Date().getTime()-time)+' mS');
}
function fillSheet(){
var sh = SpreadsheetApp.getActiveSpreadsheet();
var ss = sh.getActiveSheet();
for(var r=1;r<1000;++r){
ss.appendRow(['filling values',r,'not important']);
}
}
And the test spreadsheet to try it yourself :-)
EDIT :
Following Mogsdad's comment, I should mention that these function names are indeed a bad choice... It should have been something like getLastNonEmptyCellInColumnAWithPlentyOfSpaceBelow()
which is not very elegant (is it ?) but more accurate and coherent with what it actually returns.
Comment :
Anyway, my point was to show the speed of execution of both approaches, and it obviously did it (didn't it ? ;-)
Upvotes: 9
Reputation: 45710
This question has now had more than 12K views - so it's time for an update, as the performance characteristics of New Sheets are different than when Serge ran his initial tests.
Good news: performance is much better across the board!
As in the first test, reading the sheet's data just once, then operating on the array, gave a huge performance benefit. Interestingly, Don's original function performed much better than the modified version that Serge tested. (It appears that while
is faster than for
, which isn't logical.)
The average execution time on the sample data is just 38ms, down from the previous 168ms.
// Don's array approach - checks first column only
// With added stopping condition & correct result.
// From answer https://stackoverflow.com/a/9102463/1677912
function getFirstEmptyRowByColumnArray() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var column = spr.getRange('A:A');
var values = column.getValues(); // get all data in one call
var ct = 0;
while ( values[ct] && values[ct][0] != "" ) {
ct++;
}
return (ct+1);
}
Here are the results, summarized over 50 iterations in a spreadsheet with 100 rows x 3 columns (filled with Serge's test function).
The function names match the code in the script below.
The original ask was to find the first empty row. None of the previous scripts actually deliver on that. Many check just one column, which means that they can give false positive results. Others only find the first row that follows all data, meaning that empty rows in non-contiguous data get missed.
Here's a function that does meet the spec. It was included in the tests, and while slower than the lightning-fast single-column checker, it came in at a respectable 68ms, a 50% premium for a correct answer!
/**
* Mogsdad's "whole row" checker.
*/
function getFirstEmptyRowWholeRow() {
var sheet = SpreadsheetApp.getActiveSheet();
var range = sheet.getDataRange();
var values = range.getValues();
var row = 0;
for (var row=0; row<values.length; row++) {
if (!values[row].join("")) break;
}
return (row+1);
}
If you want to repeat the tests, or add your own function to the mix as a comparison, just take the whole script and use it in a spreadsheet.
/**
* Set up a menu option for ease of use.
*/
function onOpen() {
var menuEntries = [ {name: "Fill sheet", functionName: "fillSheet"},
{name: "test getFirstEmptyRow", functionName: "testTime"}
];
var sh = SpreadsheetApp.getActiveSpreadsheet();
sh.addMenu("run tests",menuEntries);
}
/**
* Test an array of functions, timing execution of each over multiple iterations.
* Produce stats from the collected data, and present in a "Results" sheet.
*/
function testTime() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
ss.getSheets()[0].activate();
var iterations = parseInt(Browser.inputBox("Enter # of iterations, min 2:")) || 2;
var functions = ["getFirstEmptyRowByOffset", "getFirstEmptyRowByColumnArray", "getFirstEmptyRowByCell","getFirstEmptyRowUsingArray", "getFirstEmptyRowWholeRow"]
var results = [["Iteration"].concat(functions)];
for (var i=1; i<=iterations; i++) {
var row = [i];
for (var fn=0; fn<functions.length; fn++) {
var starttime = new Date().getTime();
eval(functions[fn]+"()");
var endtime = new Date().getTime();
row.push(endtime-starttime);
}
results.push(row);
}
Browser.msgBox('Test complete - see Results sheet');
var resultSheet = SpreadsheetApp.getActive().getSheetByName("Results");
if (!resultSheet) {
resultSheet = SpreadsheetApp.getActive().insertSheet("Results");
}
else {
resultSheet.activate();
resultSheet.clearContents();
}
resultSheet.getRange(1, 1, results.length, results[0].length).setValues(results);
// Add statistical calculations
var row = results.length+1;
var rangeA1 = "B2:B"+results.length;
resultSheet.getRange(row, 1, 3, 1).setValues([["Avg"],["Stddev"],["Trimmed\nMean"]]);
var formulas = resultSheet.getRange(row, 2, 3, 1);
formulas.setFormulas(
[[ "=AVERAGE("+rangeA1+")" ],
[ "=STDEV("+rangeA1+")" ],
[ "=AVERAGEIFS("+rangeA1+","+rangeA1+',"<"&B$'+row+"+3*B$"+(row+1)+","+rangeA1+',">"&B$'+row+"-3*B$"+(row+1)+")" ]]);
formulas.setNumberFormat("##########.");
for (var col=3; col<=results[0].length;col++) {
formulas.copyTo(resultSheet.getRange(row, col))
}
// Format for readability
for (var col=1;col<=results[0].length;col++) {
resultSheet.autoResizeColumn(col)
}
}
// Omiod's original function. Checks first column only
// Modified to give correct result.
// question https://stackoverflow.com/questions/6882104
function getFirstEmptyRowByOffset() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var cell = spr.getRange('a1');
var ct = 0;
while ( cell.offset(ct, 0).getValue() != "" ) {
ct++;
}
return (ct+1);
}
// Don's array approach - checks first column only.
// With added stopping condition & correct result.
// From answer https://stackoverflow.com/a/9102463/1677912
function getFirstEmptyRowByColumnArray() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var column = spr.getRange('A:A');
var values = column.getValues(); // get all data in one call
var ct = 0;
while ( values[ct] && values[ct][0] != "" ) {
ct++;
}
return (ct+1);
}
// Serge's getFirstEmptyRow, adapted from Omiod's, but
// using getCell instead of offset. Checks first column only.
// Modified to give correct result.
// From answer https://stackoverflow.com/a/18319032/1677912
function getFirstEmptyRowByCell() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var ran = spr.getRange('A:A');
var arr = [];
for (var i=1; i<=ran.getLastRow(); i++){
if(!ran.getCell(i,1).getValue()){
break;
}
}
return i;
}
// Serges's adaptation of Don's array answer. Checks first column only.
// Modified to give correct result.
// From answer https://stackoverflow.com/a/18319032/1677912
function getFirstEmptyRowUsingArray() {
var sh = SpreadsheetApp.getActiveSpreadsheet();
var ss = sh.getActiveSheet();
var data = ss.getDataRange().getValues();
for(var n=0; n<data.length ; n++){
if(data[n][0]==''){n++;break}
}
return n+1;
}
/**
* Mogsdad's "whole row" checker.
*/
function getFirstEmptyRowWholeRow() {
var sheet = SpreadsheetApp.getActiveSheet();
var range = sheet.getDataRange();
var values = range.getValues();
var row = 0;
for (var row=0; row<values.length; row++) {
if (!values[row].join("")) break;
}
return (row+1);
}
function fillSheet(){
var sh = SpreadsheetApp.getActiveSpreadsheet();
var ss = sh.getActiveSheet();
for(var r=1;r<1000;++r){
ss.appendRow(['filling values',r,'not important']);
}
}
// Function to test the value returned by each contender.
// Use fillSheet() first, then blank out random rows and
// compare results in debugger.
function compareResults() {
var a = getFirstEmptyRowByOffset(),
b = getFirstEmptyRowByColumnArray(),
c = getFirstEmptyRowByCell(),
d = getFirstEmptyRowUsingArray(),
e = getFirstEmptyRowWholeRow(),
f = getFirstEmptyRowWholeRow2();
debugger;
}
Upvotes: 82
Reputation: 46826
I have a similar issue. Right now it's a table with many hundreds of rows, and I'm expecting it to grow to many many thousands. (I haven't seen whether a Google spreadsheet will handle tens of thousands of rows, but I'll get there eventually.)
Here's what I'm doing.
This depends of course on having contiguous content. Can't have any random blank lines in there. Or at least, if you do, results will be sub-optimal. And you can tune the increments if you think it's important. These work for me, and I find that the difference in duration between steps of 50 and steps of 100 are negligible.
function lastValueRow() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var r = ss.getRange('A1:A');
// Step forwards by hundreds
for (var i = 0; r.getCell(i,1).getValue() > 1; i += 100) { }
// Step backwards by tens
for ( ; r.getCell(i,1).getValue() > 1; i -= 10) { }
// Step forwards by ones
for ( ; r.getCell(i,1).getValue() == 0; i--) { }
return i;
}
This is much faster than inspecting every cell from the top. And if you happen to have some other columns that extend your worksheet, it may be faster than inspecting every cell from the bottom, too.
Upvotes: 2
Reputation: 704
And why don't use appendRow?
var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
spreadsheet.appendRow(['this is in column A', 'column B']);
Upvotes: 6
Reputation: 2111
I keep an extra "maintenance" sheet, on my spreadsheets, where I keep such data.
To get the next free row of a range I just examine the relevant cell. I can get the value instantly, because the work of finding the value happens when the data is changed.
The formula in the cell is usually something like :
=QUERY(someSheet!A10:H5010,
"select min(A) where A > " & A9 & " and B is null and D is null and H < 1")
The value in A9 can be set periodically to some row that is near "enough" to the end.
Caveat : I have never checked if this is viable for huge data sets.
Upvotes: 0
Reputation: 5601
It's already there as the getLastRow method on the Sheet.
var firstEmptyRow = SpreadsheetApp.getActiveSpreadsheet().getLastRow() + 1;
Ref https://developers.google.com/apps-script/class_sheet#getLastRow
Upvotes: 42
Reputation: 577
Indeed the getValues is a good option but you can use the .length function to get the last row.
function getFirstEmptyRow() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var array = spr.getDataRange().getValues();
ct = array.length + 1
return (ct);
}
Upvotes: 1
Reputation: 56610
The Google Apps Script blog had a post on optimizing spreadsheet operations that talked about batching reads and writes that could really speed things up. I tried your code on a spreadsheet with 100 rows, and it took about seven seconds. By using Range.getValues()
, the batch version takes one second.
function getFirstEmptyRow() {
var spr = SpreadsheetApp.getActiveSpreadsheet();
var column = spr.getRange('A:A');
var values = column.getValues(); // get all data in one call
var ct = 0;
while ( values[ct][0] != "" ) {
ct++;
}
return (ct);
}
If the spreadsheet gets big enough, you might need to grab the data in chunks of 100 or 1000 rows instead of grabbing the entire column.
Upvotes: 58