Reputation: 989
I use the following code in order to autosize columns in my spreadsheet:
for (int i = 0; i < columns.size(); i++) {
sheet.autoSizeColumn(i, true);
sheet.setColumnWidth(i, sheet.getColumnWidth(i) + 600);
}
The problem is that it takes more than 10 minutes to autosize each column on large spreadsheets with more than 3000 rows, although it is very fast for small documents. Is there anything that could make autosizing faster?
Upvotes: 33
Views: 42027
Reputation: 26
The biggest problem I encountered with XSSFSheet is that it doesn't have a function to return all values of a specific column. You have to iterate row by row, so when you specify a column, it actually iterates over all rows and columns. That is probably where the slowness comes from.
I made my own implementation that iterates over all rows and columns only once and then resizes the columns specified in listOfColumns (or every column when listOfColumns is null). I also added maxWidth for cases where you work with long texts, so you don't end up with crazy widths. It would probably be a good idea to also account for font width, as Zz'Rot does in his implementation.
private static void autoSizeColumns(XSSFSheet sheet, Integer maxWidth, List<Integer> listOfColumns) {
HashMap<Integer,Integer> sizesArray = new HashMap<>();
for (Row row : sheet) {
for (Cell cell : row) {
int column = cell.getColumnIndex();
int oldMaxSize = Optional.ofNullable(sizesArray.get(column)).orElse(0);
int newMaxSize = switch (cell.getCellType()) {
case STRING,FORMULA ->
Math.max(oldMaxSize, cell.getStringCellValue().length());
case NUMERIC ->
Math.max(oldMaxSize, Double.toString(cell.getNumericCellValue()).length());
case BOOLEAN ->
Math.max(oldMaxSize, Boolean.toString(cell.getBooleanCellValue()).length());
case ERROR ->
Math.max(oldMaxSize, Byte.toString(cell.getErrorCellValue()).length());
default ->
oldMaxSize;
};
if(oldMaxSize != newMaxSize)
sizesArray.put(column, newMaxSize);
}
}
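if (listOfColumns == null) {
// No explicit list: resize every column that actually contained a cell.
for (Map.Entry<Integer, Integer> entry : sizesArray.entrySet()) {
int width = entry.getValue() * 256;
if (maxWidth != null && width > maxWidth) {
width = maxWidth;
}
sheet.setColumnWidth(entry.getKey(), width);
}
return;
}
for (Integer column : listOfColumns) {
Integer size = sizesArray.get(column);
if (size == null) {
continue; // column has no cells, so there is nothing to measure
}
int width = size * 256; // use the measured length itself, not a second map lookup
if (maxWidth != null && width > maxWidth) {
width = maxWidth;
}
sheet.setColumnWidth(column, width);
}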
}
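The single-pass idea above can be shown without POI at all. The following is only a minimal sketch: it assumes the cell values have already been read into plain strings, and the class and method names (SinglePassColumnLengths, maxLengths) are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SinglePassColumnLengths {

    // Visits every cell exactly once and records, per column index,
    // the length of the longest value seen so far.
    static Map<Integer, Integer> maxLengths(List<List<String>> rows) {
        Map<Integer, Integer> maxByColumn = new HashMap<>();
        for (List<String> row : rows) {
            for (int col = 0; col < row.size(); col++) {
                String value = row.get(col);
                int length = (value == null) ? 0 : value.length();
                // merge() keeps the larger of the stored and the new length
                maxByColumn.merge(col, length, Integer::max);
            }
        }
        return maxByColumn;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("a", "bbbb"),
                List.of("ccc", "d", "eeeee"));
        System.out.println(maxLengths(rows));
    }
}
```

Each entry of the resulting map could then be multiplied by 256 and passed to sheet.setColumnWidth, as in the method above.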
Upvotes: 0
Reputation: 69
Unfortunately I don't have enough reputation yet to comment on answers, so here are some notes:
Row row = sheet.getRow(sheet.getFirstRowNum());
Be sure this row contains a value in the last column. Otherwise the cellIterator will end too early: if a subsequent row has a value in that column, the column will not be autosized. This problem is avoided if row contains the headers (the names of the columns). Alternatively, explicitly use a known header row, e.g.
int indexOfHeaderRow = ...;
...
Row row = sheet.getRow(indexOfHeaderRow);
sheet.setColumnWidth(i, Math.min(width, MAX_CELL_WIDTH)); // <- set calculated cellwidth
sheet.setColumnWidth(i, Math.max(width, 2048));
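If the intent of the two lines above is to keep the computed width within both an upper and a lower bound, the two calls can be combined into one. This is only a sketch under that assumption: the 2048 lower bound is taken from the snippet, while MAX_CELL_WIDTH is not defined in this thread, so 255 * 256 (the largest width POI accepts for a column) is assumed here. The class and method names are hypothetical.

```java
public class WidthClamp {

    // Lower bound taken from the snippet above.
    static final int MIN_CELL_WIDTH = 2048;
    // Assumption: the largest column width POI accepts (255 characters * 256 units).
    static final int MAX_CELL_WIDTH = 255 * 256;

    // Returns width limited to the range [MIN_CELL_WIDTH, MAX_CELL_WIDTH].
    static int clampWidth(int width) {
        return Math.max(MIN_CELL_WIDTH, Math.min(width, MAX_CELL_WIDTH));
    }

    public static void main(String[] args) {
        System.out.println(clampWidth(100));
        System.out.println(clampWidth(5000));
        System.out.println(clampWidth(100000));
    }
}
```

With a helper like this, a single sheet.setColumnWidth(i, clampWidth(width)) call would replace the pair of calls, so the second call no longer overrides the first.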
Upvotes: 0
Reputation: 1553
The autoSizeColumn() function is very slow and inefficient. Even the authors of Apache POI mention in the docs that:
This process can be relatively slow on large sheets, ...
Calculating and setting the cell's width manually is way faster - in my case I reduced the time from ~25,000ms to ~1-5ms.
This is how to achieve it (I based this on Vladimir Shcherbukhin's answer):
Workbook workbook = new XSSFWorkbook();
Sheet sheet = workbook.createSheet();
final int[] maxNumCharactersInColumns = new int[headers.length]; // maximum number of characters in columns. Necessary to calculate the cell width in most efficient way. sheet.autoSizeColumn(...) is very slow.
Row headersRow = sheet.createRow(0);
CellStyle headerStyle = createHeadersStyle(workbook); // createHeadersStyle() is my own function. Create headers style if you want
for (int i = 0; i < headers.length; i++) { // create headers
Cell headerCell = headersRow.createCell(i, CELL_TYPE_STRING);
headerCell.setCellValue(headers[i]);
headerCell.setCellStyle(headerStyle);
int length = headers[i].length();
if (maxNumCharactersInColumns[i] < length) { // adjust the columns width
maxNumCharactersInColumns[i] = length + 2; // you can add +2 if you have filtering enabled on your headers
}
}
int rowIndex = 1;
for (List<Object> rowValues : rows) {
Row row = sheet.createRow(rowIndex);
int columnIndex = 0;
for (Object value : rowValues) {
Cell cell = createRowCell(row, value, columnIndex); // createRowCell() is my own function.
int length;
if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
String cellValue = cell.getStringCellValue();
// this is quite important part. In some excel spreadsheet you can have a values with line-breaks. It'll be cool to handle that scenario :)
String[] arr = cellValue.split("\n"); // if cell contains complex value with line breaks, calculate only the longest line
length = Arrays.stream(arr).mapToInt(String::length).max().orElse(0); // orElse(0) avoids an exception when split() returns an empty array, e.g. for "\n"
} else {
length = value != null ? value.toString().length() : 0;
}
if (maxNumCharactersInColumns[columnIndex] < length) { // if the current cell value is the longest one, save it to an array
maxNumCharactersInColumns[columnIndex] = length;
}
columnIndex++;
}
rowIndex++;
}
for (int i = 0; i < headers.length; i++) {
int width = (int) (maxNumCharactersInColumns[i] * 1.45f) * 256; // 1.45f <- you can change this value
sheet.setColumnWidth(i, Math.min(width, MAX_CELL_WIDTH)); // <- set calculated cell width
}
sheet.setAutoFilter(new CellRangeAddress(0, 0, 0, headers.length - 1));
ByteArrayOutputStream output = new ByteArrayOutputStream();
workbook.write(output);
workbook.close();
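The line-break handling is the part of the code above that is easiest to get subtly wrong, so here it is in isolation. This is a minimal sketch with a made-up helper name (longestLineLength). Unlike the code above, it splits on "\\R" (any line break, including "\r\n") instead of only "\n", which is my own choice and not part of the original.

```java
import java.util.Arrays;

public class LongestLine {

    // Returns the length of the longest line in a multi-line cell value,
    // or 0 for null or empty input.
    static int longestLineLength(String cellValue) {
        if (cellValue == null || cellValue.isEmpty()) {
            return 0;
        }
        // The -1 limit keeps trailing empty lines, so the array is never empty.
        return Arrays.stream(cellValue.split("\\R", -1))
                .mapToInt(String::length)
                .max()
                .orElse(0);
    }

    public static void main(String[] args) {
        System.out.println(longestLineLength("ab\ncdef\ng"));
        System.out.println(longestLineLength("ab\r\ncdef"));
    }
}
```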
Upvotes: 0
Reputation: 29
The autoSizeColumn function itself doesn't work perfectly, and some column widths don't exactly fit the data inside. So I found a solution that works for me.
sheet.autoSizeColumn(<columnIndex>);
// get autosized column width
int currentColumnWidth = sheet.getColumnWidth(<columnIndex>);
// add custom value to the current width and apply it to column
sheet.setColumnWidth(<columnIndex>, (currentColumnWidth + 2500));
public void autoSizeColumns(Workbook workbook) {
int numberOfSheets = workbook.getNumberOfSheets();
for (int i = 0; i < numberOfSheets; i++) {
Sheet sheet = workbook.getSheetAt(i);
if (sheet.getPhysicalNumberOfRows() > 0) {
Row row = sheet.getRow(sheet.getFirstRowNum());
Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
int columnIndex = cell.getColumnIndex();
sheet.autoSizeColumn(columnIndex);
int currentColumnWidth = sheet.getColumnWidth(columnIndex);
sheet.setColumnWidth(columnIndex, (currentColumnWidth + 2500));
}
}
}
}
P.S. Thanks Ondrej Kvasnovsky for the function https://stackoverflow.com/a/35324693/13087091
Upvotes: 2
Reputation: 989
Solution which worked for me:
It was possible to avoid merged regions, so I could iterate through the other cells and finally autosize to the largest cell like this:
int width = ((int)(maxNumCharacters * 1.14388)) * 256;
sheet.setColumnWidth(i, width);
where 1.14388 is the maximum character width of the "Serif" font, and 256 converts characters into the units POI uses for column widths (1/256th of a character width).
Performance of autosizing improved from 10 minutes to 6 seconds.
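For completeness, here is a rough sketch of how maxNumCharacters and the width could be computed. It is not the exact code I used: the class and method names are invented for illustration, the cell values are assumed to be available as plain strings, and the cap at 255 * 256 is added because setColumnWidth rejects anything larger.

```java
import java.util.List;

public class ColumnWidthEstimator {

    // POI expresses column widths in 1/256ths of a character width.
    static final int UNITS_PER_CHAR = 256;
    // A column can be at most 255 characters wide.
    static final int MAX_COLUMN_WIDTH = 255 * UNITS_PER_CHAR;

    // Length of the longest value in one column (0 for an empty column).
    static int maxNumCharacters(List<String> columnValues) {
        int max = 0;
        for (String value : columnValues) {
            if (value != null) {
                max = Math.max(max, value.length());
            }
        }
        return max;
    }

    // Same formula as above, capped at the maximum width POI accepts.
    static int estimateWidth(int maxNumCharacters, double charWidthFactor) {
        int width = ((int) (maxNumCharacters * charWidthFactor)) * UNITS_PER_CHAR;
        return Math.min(width, MAX_COLUMN_WIDTH);
    }

    public static void main(String[] args) {
        int longest = maxNumCharacters(List.of("id", "customer name", "x"));
        System.out.println(estimateWidth(longest, 1.14388));
    }
}
```

The result of estimateWidth would then be passed to sheet.setColumnWidth(i, width). The factor 1.14388 only fits the font I used, so another font will likely need a different value.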
Upvotes: 55