Harpreet
Harpreet

Reputation: 63

Apache POI: Find out colspan of text overflowing to next column

I'm using Apache POI for converting Excel to HTML. I want the HTML table to be exactly as it appears in Excel. I see that many times there are cases where data in Excel cell overflows to next column. Is there a way to find out how many columns is the data spanning so I can add a colspan attribute to the TD tag? See this screenshot from Excel:

enter image description here

In this case, the data in cell A2 is overflowing to cell B2 (also A3 overflowing to B3). Is there a way to find out that the TD tag corresponding to cell A2 needs colspan="2" attribute?

The cells where the data is overflowing are not merged so I cannot really use functions like sheet.getNumMergedRegions()

I think I can also calculate it if somehow I can find out the "visible" width of columns in Excel. However, sheet.getColumnWidth() only provides the actual width. I don't see a method to find out the "visible" width of a column in Excel. In the screenshot link above the visible width of column A is very small. Is there a way to find that "visible" width?

I'm using Apache POI 3.17

Upvotes: 0

Views: 1455

Answers (1)

Axel Richter
Axel Richter

Reputation: 61870

Apache poi is able autosize the columns according to their contents. So it needs to be able calculating what column width a special content would need. This is what SheetUtil.getCellWidth is doing.

Additionally one needs to know the very special measurement units Microsoft has introduced for column widths in Excel. In Excels GUI a column width of 10 for example means that 10 characters of default character width fits into the cell width. But internally the width is calculated in units of 1/256th of a default character width. That's why apache poi decided to get Sheet.getColumnWidth in units of 1/256th of a character width.

So if you have got a Cell cell having cell index c and having a special content, then using

Workbook workbook...
...
DataFormatter dataFormatter = new DataFormatter();
...
int defaultCharWidth = SheetUtil.getDefaultCharWidth(workbook);
...
double cellValueWidth = SheetUtil.getCellWidth(cell, defaultCharWidth, dataFormatter, false);
int neededColunmnWidth = (int)cellValueWidth*256;
int columnWidth = sheet.getColumnWidth(c);
...

you could determine whether the content fits into the cell. It fits if columnWidth >= neededColunmnWidth, else it fits not and colspan must be used.

Lets have a complete example to show the principle:

Sheet:

enter image description here

Code:

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.SheetUtil;

import java.io.*;

class ExcelToHTMLColspan {

 public static void main(String[] args) throws Exception{

  Workbook workbook = WorkbookFactory.create(new FileInputStream("Test.xlsx"));

  DataFormatter dataFormatter = new DataFormatter();

  int defaultCharWidth = SheetUtil.getDefaultCharWidth(workbook);

  int lastColumnToExport = 5; // column E

  Sheet sheet = workbook.getSheetAt(0);

  Row row;
  Cell cell;
  String cellValue;

  StringBuilder tableHTML = new StringBuilder();

  tableHTML.append("<TABLE>");
  tableHTML.append("<COLGROUP>");
  for (int c = 0; c < lastColumnToExport; c++) {
   long columnWidthPx = Math.round(sheet.getColumnWidthInPixels(c));
   tableHTML.append("<COL width=\"" + columnWidthPx + "\"/>");
  }
  tableHTML.append("</COLGROUP>");

  for (int r = 0; r <= sheet.getLastRowNum(); r++) {
   row = sheet.getRow(r); if (row == null) row = sheet.createRow(r);
   long rowHeightPx = Math.round(row.getHeightInPoints() * 92f / 72f);
   tableHTML.append("<TR height=\"" + rowHeightPx + "\">");
   int c = 0;
   while(c < lastColumnToExport) {
    tableHTML.append("<TD");
    cell = row.getCell(c); if (cell == null) cell = row.createCell(c);
    cellValue = dataFormatter.formatCellValue(cell);
    double cellValueWidth = SheetUtil.getCellWidth(cell, defaultCharWidth, dataFormatter, false);
    int neededColunmnWidth = (int)cellValueWidth*256;
    int columnWidth = sheet.getColumnWidth(c);
    if (columnWidth < neededColunmnWidth) {
     int colSpan = 1;
     while(columnWidth < neededColunmnWidth) {
      colSpan++;
      c++;
      columnWidth += sheet.getColumnWidth(c);
     } 
     tableHTML.append(" colspan=\"" + colSpan + "\""  + ">" + cellValue);
     c++;
    } else {
     tableHTML.append(">" + cellValue);
     c++;
    }
    tableHTML.append("</TD>");
   }
   tableHTML.append("</TR>");
  }

  tableHTML.append("</TABLE>");

  workbook.close();

System.out.println(tableHTML.toString());

  //creating a sample HTML file 
  String encoding = "UTF-8";
  FileOutputStream fos = new FileOutputStream("result.html");
  OutputStreamWriter writer = new OutputStreamWriter(fos, encoding);
  writer.write("<!DOCTYPE html>\n");
  writer.write("<html lang=\"en\">");
  writer.write("<head>");
  writer.write("<meta charset=\"utf-8\"/>");
  writer.write("<style>");
  writer.write("table {border-collapse: collapse; table-layout: fixed;}");
  writer.write("table, tr, td {border: 1px solid black;}");
  writer.write("td {font: 11pt Calibri, arial, sans-serif;}");
  writer.write("</style>");
  writer.write("</head>");
  writer.write("<body>");

  writer.write(tableHTML.toString());

  writer.write("</body>");
  writer.write("</html>");
  writer.close();

  java.awt.Desktop.getDesktop().browse(new File("result.html").toURI());

 }
}

Result:

enter image description here

Upvotes: 2

Related Questions