Reputation: 15
I am trying to use Apache POI to read an Excel file that has two columns: title and language. The title column holds sentences in some language, and the language column is empty. After Apache POI reads the sentence from the title column, it should store it in a variable and then pass it to the language-detection library (https://code.google.com/archive/p/language-detection/). I am getting an error on the lines around the switch/case statement.
import java.util.ArrayList;
import com.cybozu.labs.langdetect.Detector;
import com.cybozu.labs.langdetect.DetectorFactory;
import com.cybozu.labs.langdetect.Language;
import java.util.Scanner;
import com.cybozu.labs.langdetect.LangDetectException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class LangDetectSample {
    public static void main(String[] args) throws IOException, LangDetectException {
        String excelFilePath = "C:\\LD\\Books.xlsx";
        FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
        Workbook workbook = new XSSFWorkbook(inputStream);
        Sheet firstSheet = workbook.getSheetAt(0); // Assuming that the data is in sheet one
        Iterator<Row> iterator = firstSheet.iterator();
        DataFormatter formatter = new DataFormatter();
        LangDetectSample lang = new LangDetectSample();
        // Load the language profiles once; reloading them on every detect() call
        // is expected to fail because the profiles are already registered
        lang.init("C:\\LD\\profiles");
        //creating variables
        String title;
        String language;
        int rowNumber;
        //Blank workbook
        XSSFWorkbook wb = new XSSFWorkbook(); //new workbook //fixed
        //Create a blank sheet
        Sheet sheet1 = wb.createSheet("Predicted language"); //fixed
        while (iterator.hasNext())
        {
            Row nextRow = iterator.next();
            rowNumber = nextRow.getRowNum();
            Cell cell = nextRow.getCell(2); // title is in column 2
            switch (cell.getCellType()) {
                case Cell.CELL_TYPE_STRING:
                    title = cell.getStringCellValue();
                    break;
                case Cell.CELL_TYPE_BOOLEAN:
                    title = formatter.formatCellValue(cell);
                    break;
                case Cell.CELL_TYPE_NUMERIC:
                    title = formatter.formatCellValue(cell);
                    break;
            }
            System.out.print(title);
            //Title should now have the title.
            // Call the language detector:
            language = lang.detect(title);
            System.out.println(language);
            // if language detected, attempt to output the result to the new excel file with the following commands:
            // Write the title, language
            Row row = sheet1.createRow(rowNumber); //changed var
            Cell cell2 = row.createCell(2); //changed variable name
            cell2.setCellValue(title);
            Cell cell3 = row.createCell(3);
            cell3.setCellValue(language);
        }
        try {
            //Write the new workbook (not the input workbook) to the file system
            FileOutputStream out = new FileOutputStream(new File("title-language.xlsx"));
            wb.write(out);
            out.close();
        } catch (Exception e)
        {
            e.printStackTrace();
        }
        wb.close();
        workbook.close();
        inputStream.close();
    }
    public void init(String profileDirectory) throws LangDetectException {
        DetectorFactory.loadProfile(profileDirectory);
    }
    public String detect(String text) throws LangDetectException {
        // Profiles must already be loaded via init()
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.detect();
    }
    public ArrayList<Language> detectLangs(String text) throws LangDetectException {
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.getProbabilities();
    }
}
The error I am getting is
variable title may not have been initialised
Upvotes: 0
Views: 2878
Reputation: 641
I think the problem is in your case labels. In later POI versions (for example 4.0.1), Cell.CELL_TYPE_NUMERIC is now just NUMERIC, and the same goes for the other cell types. Remove the CELL_TYPE_ prefix:
switch (cell.getCellType()) {
    case STRING:
        title = cell.getStringCellValue();
        break;
    case BOOLEAN:
        title = formatter.formatCellValue(cell);
        break;
    case NUMERIC:
        title = formatter.formatCellValue(cell);
        break;
}
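A note on the compiler message itself: as far as I can tell, renaming the case labels does not by itself give title a value on every path. If the cell type matches none of the listed cases (a blank or formula cell, say), the switch assigns nothing, which is what "may not have been initialised" is complaining about. Below is a minimal sketch of the usual remedy, a default branch. The enum Kind and the method describe are hypothetical stand-ins so the example runs without POI on the classpath; they are not part of the POI API.

```java
public class DefiniteAssignmentSketch {
    // Hypothetical stand-in for a cell-type enum, so the sketch runs without POI.
    public enum Kind { STRING, BOOLEAN, NUMERIC, BLANK }

    // Returns the text for the supported kinds and an empty string otherwise.
    public static String describe(Kind kind, String rawText) {
        String title; // deliberately not initialised here
        switch (kind) {
            case STRING:
            case BOOLEAN:
            case NUMERIC:
                title = rawText;
                break;
            default:
                // Without this branch the compiler rejects "return title"
                // because some paths through the switch assign nothing.
                title = "";
                break;
        }
        return title;
    }

    public static void main(String[] args) {
        System.out.println(describe(Kind.STRING, "Hello"));            // prints Hello
        System.out.println(describe(Kind.BLANK, "ignored").isEmpty()); // prints true
    }
}
```

Initialising the variable at its declaration (String title = "";) should have the same effect; the default branch just makes the fallback explicit.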
Upvotes: 0
Reputation: 23
For your first error, when checking the boolean case, declare the variable with type Object, e.g.:
Object title;
switch (cell.getCellType()) {
    case Cell.CELL_TYPE_BOOLEAN:
        title = cell.getBooleanCellValue();
        break;
}
For your second error: POI returns numeric cell values as double by default, so you need to convert the value to text (a String), for example with the following (requires import java.text.DecimalFormat):
Object title = "";
title = new DecimalFormat("0").format(cell.getNumericCellValue());
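To show what the "0" pattern does on its own, here is a small sketch that uses only the JDK. The class and method names are made up for illustration, and passing Locale.ROOT symbols is my own addition so the digits come out the same regardless of the machine's default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class NumericToText {
    // Formats a double as a whole number with no fractional part.
    public static String asWholeNumber(double value) {
        DecimalFormat format =
                new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(asWholeNumber(42.0));             // prints 42
        System.out.println(String.valueOf(42.0));            // prints 42.0
        System.out.println(asWholeNumber(1234567890123.0));  // prints 1234567890123
        System.out.println(String.valueOf(1234567890123.0)); // prints 1.234567890123E12
        System.out.println(asWholeNumber(3.7));              // prints 4 (rounded)
    }
}
```

Keep in mind that the "0" pattern rounds away any fractional part, so it only suits cells that are meant to hold whole numbers.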
Hope this helps.
Upvotes: 1