Reputation: 60
Hey StackOverflow community, I am currently trying to write a little tool that reads a shapefile's geometries (MultiPolygons / Polygons) and writes their WKT representations to a text file. I am using GeoTools for this and got it running fine, but because I am converting files with about 5,000,000 Polygons / MultiPolygons, it takes pretty long to finish.
So my question is:
Is it possible to speed up the file loading/writing? As I am using a SimpleFeatureIterator, I could not figure out how to implement multithreading.
Is there a way to do so? Or does anyone know how to get the shapefile's geometries without using an iterator?
This is my code:
This method just opens the file chooser and starts a thread for each selected file.
protected static void printGeometriesToFile() {
    JFileChooser chooser = new JFileChooser();
    FileNameExtensionFilter filter = new FileNameExtensionFilter(
            "shape-files", "shp");
    chooser.setFileFilter(filter);
    chooser.setDialogTitle("Choose the file to be converted.");
    chooser.setMultiSelectionEnabled(true);
    int returnVal = chooser.showOpenDialog(null);
    if (returnVal != JFileChooser.APPROVE_OPTION) {
        // Dialog was cancelled: nothing to convert (avoids a NullPointerException below)
        return;
    }
    File[] files = chooser.getSelectedFiles();
    for (int i = 0; i < files.length; i++) {
        MultiThreadWriter writer = new MultiThreadWriter(files[i]);
        writer.start();
    }
}
The class for multithreading:
class MultiThreadWriter extends Thread {
    private File threadFile;
    MultiThreadWriter(File file) {
        threadFile = file;
        System.out.println("Starting Thread for " + file.getName());
    }
    public void run() {
        try {
            File outputFolder = new File(threadFile.getAbsolutePath() + ".txt");
            FileOutputStream fos = new FileOutputStream(outputFolder);
            System.out.println("Now writing data to file: " + outputFolder.getName());
            FileDataStore store = FileDataStoreFinder.getDataStore(threadFile);
            SimpleFeatureSource featureSource = store.getFeatureSource();
            SimpleFeatureCollection featureCollection = featureSource.getFeatures();
            SimpleFeatureIterator featureIterator = featureCollection.features();
            int pos = 0;
            while (featureIterator.hasNext()) {
                fos.write((geometryToByteArray((Polygonal) featureIterator.next().getAttribute("the_geom"))));
                pos++;
                System.out.println("The file " + threadFile.getName() + "'s current position is: " + pos);
            }
            fos.close();
            System.out.println("Finished writing.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
This is just a helper function that converts MultiPolygons to Polygons and returns their WKT representation with "|" as a separator.
private byte[] geometryToByteArray(Polygonal polygonal) {
    List<Polygon> polygonList;
    String polygonString = "";
    if (polygonal instanceof MultiPolygon) {
        polygonList = GeometrieUtils.convertMultiPolygonToPolygonList((MultiPolygon) polygonal);
        // The method above just converts a MultiPolygon into a list of Polygons
    } else {
        polygonList = new ArrayList<>(1);
        polygonList.add((Polygon) polygonal);
    }
    for (int i = 0; i < polygonList.size(); i++) {
        polygonString = polygonString + polygonList.get(i).toString() + "|";
    }
    return polygonString.getBytes();
}
}
I know my code is not pretty or good. I have just started learning Java and hope it will become better soon.
sincerely
ihavenoclue :)
Upvotes: 1
Views: 339
Reputation: 28269
You do not need to create a new thread for every file, because creating a new thread is an expensive operation. Instead, you can let MultiThreadWriter implement Runnable and use a ThreadPoolExecutor to manage all threads.
MultiThreadWriter
public class MultiThreadWriter implements Runnable {
    @Override
    public void run() {
        //
    }
}
Create a thread pool whose size matches the number of available processors.
ExecutorService service = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
for (int i = 0; i < files.length; i++) {
    MultiThreadWriter writer = new MultiThreadWriter(files[i]);
    service.submit(writer);
}
// Stop accepting new tasks; already submitted ones still run to completion
service.shutdown();
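For reference, here is a minimal, self-contained sketch of the full pool lifecycle (submit, shut down, wait for completion). The class name PoolSketch and the method runAll are hypothetical, and a simple counter stands in for the real per-file conversion, so treat it as an illustration rather than the actual writer:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolSketch {
    // Runs one task per file name on a fixed-size pool and
    // returns how many tasks completed.
    static int runAll(List<String> fileNames) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService service = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (String name : fileNames) {
            // Placeholder (assumption): the real task would convert the file called `name`.
            service.submit(() -> {
                done.incrementAndGet();
            });
        }
        // No new tasks are accepted after this; queued tasks still run.
        service.shutdown();
        try {
            // The one-hour limit is an arbitrary choice for this sketch.
            if (!service.awaitTermination(1, TimeUnit.HOURS)) {
                service.shutdownNow();
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and cancel what is left.
            Thread.currentThread().interrupt();
            service.shutdownNow();
        }
        return done.get();
    }
}
```

Calling PoolSketch.runAll(Arrays.asList("a.shp", "b.shp", "c.shp")) runs three placeholder tasks and returns once all of them have finished. Without the shutdown() call, the pool's worker threads would keep the JVM alive after the work is done.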
You can use a BufferedWriter instead of writing to the OutputStream directly; it is more efficient when you repeatedly write small pieces.
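File outputFolder = new File(threadFile.getAbsolutePath() + ".txt");
FileOutputStream fos = new FileOutputStream(outputFolder);
// BufferedWriter wraps a Writer, so the byte stream is bridged with an OutputStreamWriter
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fos, StandardCharsets.UTF_8));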
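As a rough sketch of how the buffered write loop could look end to end: the class BufferedWktWriter and the method writeAll are hypothetical names, the WKT strings are assumed to be already available as plain Java strings, and the JDK's Files.newBufferedWriter is used here as a shortcut for opening the writer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class BufferedWktWriter {
    // Writes every WKT string followed by '|' through one buffered writer.
    static void writeAll(Path target, List<String> wktStrings) {
        // try-with-resources flushes and closes the writer even if a write fails.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String wkt : wktStrings) {
                writer.write(wkt);
                writer.write('|');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The many small write calls end up in the writer's in-memory buffer and only reach the disk in larger blocks, which is where the gain over an unbuffered FileOutputStream comes from.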
Upvotes: 1
Reputation: 1748
I would prefer to read the file's content into a list of objects, split that list into sublists, and then create a thread for each sublist. Example:
int nbrThreads = 10;
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(nbrThreads);
int count = myObjectsList != null ? myObjectsList.size() / nbrThreads : 0;
List<List<MyObject>> resultlists = choppeList(myObjectsList, count > 0 ? count : 1);
try
{
    for (List<MyObject> list : resultlists)
    {
        // TODO: create your task and pass it the list of objects
    }
    executor.shutdown();
    executor.awaitTermination(30, TimeUnit.MINUTES); // choose a suitable timeout
}
catch (Exception e)
{
    LOG.error("Problem launching threads", e);
}
The choppeList method could look like this:
public <T> List<List<T>> choppeList(final List<T> list, final int L)
{
    final List<List<T>> parts = new ArrayList<List<T>>();
    final int N = list.size();
    for (int i = 0; i < N; i += L)
    {
        parts.add(new ArrayList<T>(list.subList(i, Math.min(N, i + L))));
    }
    return parts;
}
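To make the TODO above more concrete, here is one possible way to hand each sublist to the pool and collect the results in the original order. All names (ChunkedConversion, chop, convertInChunks) are hypothetical, and turning an Integer into a string stands in for the real geometry-to-WKT conversion, so this is a sketch under those assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedConversion {
    // Same idea as choppeList: split a list into parts of at most `size` elements.
    static <T> List<List<T>> chop(List<T> list, int size) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            parts.add(new ArrayList<>(list.subList(i, Math.min(list.size(), i + size))));
        }
        return parts;
    }

    // Converts every chunk on the pool; results keep the input order.
    static List<String> convertInChunks(List<Integer> items, int nbrThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(nbrThreads);
        try {
            int chunkSize = Math.max(1, items.size() / nbrThreads);
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<Integer> chunk : chop(items, chunkSize)) {
                Callable<List<String>> task = () -> {
                    List<String> out = new ArrayList<>();
                    for (Integer item : chunk) {
                        out.add("item-" + item); // placeholder for the real conversion
                    }
                    return out;
                };
                futures.add(executor.submit(task));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                result.addAll(future.get()); // blocks until that chunk is done
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the futures are read back in submission order, the combined output matches the input order even though the chunks run in parallel. Note that this approach holds the whole list in memory, which may matter for files with several million geometries.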
Upvotes: 0