Reputation: 97
In a legacy application I have a Vector that keeps a chronological list of files to process and multiple threads ask it for the next file to process. (Note that I realize that there are likely better collections to use (feel free to suggest), but I don't have time for a change of that magnitude right now.)
At a scheduled interval, another thread checks the working directory to see if any files appear to have been orphaned because something went wrong. The method called by this thread occasionally throws a ConcurrentModificationException if the system is abnormally busy. So I know that at least two threads are trying to use the Vector at once.
Here is the code. I believe the issue is the use of the clone() on the returned Vector.
private synchronized boolean isFileInDataStore( File fileToCheck ){
boolean inFile = false;
for( File wf : (Vector<File>)m_dataStore.getFileList().clone() ){
File zipName = new File( Tools.replaceFileExtension(fileToCheck.getAbsolutePath(), ZIP_EXTENSION) );
if(wf.getAbsolutePath().equals(zipName.getAbsolutePath()) ){
inFile = true;
break;
}
}
return inFile;
}
The getFileList() method is as follows:
public synchronized Vector<File> getFileList() {
synchronized(fileList){
return fileList;
}
}
As a quick fix, would changing the getFileList method to return a copy of the vector as follows suffice?
public synchronized Vector<File> getFileListCopy() {
synchronized(fileList){
return (Vector<File>)fileList.clone();
}
}
I must admit that I am generally confused by the use of synchronized in Java as it pertains to collections, as simply declaring the method as such is not enough. As a bonus question, is declaring the method as synchronized and wrapping the return call with another synchronized block just crazy coding? Looks redundant.
EDIT: Here are the other methods which touch the list.
public synchronized boolean addFile(File aFile) {
boolean added = false;
synchronized(fileList){
if( !fileList.contains(aFile) ){
added = fileList.add(aFile);
}
}
notifyAll();
return added;
}
public synchronized void removeFile( File dirToImport, File aFile ) {
if(aFile!=null){
synchronized(fileList){
fileList.remove(aFile);
}
// Create a dummy list so I can synchronize it.
List<File> zipFiles = new ArrayList<File>();
synchronized(zipFiles){
// Populate with actual list
zipFiles = (List<File>)diodeTable.get(dirToImport);
if(zipFiles!=null){
zipFiles.remove(aFile);
// Repopulate list if the number falls below the number of importer threads.
if( zipFiles.size()<importerThreadCount ){
diodeTable.put(dirToImport, getFileList( dirToImport ));
}
}
}
notifyAll();
}
}
Upvotes: 3
Views: 870
Reputation: 169
First, you should move your logic to whatever class is m_dataStore if you haven't.
Once you've done that, make your list final, and synchronize on it ONLY if you are modifying its elements. Threads that only need to read it, don't need synchronized access. They may end up polling an outdated list, but I suppose that is not a problem. This gets you increased performance.
As far as I can tell, you would only need to synchronize when adding and removing, and only need to lock your list.
e.g.
package answer;
import java.util.logging.Level; import java.util.logging.Logger;
public class Example {
public static void main(String[] args)
{
Example c = new Example();
c.runit();
}
public void runit()
{
Thread.currentThread().setName("Thread-1");
new Thread("Thread-2")
{
@Override
public void run() {
test1(true);
}
}.start();
// Force a scenario where Thread-1 allows Thread-2 to acquire the lock
try {
Thread.sleep(1000);
} catch (InterruptedException ex) {
Logger.getLogger(Example.class.getName()).log(Level.SEVERE, null, ex);
}
// At this point, Thread-2 has acquired the lock, but it has entered its wait() method, releasing the lock
test1(false);
}
public synchronized void test1(boolean wait)
{
System.out.println( Thread.currentThread().getName() + " : Starting...");
try {
if (wait)
{
// Apparently the current thread is supposed to wait for some other thread to do something...
wait();
} else {
// The current thread is supposed to keep running with the lock
doSomeWorkThatRequiresALockLikeRemoveOrAdd();
System.out.println( Thread.currentThread().getName() + " : Our work is done. About to wake up the other thread(s) in 2s...");
Thread.sleep(2000);
// Tell Thread-2 that it we have done our work and that they don't have to spare the CPU anymore.
// This essentially tells it "hey don't wait anymore, start checking if you can get the lock"
// Try commenting this line and you will see that Thread-2 never wakes up...
notifyAll();
// This should show you that Thread-1 will still have the lock at this point (even after calling notifyAll).
//Thread-2 will not print "after wait/notify" for as long as Thread-1 is running this method. The lock is still owned by Thread-1.
Thread.sleep(1000);
}
System.out.println( Thread.currentThread().getName() + " : after wait/notify");
} catch (InterruptedException ex) {
Logger.getLogger(Example.class.getName()).log(Level.SEVERE, null, ex);
}
}
private void doSomeWorkThatRequiresALockLikeRemoveOrAdd()
{
// Do some work that requires a lock like remove or add
}
}
Upvotes: 0
Reputation: 42617
"I am generally confused by the use of synchronized in Java as it pertains to collections, as simply declaring the method as such is not enough"
Correct. synchronized
on a method means that only one thread at a time may enter the method. But if the same collection is visible from multiple methods, then this doesn't help much.
To prevent two threads accessing the same collection at the same time, they need to synchronize on the same object - e.g. the collection itself. You have done this in some of your methods, but isFileInDataStore
appears to access a collection returned by getFileList
without synchronizing on it.
Note that obtaining the collection in a synchronized manner, as you have done in getFileList
, isn't enough - it's the accessing that needs synchronizing. Cloning the collection would (probably) fix the issue if you only need read-access.
As well as looking at synchronizing, I suggest you track down which threads are involved - e.g. print out the call stack of the exception and/or use a debugger. It's better to really understand what's going on than to just synchronize and clone until the errors go away!
Upvotes: 1
Reputation: 10020
Basically, there are two separate issues here: sycnhronization and ConcurrentModificationException. Vector
in contrast to e.g. ArrayList
is synchronized internally so basic operation like add()
or get()
do not need synchronization. But you can get ConcurrentModificationException
even from a single thread if you are iterating over a Vector
and modify it in the meantime, e.g. by inserting an element. So, if you performed a modifying operation inside your for
loop, you could break the Vector
even with a single thread. Now, if you return your Vector
outside of your class, you don't prevent anyone from modifyuing it without proper synchronization in their code. Synchronization on fileList
in the original version of getFileList()
is pointless. Returning a copy instead of original could help, as could using a collection which allows modification while iterating, like CopyOnWriteArrayList
(but do note the additional cost of modifications, it may be a showstopper in some cases).
Upvotes: 5
Reputation: 875
Where does the m_dataStore get updated? That's a likely culprit if it's not synchronized.
Upvotes: 0