jojo
jojo

Reputation: 333

How to Multi-threading this scenario of problem?

I would like to make a simulation of a distributed system, in which, I should make a research for information(supplies) in a distributed (parallel if I could!!) way, for example I have the following class:

public class Group{
public int identifier;
public int[] members;
public String name; 
public String[] supplies;
public int[] neighbors;}

There are many groups, each one has a name and consists of a list of members, neighbors and supplies, each member has some information and list to other groups that may contain pertinent information and supplies, and so on.

1- I want to make a research for supplies, First: inside one group, if I do not find the required supply, I should make a search inside all groups which are neighbors to the this group, I think to make this using Multi-threading, I mean, if the search was failed I should make a search inside all the other neighbors at the same time using multiple threads, each one take in consideration one neighbor, If I have 10 neighbors then 10 threads should be created....

2- Now, if I want to begin the re-search at the first time with several groups, I mean to begin with 3 or 4 groups or more, each one look for a different supply, or the same.... + one group which invoke the search could be a neighbor for another group...

So, my question is How to achieve this scenario using threads ?

PS.I have a machine with a single processor with one core, and I do not care about a time of execution (the overhead), all I want is to simulate this problem...

Thanks for every response, and best regards.

Upvotes: 1

Views: 1060

Answers (3)

Briguy37
Briguy37

Reputation: 8412

I don't see any performance advantage to multi-threading this on a single CPU machine. This is because only 1 thread will be able to run at a time and there will be switching time between threads, thus it will probably actually take more time to find a group with the desired resource.

Personally, I'd just iterate through the first group's neighbors and check them for resources. Then, if the resources were not found, I'd call the search on each of the sub-groups, passing in the list of groups that were already checked, so it can skip groups that have already been checked. Something like:

public Group searchForGroupWithResource(Resource resource){
    List<Group> groupsToCheck = new ArrayList<Group>();
    groupsToCheck.add(this);
    int currentIndex = 0;
    while(currentIndex < groupsToCheck.size()){
        Group currentGroup = groupsToCheck.get(currentIndex);
        if(currentGroup.hasResource(resource)){
            return currentGroup;
        }
        groupsToCheck.addAll(currentGroup.getNeighbors(groupsToCheck));
        currentIndex++;
    }
    return null;
}

public List<Group> getNeighbors(List<Group> excludeGroups){
    //Get non-excluded neighbors
    List<Group> returnNeighbors = new ArrayList<Group>();
    for(Group neighbor : neighbors){
        boolean includeGroup = true;
        for(Group excludeGroup : excludeGroups){
            if(excludeGroup.equals(neighbor)){
                includeGroup = false;
                break;
            }
        }
        if(includeGroup){
            returnNeighbors.add(neighbor);
        }
    }
    return returnNeighbors;
}

Note: If you still decide to go for the multi-threading, I would suggest a common object that stores information about the search that is accessible to all threads. This would specify the Groups that were checked (so you don't check the same group twice) and whether the required supplies were found (so you can stop checking resources).

Upvotes: 1

Gangadhar
Gangadhar

Reputation: 1903

I could not understand the second requirement, but for the first, here is a possible approach. Before that though, technically your process is not completely CPU bound, it is also I/O bound (network) too. So, please don't assume that making it muti-threaded will provide the required speedup you are looking for. I am assuming that your development environment is uni-processor and single core, but your deployment environment may not be.

Back to the suggestion. I would create a GroupPool class that has a pool of threads that can go scout for information. The number of threads will be configurable via a runtime config parameter. You can create a factory class which reads this parameter from a config file and creates a pool of runnable objects.

Each of these objects represent one connection to the neighboring node. [TODO] You did not mention if you'd like to recurse on the supplier nodes i.e. if you don't find the information in a supplier node, do you want to search the supplier, the supplier's suppliers etc. If so, you will have the problem of cycle detection. Once these thread objects scout for information and find it, they update a semaphore on the factory object (you might want to move this to a separate object as that will be a better design), and also send the supplier id (see, a separate object does make sense)

You can have a listener for this modified semaphore and as soon as the value changes, you know you found your information and get the supplier id from that object. Once you get your information, you can send a notification to the thread-pool to shutdown the runnable objects as you already found your information.

Based on whether you are looking for a binary answer (find data and any supplier is ok) and if you want to recurse, the complexity of the above will increase.

Hope this helps in you trying to design the structure for your problem.

Upvotes: 1

Peter Lawrey
Peter Lawrey

Reputation: 533870

Since you have a CPU bound problem, the optimal number of threads to use is likely to be the number of cores you have. I would ensure each thread has about 100 micro-seconds of work or you could find you have more over head than useful work. e.g. you might find that searching 10K nodes is about 100 us work. If you are not careful, a multi-threaded application can be many times slower than a single threaded one.

So I would find a way to divide up the work so you have about 1K to 100K nodes for each thread and limit your concurrency to the number of core you have.

Upvotes: 1

Related Questions