Marsellus Wallace

Reputation: 18601

Java language-detection returning different probabilities given the same input

I'm using the Java language-detection library in the following way:

Detector detector = DetectorFactory.create(); //profiles are in the default location
detector.append("What language is this text?");
List<Language> languages = detector.getProbabilities();
Language mostProbable = languages.get(0);
System.out.println(mostProbable.lang + " - " + mostProbable.prob);

The prob value varies slightly from execution to execution given the exact same input. Is that "normal"? What does that depend on?

Upvotes: 1

Views: 336

Answers (1)

sdasdadas

Reputation: 25096

If the algorithm the library uses is not deterministic, the values may vary from one execution to the next.

For example, some algorithms need an initial seed to get started. In many cases this seed is chosen (pseudo-)randomly, and that choice can affect the final output.

EDIT: It looks like that library uses a Naive Bayes classifier, which can be implemented either deterministically or with a randomized step.
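To illustrate the point about seeds, here is a minimal, self-contained sketch. It does not use the language-detection library at all: `SeedDemo` and `estimate` are hypothetical names, and the "estimator" is just an average of a few noisy samples around 0.9, standing in for any computation that has a random step. The only thing it is meant to show is that an unseeded `java.util.Random` gives slightly different results on each run, while a fixed seed gives the same result every time.

```java
import java.util.Random;

public class SeedDemo {
    // Hypothetical randomized estimator (an assumption for illustration only):
    // averages a handful of noisy samples centred on 0.9.
    static double estimate(Random rng) {
        int trials = 7;
        double sum = 0.0;
        for (int i = 0; i < trials; i++) {
            // Small Gaussian noise stands in for whatever random step
            // a real algorithm might perform.
            sum += 0.9 + rng.nextGaussian() * 0.01;
        }
        return sum / trials;
    }

    public static void main(String[] args) {
        // Unseeded generators: the two values will almost certainly differ,
        // both from each other and between program runs.
        System.out.println("unseeded: " + estimate(new Random()));
        System.out.println("unseeded: " + estimate(new Random()));

        // Same fixed seed: the two values are identical on every run.
        System.out.println("seeded:   " + estimate(new Random(42L)));
        System.out.println("seeded:   " + estimate(new Random(42L)));
    }
}
```

If the library in question lets you fix its seed, doing so before creating the detector should make the probabilities repeatable. I believe language-detection offers `DetectorFactory.setSeed(long)` for this, but treat that as an assumption and check it against the version you are using.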

Upvotes: 2
