Using NLP to switch genders

Question

Basically I am writing a Java module that is supposed to take English text and switch the genders of the pronouns. So for example, if you give it "She put the box on the table" it would give you back "He put the box on the table." If you gave it "His feet hurt" it would give you back "Her feet hurt."

This is mostly easy, except for the his/hers case. Sometimes "his" should become "her," and sometimes it should become "hers."
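To make the problem concrete, here is a minimal sketch of the naive word-for-word approach. The class name `NaivePronounSwap` and the contents of the mapping table are just illustrative assumptions, not part of any library. It handles the unambiguous pronouns and deliberately leaves "his" and "her" untouched, because a plain lookup table has no way to choose between their two possible targets.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaivePronounSwap {
    // Only pronouns with a single possible target are listed.
    // "his" (her/hers) and "her" (him/his) are omitted on purpose.
    private static final Map<String, String> SWAPS = Map.of(
        "he", "she",
        "she", "he",
        "him", "her",
        "hers", "his",
        "himself", "herself",
        "herself", "himself");

    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    public static String swap(String text) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String word = m.group();
            String repl = SWAPS.get(word.toLowerCase(Locale.ROOT));
            if (repl == null) {
                repl = word; // not a known pronoun, keep as-is
            } else if (Character.isUpperCase(word.charAt(0))) {
                // Preserve sentence-initial capitalization.
                repl = Character.toUpperCase(repl.charAt(0)) + repl.substring(1);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, `NaivePronounSwap.swap("She put the box on the table")` comes back with "He" in place of "She", while "His feet hurt" and "The box was his." come back unchanged, which is exactly the gap I'm trying to close.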

I've been looking into NLP, which I know pretty much nothing about, and I tried out OpenNLP, but it's failing me (I can't use Stanford NLP because of the licensing issue). The POS tagger and the chunker get confused by his/hers, and so does the parser. For example:

The box was his.

(TOP (S (NP (DT The) (NN box)) (VP (VBD was) (NP (PRP$ his))) (. .)))

The box was hers.

(TOP (S (NP (DT The) (NN box)) (VP (VBD was) (ADJP (JJ hers))) (. .)))

The box was his box.

(TOP (S (NP (DT The) (NN box)) (VP (VBD was) (NP (PRP$ his) (NN box))) (. .)))

The box was her box.

(TOP (S (NP (DT The) (NN box)) (VP (VBD was) (NP (PRP$ her) (NN box))) (. .)))

It correctly identifies "hers" as an adjective phrase, but when "his" is used in the predicate in exactly the same way, it incorrectly identifies it as a possessive pronoun (PRP$), as if it were modifying some noun, as in the third and fourth examples.
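One workaround I can imagine is to ignore the tag assigned to "his" itself and look at the tag of the next token instead. The following is only a rough sketch under that assumption: `HisResolver`, `swapHis`, and the list of tag prefixes are hypothetical names and choices of mine, the tags are assumed to be Penn Treebank tags like the ones in the parses above, and the prefix list is certainly not exhaustive (an intervening adverb, for instance, would defeat it).

```java
import java.util.Set;

public class HisResolver {
    // Penn Treebank tag prefixes that suggest "his" is followed by the rest
    // of a noun phrase. Assumed list for illustration, not exhaustive.
    private static final Set<String> NOUN_PHRASE_STARTS = Set.of("NN", "JJ", "CD");

    /**
     * Picks the feminine replacement for "his" at position i,
     * given the POS tags of the whole sentence.
     */
    public static String swapHis(String[] tags, int i) {
        if (i + 1 < tags.length) {
            String next = tags[i + 1];
            for (String prefix : NOUN_PHRASE_STARTS) {
                if (next.startsWith(prefix)) {
                    return "her"; // "his box" -> "her box"
                }
            }
        }
        // Nothing noun-like follows, so "his" stands alone: "was his." -> "was hers."
        return "hers";
    }
}
```

For the tag sequence of "The box was his." (`DT NN VBD PRP$ .`), `swapHis(tags, 3)` gives "hers"; for "The box was his box." (`DT NN VBD PRP$ NN .`), it gives "her". It still depends on the tagger getting the neighboring tokens right, so it feels more like a patch than a fix.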

Is this just a training set issue? Would it be possible to create my own training set that handles this better, basically a set with tons of his/hers sentences?

Bonus points if you can tell me whether there's any way to use NLP to determine the antecedent of a pronoun. For example:

"Wanda gave a watch to a girl named Lucy.  She loved it."

My guess is this is pretty much impossible since this is sometimes even hard for humans.
