Todd Pierce

Reputation: 161

SBCL run-program (Stanford Parser) or Redirecting I/O in Unix

I am running into trouble spawning the Stanford Parser as a child process in SBCL lisp:

(defvar *p* (sb-ext:run-program "/usr/bin/java"
   (list     "-cp"
    "\"/home/todd/CoreNLP/*\""
    "-Xmx2g"
    "edu.stanford.nlp.pipeline.StanfordCoreNLP"
    "-annotators"
    "tokenize,ssplit,pos,lemma,ner,parse,dcoref"
    "-outputFormat"
    "text")
    :wait nil :input :stream :output :stream :error :output))

It looks like it kicks off the program but then the parser dies. I can't really express everything that's going on because this text window keeps formatting my text into something else. At any rate, this does not happen with other programs I try to run:

(defvar *g* (sb-ext:run-program "/usr/bin/gnuplot" nil
                                :wait nil
                                :input :stream
                                :output :stream
                                :error :output))

In this case, the program (gnuplot) keeps running.

I'm wondering if this is because it just takes so long for the Stanford Parser to start that Lisp gives up on it.

If anybody has any insights into that, I'd be thrilled. It would be the ideal way to talk to the Stanford Parser from within Lisp. Otherwise, I might have a perfectly valid workaround, which is to kick off the parser with its input coming from, and its output going to, named pipes in the filesystem. This must happen with the command line options above, since the program must be in interactive mode (the parser creates a different type of output if it is not in interactive mode).

This, though, moves a bit off-topic into a Unix question, so this is just if anybody is an expert:

Supposing I had an inpipe and an outpipe in the CoreNLP directory, what would be my command line to kick off the parser so that the pipes are connected to the program's stdin and stdout respectively? Are there any steps I can take (at that point) to make sure that I don't run into buffering problems later when I access the pipes from within a Lisp program?

Does anybody have any ideas on how to talk to the Stanford Parser from within lisp?

Any insights are appreciated, as always.

-Todd

Upvotes: 3

Views: 374

Answers (1)

anquegi

Reputation: 11522

I recommend using inferior-shell to execute commands from Common Lisp.

I had never used stanford-parser, so I installed it on my Mac with Homebrew. After that I can run it from the command line:

 2016-08-26 09:04:06 ☆ |ruby-2.2.3@laguna| Antonios-MBP in ~/learn/lisp/cl-l/stackoverflow/scripts
± |master ?:2 ✗| → lexparser.sh text.txt
[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
 done [0.6 sec].
Parsing file: text.txt
Parsing [sent. 1 len. 42]: The strongest rain ever recorded in India shut down the financial hub of Mumbai , snapped communication lines , closed airports and forced thousands of people to sleep in their offices or walk home during the night , officials said today .
(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)

Parsed file: text.txt [1 sentences].
Parsed 42 words in 1 sentences (18.00 wds/sec; 0.43 sents/sec).

This really just executes a shell script, which is essentially a java command:

 2016-08-26 09:04:24 ☆ |ruby-2.2.3@laguna| Antonios-MBP in ~/learn/lisp/cl-l/stackoverflow/scripts
± |master ?:2 ✗| → cat /usr/local/Cellar/stanford-parser/3.6.0/libexec/lexparser.sh
#!/usr/bin/env bash
#
# Runs the English PCFG parser on one or more files, printing trees only

if [ ! $# -ge 1 ]; then
  echo Usage: `basename $0` 'file(s)'
  echo
  exit
fi

scriptdir=`dirname $0`

java -mx150m -cp "$scriptdir/*:" edu.stanford.nlp.parser.lexparser.LexicalizedParser \
 -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz $*

Now I have everything I need to execute it from Common Lisp.

First, install inferior-shell with Quicklisp:

CL-USER> (ql:quickload 'inferior-shell)
To load "inferior-shell":
  Load 1 ASDF system:
    inferior-shell
; Loading "inferior-shell"

(INFERIOR-SHELL)

Then check that it works:

CL-USER> (inferior-shell:run/ss '(lexparser.sh))
"Usage: lexparser.sh file(s)
"
NIL
0

Perfect: it runs lexparser.sh and returns a string with the standard output, NIL for standard error, and 0 as the program's exit code.

Finally, prepare a text. I chose the sample from their website:
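If you want the three return values in separate variables, multiple-value-bind is the usual tool. The sketch below is an illustration only: it uses uiop:run-program (which ships with ASDF) instead of inferior-shell so that it runs without Quicklisp, it uses echo as a stand-in for lexparser.sh, and run-and-report is a hypothetical helper name, not part of either library.

```lisp
(require :asdf) ; UIOP is bundled with ASDF

(defun run-and-report (command)
  "Run COMMAND (a list of strings, no shell involved) and return a plist
with its standard output, standard error, and exit code."
  (multiple-value-bind (out err code)
      (uiop:run-program command
                        :output '(:string :stripped t) ; trim trailing newline
                        :error-output :string
                        :ignore-error-status t)        ; do not signal on non-zero exit
    (list :output out :error err :exit-code code)))

;; echo is only a placeholder for the real parser command.
(print (run-and-report '("echo" "hello")))
```

With :ignore-error-status t a failing command does not signal an error, so the caller can inspect the exit code and the error text itself.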

text.txt:

The strongest rain ever recorded in India shut down the financial hub of Mumbai, snapped communication lines, closed airports and forced thousands of people to sleep in their offices or walk home during the night, officials said today.

Then I execute it:

CL-USER> (inferior-shell:run/ss '(lexparser.sh text.txt))
"(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)
"
NIL
0

Or I can collect the results in a list:

CL-USER> (multiple-value-list (inferior-shell:run/ss '(lexparser.sh text.txt)))
("(ROOT
  (S
    (S
      (NP
        (NP (DT The) (JJS strongest) (NN rain))
        (VP
          (ADVP (RB ever))
          (VBN recorded)
          (PP (IN in)
            (NP (NNP India)))))
      (VP
        (VP (VBD shut)
          (PRT (RP down))
          (NP
            (NP (DT the) (JJ financial) (NN hub))
            (PP (IN of)
              (NP (NNP Mumbai)))))
        (, ,)
        (VP (VBD snapped)
          (NP (NN communication) (NNS lines)))
        (, ,)
        (VP (VBD closed)
          (NP (NNS airports)))
        (CC and)
        (VP (VBD forced)
          (NP
            (NP (NNS thousands))
            (PP (IN of)
              (NP (NNS people))))
          (S
            (VP (TO to)
              (VP
                (VP (VB sleep)
                  (PP (IN in)
                    (NP (PRP$ their) (NNS offices))))
                (CC or)
                (VP (VB walk)
                  (NP (NN home))
                  (PP (IN during)
                    (NP (DT the) (NN night))))))))))
    (, ,)
    (NP (NNS officials))
    (VP (VBD said)
      (NP (NN today)))
    (. .)))

det(rain-3, The-1)
amod(rain-3, strongest-2)
nsubj(shut-8, rain-3)
nsubj(snapped-16, rain-3)
nsubj(closed-20, rain-3)
nsubj(forced-23, rain-3)
advmod(recorded-5, ever-4)
acl(rain-3, recorded-5)
case(India-7, in-6)
nmod:in(recorded-5, India-7)
ccomp(said-40, shut-8)
compound:prt(shut-8, down-9)
det(hub-12, the-10)
amod(hub-12, financial-11)
dobj(shut-8, hub-12)
case(Mumbai-14, of-13)
nmod:of(hub-12, Mumbai-14)
conj:and(shut-8, snapped-16)
ccomp(said-40, snapped-16)
compound(lines-18, communication-17)
dobj(snapped-16, lines-18)
conj:and(shut-8, closed-20)
ccomp(said-40, closed-20)
dobj(closed-20, airports-21)
cc(shut-8, and-22)
conj:and(shut-8, forced-23)
ccomp(said-40, forced-23)
dobj(forced-23, thousands-24)
nsubj(sleep-28, thousands-24)
nsubj(walk-33, thousands-24)
case(people-26, of-25)
nmod:of(thousands-24, people-26)
mark(sleep-28, to-27)
xcomp(forced-23, sleep-28)
case(offices-31, in-29)
nmod:poss(offices-31, their-30)
nmod:in(sleep-28, offices-31)
cc(sleep-28, or-32)
xcomp(forced-23, walk-33)
conj:or(sleep-28, walk-33)
dobj(walk-33, home-34)
case(night-37, during-35)
det(night-37, the-36)
nmod:during(walk-33, night-37)
nsubj(said-40, officials-39)
root(ROOT-0, said-40)
nmod:tmod(said-40, today-41)
" NIL 0)

Remember that this program uses Java 8, and I'm using stanford-parser 3.6.0.
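About the original sb-ext:run-program call: I have not tested it against CoreNLP, so take this as a guess, but the escaped quotes around the classpath may be the reason java exits. sb-ext:run-program hands each list element to the program directly, without going through a shell, so nothing strips the quotes. The sketch below shows this with /bin/echo standing in for java; run-capturing is a hypothetical helper name, and the code is SBCL-only.

```lisp
;; SBCL-only sketch. /bin/echo stands in for /usr/bin/java.
(defun run-capturing (program args)
  "Run PROGRAM with ARGS (no shell involved), wait for it to finish, and
return its merged stdout+stderr as a string plus its exit code."
  (let* ((process nil)
         (text (with-output-to-string (out)
                 (setf process
                       (sb-ext:run-program program args
                                           :output out
                                           :error :output)))))
    (values (string-right-trim '(#\Newline) text)
            (sb-ext:process-exit-code process))))

;; The double quotes and the * reach the program verbatim:
(print (run-capturing "/bin/echo" (list "\"/home/todd/CoreNLP/*\"")))

;; Without the embedded quotes the argument is just the path pattern:
(print (run-capturing "/bin/echo" (list "/home/todd/CoreNLP/*")))
```

If that is what happens, passing "/home/todd/CoreNLP/*" without the inner quotes should be enough, since java expands a trailing /* in a classpath entry itself. In the original call :error :output merges stderr into the output stream, so reading from (sb-ext:process-output *p*) should show whatever message java printed before it died.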

Upvotes: 4
