Reputation: 381
I am following the steps mentioned here - http://www.nltk.org/book/ch10.html to load and parse data using a cfg file. When I use the code below I don't face any issue.
from nltk import load_parser

cp = load_parser('grammars/book_grammars/sql0.fcfg')
query = 'What cities are located in China'
trees = list(cp.parse(query.split()))
answer = trees[0].label()['SEM']
answer = [s for s in answer if s]  # drop the empty SEM strings
q = ' '.join(answer)
print(q)
What I wish to do is take out the sql0.fcfg, make changes to it and load it into the parser again to test it with my own sentences. It is here that I run into issues.
I copied the contents of the sql0.fcfg file into a text file on my local system and renamed it with a .cfg extension, but when I load it as shown below I get an error mentioning nltk.download('C:').
cp = load_parser('C:/Users/212757677/Desktop/mygrammar.fcfg')
The second method I tried was to copy the grammar out of the fcfg file and load it from a string, as shown below. Here I get the error 'Unable to parse line 2. Expected an arrow'.
import nltk
groucho_grammar = nltk.CFG.fromstring("""
S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]
VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]
NP[SEM='Country="greece"'] -> 'Greece'
NP[SEM='Country="china"'] -> 'China'
Det[SEM='SELECT'] -> 'Which' | 'What'
N[SEM='City FROM city_table'] -> 'cities'
IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'
""")
cp = load_parser(groucho_grammar)
query = 'What cities are located in China'
trees = list(cp.parse(query.split()))
answer = trees[0].label()['SEM']
answer = [s for s in answer if s]
q = ' '.join(answer)
print(q)
ValueError: Unable to parse line 2: S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]
Expected an arrow
I just want to edit the existing grammar in sql0.fcfg and parse with it. Can someone suggest how to go about this?
Upvotes: 2
Views: 1435
Reputation: 241791
The prototype for nltk.load_parser is
nltk.load_parser(grammar_url, trace=0, parser=None, chart_class=None, beam_size=0, **load_args)
Note that the first argument is a "url", not just a file path (see the data module documentation for a very brief explanation). An nltk URL starts with a protocol followed by a colon, so it will interpret C: as a protocol. You should probably be explicit: file:C:/Users/212757677/Desktop/mygrammar.fcfg. (Or perhaps it's file:///C:/Users/212757677/Desktop/mygrammar.fcfg; I don't have a Windows machine to test it on.)
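As a rough sketch of that advice, a small helper can build the file: URL before it is handed to load_parser. The names to_file_url and load_local_parser are made up for illustration, and the file: form used here is the first of the two guesses above, not something verified on Windows; if it fails, the file:/// variant is the next thing to try.

```python
def to_file_url(path):
    """Prefix a local path with the file: protocol, using forward slashes."""
    return "file:" + str(path).replace("\\", "/")


def load_local_parser(path):
    """Load a grammar from a local file (hypothetical helper, needs nltk)."""
    # Imported lazily so that to_file_url can be used without nltk installed.
    from nltk import load_parser
    return load_parser(to_file_url(path))


# Show the URL that would be passed to load_parser.
print(to_file_url(r"C:\Users\212757677\Desktop\mygrammar.fcfg"))
```

Keeping the .fcfg extension on the file matters as well, since load_parser picks the grammar format from it.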
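nltk.load_parser guesses the grammar format based on the filename extension. In this case, you're loading a feature grammar (.fcfg), not a simple CFG. That is why nltk.CFG.fromstring stops at the first bracketed feature with "Expected an arrow": it does not understand the S[SEM=...] syntax. If you want to create the parser manually, you should follow the example in the NLTK how-to on feature grammar parsing.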
Upvotes: 1