Reputation: 17480
I have the following specs:
it "parses a document with only an expression" do
  puts parser.document.should parse("[b]Hello World[/b]")
end
it "parses a document with only text" do
  puts parser.document.should parse(" Hello World")
end
it "parses a document with both an expression and text" do
  puts parser.document.should parse("[b]Hello World[/b] Yes hello")
end
For the following Parslet parser:
class Parser < Parslet::Parser
  rule(:open_tag) do
    parslet = str('[')
    parslet = parslet >> (str(']').absent? >> match("[a-zA-Z]")).repeat(1).as(:open_tag_name)
    parslet = parslet >> str(']')
    parslet
  end
  rule(:close_tag) do
    parslet = str('[/')
    parslet = parslet >> (str(']').absent? >> match("[a-zA-Z]")).repeat(1).as(:close_tag_name)
    parslet = parslet >> str(']')
    parslet
  end
  rule(:text) { any.repeat(1).as(:text) }
  rule(:expression) do
    # [b]Hello World[/b]
    # open tag, any text up until closing tag, closing tag
    open_tag.present?
    close_tag.present?
    parslet = open_tag >> match("[a-zA-Z\s?]").repeat(1).as(:enclosed_text) >> close_tag
    parslet
  end
  rule(:document) do
    expression | text
  end
end
The first two tests pass just fine, and I can see by puts-ing them out to the command line that the atoms are of the correct type. However, when I try to parse a document with both an expression and plain text, it fails to parse the plain text, with the following error:
Parslet::UnconsumedInput: Don't know what to do with " Yes hello" at line 1 char 19.
I think I'm missing something in how I define the :document rule. What I want is something that will consume any number of expressions and plain-text runs in sequence. The rule I have will consume each atom individually, but using both in the same string causes a failure.
Upvotes: 2
Views: 384
Reputation: 21548
What you were looking for is something like this...
require 'parslet'
class ExampleParser < Parslet::Parser
  rule(:open_tag) do
    str('[') >>
      match["a-zA-Z"].repeat(1).as(:open_tag_name) >>
      str(']')
  end
The open_tag rule doesn't need to exclude the ']' character as the match only allows letters.
  rule(:close_tag) do
    str('[/') >>
      match["a-zA-Z"].repeat(1).as(:close_tag_name) >>
      str(']')
  end
The same applies here: close_tag doesn't need to exclude ']' either.
  rule(:text) do
    (open_tag.absent? >>
      close_tag.absent? >>
      any).repeat(1).as(:text)
  end
If you exclude the open and close tags here, you know you are only dealing with text. Note: I like this technique of using "any" once you have excluded the things you don't want, but bear it in mind if you refactor later, as your exclusion list may need to grow. Note 2: You could simplify this further, as below..
  rule(:text) do
    (str('[').absent? >> any).repeat(1).as(:text)
  end
.. if you don't want any square brackets in your text at all.
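To make the difference between those two text rules concrete, here is a rough analogy in plain Ruby regular expressions rather than Parslet. The constant names are made up for this illustration, and the patterns only approximate the rules above: a negative lookahead stands in for absent?, and the m flag is there because Parslet's any also matches newlines.

```ruby
# Hypothetical regex analogues of the two text rules (not Parslet code).

# (open_tag.absent? >> close_tag.absent? >> any).repeat(1)
# Stops only in front of a complete open or close tag.
TAG_AWARE_TEXT = /\A(?:(?!\[\/?[a-zA-Z]+\]).)+/m

# (str('[').absent? >> any).repeat(1)
# Stops in front of any '[' at all.
BRACKET_FREE_TEXT = /\A(?:(?!\[).)+/m

sample = "2 [ 3 [b]x[/b]"

p sample[TAG_AWARE_TEXT]    # keeps the stray "[" as ordinary text
p sample[BRACKET_FREE_TEXT] # gives up at the first "["
```

So the first version tolerates a lone bracket in the text, while the simpler one treats every '[' as the start of something else.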
  rule(:expression) do
    # [b]Hello World[/b]
    open_tag >> text.as(:enclosed_text) >> close_tag
  end
This becomes much simpler, as the text can't include a close_tag.
  rule(:document) do
    (expression | text).repeat
  end
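I've added in the repeat you missed (as pointed out by matt). Without it, document matches a single expression or a single run of text and then stops, which is why " Yes hello" was left unconsumed.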
end
require 'rspec'
require 'parslet/rig/rspec'
describe 'example' do
  let(:parser) { ExampleParser.new }
  context 'document' do
    it "parses a document with only an expression" do
      parser.document.should parse("[b]Hello World[/b]")
    end
    it "parses a document with only text" do
      parser.document.should parse(" Hello World")
    end
    it "parses a document with both an expression and text" do
      parser.document.should parse("[b]Hello World[/b] Yes hello")
    end
  end
end
RSpec::Core::Runner.run([])
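If it helps to see what the repeat in the document rule is doing, here is a hand-rolled sketch of the same loop using StringScanner from Ruby's standard library. This is not Parslet and not how Parslet works internally. scan_document is a hypothetical helper, the hash keys just echo the capture names used above (the real Parslet tree is shaped differently), and the text pattern follows the simpler "no square brackets in text" variant.

```ruby
require 'strscan'

# Hypothetical plain-Ruby analogue of (expression | text).repeat:
# at each position try an expression first, then text, and keep
# going until the input is used up.
def scan_document(input)
  scanner = StringScanner.new(input)
  nodes = []
  until scanner.eos?
    if scanner.scan(/\[([a-zA-Z]+)\]([^\[]+)\[\/([a-zA-Z]+)\]/)
      # expression: open tag, bracket-free text, close tag
      nodes << { open_tag_name: scanner[1],
                 enclosed_text: scanner[2],
                 close_tag_name: scanner[3] }
    elsif scanner.scan(/[^\[]+/)
      # text: one or more characters that are not '['
      nodes << { text: scanner.matched }
    else
      # neither alternative matches at this position
      raise ArgumentError, "Don't know what to do with #{scanner.rest.inspect}"
    end
  end
  nodes
end

p scan_document("[b]Hello World[/b] Yes hello")
```

Drop the until loop and you are back to the original behaviour: one alternative matches, and whatever follows it is left over.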
Hope this gives you some tips on using Parslet. :)
Upvotes: 4
Reputation: 79783
For your document rule you want to use repeat:
rule(:document) do
  (expression | text).repeat
end
You’ll also need to change your text rule; currently, if it starts matching, it will consume everything, including any [ that should start a new expression. Something like this should work:
rule(:text) { match['^\['].repeat(1).as(:text) }
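As a rough illustration of why the original text rule is too greedy, the same two ideas can be written as plain Ruby regular expressions. This is an analogy rather than Parslet code, and the constant names are invented for the example: the first pattern mirrors any.repeat(1), the second mirrors match['^\['].repeat(1).

```ruby
# Hypothetical regex stand-ins for the two versions of the text rule.
GREEDY_TEXT      = /\A.+/m      # like any.repeat(1): takes everything
NON_BRACKET_TEXT = /\A[^\[]+/   # like match['^\['].repeat(1): stops at '['

input = "Hi [b]x[/b]"

p input[GREEDY_TEXT]      # swallows the tag along with the text
p input[NON_BRACKET_TEXT] # leaves "[b]x[/b]" for the expression rule
```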
Upvotes: 2