tastyminerals

Reputation: 6548

How to match a sentence in Lua

I am trying to create a regex which attempts to match a sentence.

Here is a snippet.

local utf8 = require 'lua-utf8'
function matchsent(text)
  local text = text
  for sent in utf8.gmatch(text, "[^\r\n]+\.[\r\n ]") do
    print(sent)
    print('-----')
  end
end

However, it does not work the way it would in Python, for example. I know that Lua uses a different pattern syntax and that its regex capabilities are limited, but why does the regex above give me a syntax error? And what would a sentence-matching regex look like in Lua?

Upvotes: 3

Views: 895

Answers (1)

Wiktor Stribiżew

Reputation: 627336

Note that Lua uses Lua patterns, which are not regular expressions in the strict sense: they lack alternation and cannot apply quantifiers to groups, so they cannot describe every regular language. They are poorly suited to splitting text into sentences, since you would need to account for abbreviations, spacing, capitalization, and so on. For that task you need an NLP package rather than one or two patterns.
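To illustrate the abbreviation problem, here is a minimal sketch of a naive splitter. The function name naive_sentences is made up for this example, and the pattern simply treats every ., ! or ? as the end of a sentence, which is an assumption that real text breaks quickly.

```lua
-- Hypothetical helper: a "sentence" is a run of non-terminator characters
-- followed by exactly one terminator (., ! or ?).
function naive_sentences(text)
  local out = {}
  for sent in string.gmatch(text, "[^%.!?]+[%.!?]") do
    -- Drop the whitespace left over from the previous sentence.
    -- The parentheses discard gsub's second return value (the match count).
    out[#out + 1] = (sent:gsub("^%s+", ""))
  end
  return out
end

-- "Dr." is cut off as a sentence of its own, so this prints three pieces.
for i, s in ipairs(naive_sentences("Dr. Smith arrived. He sat down.")) do
  print(i, s)
end
```

The abbreviation "Dr." is reported as a separate sentence here, and no single Lua pattern can tell it apart from a real sentence end.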

Regarding

why does the regex above give me a syntax error?

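In a Lua string literal, \. is not a valid escape sequence, so the string is rejected before the pattern is ever used. In Lua patterns, special characters are escaped with a % symbol rather than a backslash. See the example code: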

function matchsent(text)
    -- %. matches a literal dot; [\r\n ] requires a line break or space after it
    for sent in string.gmatch(text, '[^\r\n]+%.[\r\n ]') do
        print(sent)
        print("---")
    end
end
matchsent("Some text here.\nShow me")

An online demo
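As a small sketch of the % escaping rule, the snippet below compares an escaped and an unescaped dot and adds a helper that escapes all magic characters in a plain string. The helper name escape_pattern is an assumption made for this example; it is not part of the Lua standard library.

```lua
-- "%." matches a literal dot, while a bare "." matches any character.
print(string.find("a.b", "%."))  -- 2 2
print(string.find("a.b", "."))   -- 1 1

-- Hypothetical helper: prefix every magic character with % so the
-- string can be used as a literal inside a Lua pattern.
function escape_pattern(s)
  -- %0 in the replacement stands for the whole match; %% is a literal %.
  return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

print(escape_pattern("a.b"))  -- a%.b
```

When the search text is fixed, string.find(s, text, 1, true) performs a plain substring search and avoids escaping altogether.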

Upvotes: 2
