hetsch

Reputation: 1568

Regex to match a specific pattern multiple times within a sentence

I have the following problem with a LaTeX text file that consists of multiple sentences, e.g.

Aaa \cref{fig:1}. Bbb \cref{fig:2} bbb \cref{fig:3}. Ccc \cref{fig:4}. Ddd \cref{fig:5} ddd \cref{fig:6} ddd \cref{fig:7}.

What I need to find out is how to isolate the \cref{fig:xxx} parts in each sentence. The catch is that the regex should only consider sentences in which \cref{fig:xxx} occurs more than once.

A good result would be if the regex could return fig:2 and fig:3 from the Bbb sentence, as well as fig:5, fig:6, and fig:7 from the Ddd sentence.

I have to use regular expressions for the search in TextMate (a text editor).

Upvotes: 2

Views: 1277

Answers (2)

Jan

Reputation: 43169

In addition to my comment, you could come up with a recursive approach. However, looking at the documentation, recursion does not seem to be supported in TextMate. In that case, you can simply repeat the pattern one more time (which satisfies your requirement of sentences with more than one occurrence):

(?:\\cref\{(fig:\d+)\})(?:[^.]+?(?:\\cref\{(fig:\d+)\}))+

Broken down, this looks for \cref{...} and captures the inner fig: plus digits, then lazily consumes one or more characters that are not a dot ([^.]+?) before requiring the first subpattern again; the outer + allows further repetitions within the same sentence. As already mentioned in the comments, you will likely need to play around with the sentence conditions (e.g. what counts as a sentence; this is the [^.] part). See a demo of the approach on regex101.com.
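As a rough illustration outside TextMate, the pattern above can be tried in Python's re module. This is only a sketch: the names SENTENCE_RUN, LABEL and labels_in_multi_ref_sentences are made up for this example, and the two-step extraction is an addition, not part of the answer. The second step is there because, in Python's re at least, a capture group inside a repeated group keeps only its last iteration, so the pattern alone would report fig:5 and fig:7 but not fig:6 for the Ddd sentence.

```python
import re

# The pattern from the answer, unchanged: one \cref{fig:N}, followed by one or
# more further \cref{fig:N}, each separated only by non-dot characters.
SENTENCE_RUN = re.compile(
    r"(?:\\cref\{(fig:\d+)\})(?:[^.]+?(?:\\cref\{(fig:\d+)\}))+"
)

# Hypothetical helper pattern: pulls every label out of a matched run.
LABEL = re.compile(r"\\cref\{(fig:\d+)\}")


def labels_in_multi_ref_sentences(text):
    """Return one list of labels per sentence that has more than one \\cref."""
    results = []
    for run in SENTENCE_RUN.finditer(text):
        # run.group(0) spans from the first to the last \cref in the sentence.
        results.append(LABEL.findall(run.group(0)))
    return results


text = (
    r"Aaa \cref{fig:1}. Bbb \cref{fig:2} bbb \cref{fig:3}. "
    r"Ccc \cref{fig:4}. Ddd \cref{fig:5} ddd \cref{fig:6} ddd \cref{fig:7}."
)

print(labels_in_multi_ref_sentences(text))
# [['fig:2', 'fig:3'], ['fig:5', 'fig:6', 'fig:7']]
```

Sentences with a single \cref (Aaa and Ccc) are skipped because the dot right after the reference cannot satisfy [^.]+?. Whether TextMate's engine exposes repeated captures the same way is an assumption worth checking in the editor itself.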

Upvotes: 1

bmbigbang

Reputation: 1378

What you need is a positive lookahead, e.g.:

\S*(?=\s*\\cref{)

Note: I'm not sure how escapes and symbols are entered in your text editor, so to be clear: the double backslash (\\) stands for a literal \ character, \s is a whitespace character, and \S is a non-whitespace character. To also return the fig label, you will need to introduce capture groups. This guide might help you: http://www.rexegg.com/regex-lookarounds.html#compound
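One possible reading of "introduce different groups", sketched in Python rather than TextMate: the word before each reference goes into one capture group, and the label is captured inside the lookahead itself. The pattern is a variation on the one above (it uses \S+ instead of \S* to avoid empty matches and escapes the brace), and the name WORD_BEFORE_CREF is hypothetical. It does not by itself restrict matches to sentences with more than one \cref.

```python
import re

# Group 1: the word directly before a reference.
# Group 2: the label, captured inside the lookahead so it is not consumed.
WORD_BEFORE_CREF = re.compile(r"(\S+)\s*(?=\\cref\{(fig:\d+)\})")

sentence = r"Bbb \cref{fig:2} bbb \cref{fig:3}."

pairs = WORD_BEFORE_CREF.findall(sentence)
print(pairs)
# [('Bbb', 'fig:2'), ('bbb', 'fig:3')]
```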

Upvotes: 1
