Frances

Reputation: 13

Tcl - How to replace ? with -

(You'd think this would be easy, but I'm stumped.)

I'm converting an iOS note to a text file, and the note contains "0." and "?" whenever there is a list or bullet.

This was a bulleted list
? item 20
? Item 21
? Item 22

I'm having a lot of trouble replacing the "?".

I don't want to replace a legitimate question mark at the end of a sentence, but I want to replace the "?" bullets with "-" (preferably anywhere in the line, not just at the beginning).

I tried these searches, with no luck:

set line "? item 20"
set index_bullet [string first "(\s|\r|\n)(\?)" $line]
set index_bullet [string first "(!\w)(\?)" $line]
set index_bullet [string first ^\? $line]

This works, but it would match any question mark

set index_bullet [string first \? $line]

Does anyone know what I'm doing wrong? How do I find and replace only question mark bullets with a "-"?

Thank you very much in advance

Upvotes: 1

Views: 3152

Answers (3)

Donal Fellows

Reputation: 137587

If you want to replace a question mark according to a rule that a regular expression can describe, the regsub command is the right tool. (The string first command finds literal substrings only, and the string match command uses globbing rules.) In this case, we'll use the -all option so that every instance is replaced:

set line "? item 20"
set replaced [regsub -all {(\s|^)\?(\s)} $line {\1-\2}]
puts "'$line' --> '$replaced'"
# Prints: '? item 20' --> '- item 20'

The main trick to using regular expressions in Tcl is, as much as possible, to keep REs and their replacements in braces so that you can use Tcl metacharacters (e.g., backslash or square brackets) without having to fiddle around with extra escaping.

Also, \s by default will match a newline.
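As a rough illustration of the difference between the commands, here is a small sketch run against a multi-line note. The sample text is made up for the example, and it assumes each bullet is followed by whitespace:

```tcl
# Sample note text, invented for illustration.
set note "This was a bulleted list\n? item 20\n? Item 21\nIs this a question?"

# string first searches for a literal substring, so the RE syntax
# is not interpreted and nothing is found.
puts [string first {^\?} $note]   ;# -1

# regsub treats the pattern as a regular expression. Because \s matches
# a newline, bullets at the start of later lines are replaced too, while
# the question mark that ends the last sentence is left alone.
set replaced [regsub -all {(\s|^)\?(\s)} $note {\1-\2}]
puts $replaced
```

The final question mark survives because it is preceded by a letter rather than by whitespace or the start of the string.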

Upvotes: 1

Peter Lewerin

Reputation: 13252

It seems likely that a character used to indicate a list item is the first character on the line or the first character after optional whitespace. To match a question mark at the beginning of a line:

string match {\?*} $line

or

string match \\?* $line

The braces or doubled backslash keep the question mark from being treated as a string match metacharacter.
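A quick way to see why the escaping matters, as a minimal sketch with made-up sample lines:

```tcl
set plain  "item 20"
set bullet "? item 20"

# Unescaped, ? is a glob metacharacter that matches any single character,
# so this pattern matches every non-empty string.
set r1 [string match ?* $plain]      ;# 1

# Escaped, ? must be a literal question mark.
set r2 [string match {\?*} $plain]   ;# 0
set r3 [string match {\?*} $bullet]  ;# 1
set r4 [string match \\?* $bullet]   ;# 1 (same pattern, unbraced form)

puts "$r1 $r2 $r3 $r4"
```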

To find a question mark after optional whitespace:

string match {\?*} [string trimleft $line]

The command returns 1 if it finds a match, and 0 if it doesn't.

To do this with string first, use

if {[string first ? [string trimleft $line]] eq 0} ... 

but in that case, keep in mind that the index returned from string first isn't the true location of the question mark in the original line, because it is computed on the trimmed string. (Use == instead of eq if you have an older Tcl.)

When you have determined that the line contains a question mark in the first non-whitespace position, a simple

set line [regsub {\?} $line -]

will replace just the first question mark in the line, wherever it is.
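Putting the check and the substitution together might look like the sketch below. The proc name replaceBullet is hypothetical, chosen only for this example:

```tcl
# replaceBullet is a hypothetical helper name, not part of Tcl.
proc replaceBullet {line} {
    # Is the first non-whitespace character a question mark?
    if {[string match {\?*} [string trimleft $line]]} {
        # Without -all, regsub replaces only the first question mark,
        # which here is the bullet; any later ones are kept.
        return [regsub {\?} $line -]
    }
    # Not a bullet line: return it unchanged.
    return $line
}

puts [replaceBullet "  ? item 20"]          ;# "  - item 20"
puts [replaceBullet "Is this a question?"]  ;# unchanged
puts [replaceBullet "? why?"]               ;# "- why?"
```

Note that this only handles a bullet in the first non-whitespace position, so it does not cover bullets that appear later in a line.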

Documentation: regsub, string, Syntax of Tcl regular expressions

Upvotes: 1

Frances

Reputation: 13

I figured it out. I did it in two steps:

1) First find the "?"

 set index_bullet [string first "\?" $line]

2) Then filter out "?" that is not a bullet

 set index_question_mark [string first "\w\?" $line]

I have a solution, but please post if you have a better way of doing this. Thanks!
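One thing worth checking in step 2, assuming the intent was for \w to act as a regular-expression class: inside double quotes Tcl's backslash substitution turns "\w\?" into the two literal characters w?, and string first does not interpret regular expressions anyway. A small sketch with made-up sample lines, using regexp for comparison:

```tcl
set line1 "Did you know?"
set line2 "Is this a bullet?"

# "\w\?" becomes the literal string "w?" before string first sees it.
set a [string first "\w\?" $line1]   ;# 11, the position of "w?" in "know?"
set b [string first "\w\?" $line2]   ;# -1, there is no literal "w?"

# regexp with a braced pattern does treat \w as "any word character".
set c [regexp -indices {\w\?} $line2 where]   ;# 1, and where is "15 16"

puts "$a $b $c $where"
```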

Upvotes: 0
