Reputation: 353
I have a TCL script that reads some data from another file:
#!/usr/bin/tclsh
set config_params [open "file.vh" r]
set file_data [read $config_params]
close $config_params
In the file that I'm reading (file.vh) there are a few entire lines that are commented out, for example:
//---------------------
// Comment
// more comment
// --------------------
// more comment
My question is: how do I get rid of all the lines that start with // and also remove the empty lines? I have used regsub as follows:
regsub -all -line "//" $file_data "" file_data
but this just removes the // at the beginning of the line; the rest of the comment is still there.
Upvotes: 0
Views: 3329
Reputation: 13252
This code:
set int [interp create]
$int eval {namespace delete ::}
$int alias // apply {args {}}
$int alias unknown apply {args {puts $args}}
followed by the invocation
$int eval $file_data
prints the contents of the file with empty lines and lines starting with "//" removed (but see below).
If you want the output to go to a file, use this:
set f [open whatever w]
$int alias unknown apply {{f args} {puts $f $args}} $f
This code works by creating a slave interpreter, removing all commands from it, and then giving it two commands: the // command is executed by lines beginning with // (but not lines beginning with //--- ...) and does nothing, while the unknown command is executed by all other lines and prints the line.
The most serious limitation is that each line is parsed according to Tcl command syntax, so braces and brackets need to be balanced, brackets trigger command substitution, $ triggers variable substitution, etc. For text data with lines commented out it could well work, but for C++ source it probably won't.
This is better handled by a good text editor, however. E.g. g/\v(^\s*\/\/|^$)/d in Vim.
Documentation: apply, interp, namespace, open, puts, set, unknown (object method), unknown
Upvotes: 0
Reputation: 353
Managed to do it as:
regsub -all -line "//(.*)\n" $file_data "" file_data
Upvotes: 0
Reputation: 137557
tl;dr: Use this RE (in braces because of Tcl metacharacters): {(?:^[ \t]*|//.*)(?:\n|\Z)} and consider whether this is the right approach.
OK, you're in line-matching mode, so to actually remove the line the pattern needs to include \n to match the EOL marker. It also needs \Z (which is like a super-$ for line-matching mode) as an alternative, in case the final line isn't terminated. Then, to match the rest of the line, there are two possible cases: either a line that has nothing but whitespace on it from start to end, or // followed by any characters (except newline; we're in line-matching mode). Some non-capturing groups and alternation wrap the whole thing up.
set data "abc
def
// ghi
jk
// lm
nopq"
puts [regsub -all -line {(?:^[ \t]*|//.*)(?:\n|\Z)} $data ""]
That produces this output:
abc
def
jk
nopq
I'm not sure that the RE is what you really want (you probably want to be more selective about removal rules for comment lines) but it works with a reasonable sample case. In my own code, I'd probably process the text by splitting into lines first and using much simpler REs; it would be more obviously correct to me rather than using a complicated RE where I'd have to think more each time about whether it is doing the right thing. RE monsters are usually a bad idea, maintainability-wise.
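A minimal sketch of that line-by-line alternative, for illustration only. The proc name strip_comment_lines and the sample data are made up here, and it assumes a comment line is any line whose first non-blank characters are //:

```tcl
# Hypothetical helper: drop blank lines and whole-line // comments.
proc strip_comment_lines {text} {
    set kept {}
    foreach line [split $text "\n"] {
        # Skip lines that are empty or contain only whitespace.
        if {[string trim $line] eq ""} continue
        # Skip lines whose first non-blank characters are "//".
        if {[regexp {^\s*//} $line]} continue
        lappend kept $line
    }
    return [join $kept "\n"]
}

# Made-up sample input: a blank line, two comment lines, and one
# line of code with a trailing comment.
set data "abc\n\n  def\n// ghi\njk\n   // lm\nx = 1; // trailing comment\nnopq"
puts [strip_comment_lines $data]
```

Each test is simple enough to check by eye, and a line with a trailing comment (such as the x = 1; line above) is kept intact, which the single-RE version would not do.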
Upvotes: 1