Reputation: 1806
I have a number of strings that I would like to search for and reformat in a file. I'm using gsed v4.7 on macOS 10.14.6 to do this. My goal is to break the strings up into capture groups so that I can then reformat them using backreferences.
Here is a single example of a candidate being transformed:
vib.h.p.a#3.synt 8
would become
vib.h.p.a#3.8.synt
...note that the number 8 is removed from the end and spliced between #3 and synt, separated by dots.
Here is a list of candidates:
vib.h.p.f2.synt 4
vib.h.p.g#2.synt 7
vib.h.p.a#3.synt 8
If you look at the components of this exemplary string, they can be broken down into groups fairly easily.
I cannot find a way to formalize this into an expression that matches the needs of gsed.
Here is what I have tried:
gsed -r 's/(vib\.+)\.(.+)\s(\d)/\1.\3.\2/g' myfile.txt
gsed -r 's/vib\.(.*)\.(.*)\s(\d)/vib.\1\3\2/g' myfile.txt
gsed -r 's/(vib\..*)\.(.*)\s(\d)/\1.\3.\2/g' myfile.txt
I know that I'm missing something critical, possibly a way to lookahead negatively? My intuition tells me that I am close to a solution, although I've given up for the night.
EDIT 12/16/19 - The answer below by @Wiktor suggested a command like
gsed -r 's/(vib.+)\.(.+)[[:blank:]]+([0-9]+)/\1.\3.\2/g' myfile.txt
This does not print the desired transformation on my machine. Instead, it prints the original text without any substitutions, as it is not matching successfully. I am unable to test on another machine, so I do not know if this is the correct answer, but I have tried all variants suggested, including using [[:space:]], [[:blank:]], [0-9], and + instead of *. If anyone can help I would appreciate it.
Upvotes: 1
Views: 142
Reputation: 46856
This seems like a simple one to me. What am I missing?
echo "vib.h.p.f2.synt 4" | sed -E 's/(.*[0-9]+)(\.[^0-9]+) ([0-9]+)$/\1.\3\2/g'
vib.h.p.f2.4.synt
Note that this was done with stock sed in macOS, where -E gets you ERE.
Note also that this could be done using character classes, like this:
... sed -E 's/(.*[[:digit:]]+)(\.[^[:digit:]]+) ([[:digit:]]+)$/\1.\3\2/g'
But if you need to use character classes, you probably already know that. :)
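As a quick check, the same expression can be run over all three candidates from the question. This is only a sketch: it assumes each line has exactly one space before the trailing number and no trailing whitespace or carriage return, and the function name reformat_candidates is made up for illustration.

```shell
# Candidates copied from the question, one per line.
candidates='vib.h.p.f2.synt 4
vib.h.p.g#2.synt 7
vib.h.p.a#3.synt 8'

# Hypothetical helper wrapping the ERE from this answer:
#   \1 = everything up to the last digit run before the final dotted word
#   \2 = the final dotted word, including its leading dot (e.g. ".synt")
#   \3 = the trailing number
reformat_candidates() {
  sed -E 's/(.*[0-9]+)(\.[^0-9]+) ([0-9]+)$/\1.\3\2/'
}

printf '%s\n' "$candidates" | reformat_candidates
# vib.h.p.f2.4.synt
# vib.h.p.g#2.7.synt
# vib.h.p.a#3.8.synt
```

If a line comes back unchanged, the single-space assumption is the first thing worth checking in the input file.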
Upvotes: 0
Reputation: 9619
Use this regex:
([.#0-9a-zA-Z]+\.)(\S*)\s+([0-9]+)
and replace with $1$3.$2
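The pattern above is written in a generic regex dialect. In sed the backreferences are spelled \1, \2, \3 rather than $1, $2, $3, and \S and \s are not part of POSIX ERE, so a portable spelling would swap them for bracket expressions. A minimal sketch of that translation follows; the function name splice_number is an assumption made for the example.

```shell
# Hypothetical helper: the regex from this answer rewritten for sed -E.
#   \S  becomes [^[:space:]]
#   \s  becomes [[:space:]]
#   $1$3.$2 becomes \1\3.\2
splice_number() {
  sed -E 's/([.#0-9a-zA-Z]+\.)([^[:space:]]*)[[:space:]]+([0-9]+)/\1\3.\2/'
}

printf '%s\n' 'vib.h.p.f2.synt 4' 'vib.h.p.g#2.synt 7' 'vib.h.p.a#3.synt 8' | splice_number
# vib.h.p.f2.4.synt
# vib.h.p.g#2.7.synt
# vib.h.p.a#3.8.synt
```

The first group is greedy, so it backtracks to the last dot before the final word; that dot stays in \1, which is why the replacement only needs one extra dot between \3 and \2.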
Upvotes: 1
Reputation: 1806
I think I finally found something that does the replacement I was hoping for.
gsed -r 's/(vib.\w.)(\w+.(\w[0-9]|\w\#[0-9]).)(\w+)\s([0-9])/\1\2\5.\4/g' myfile.txt
This works for my needs, but there is probably a far more elegant way. I'm including the text I used as a test here, in the event that someone can figure out what a better solution would be.
Upvotes: 0
Reputation: 627082
You may use
gsed -r 's/(vib.+)\.(.+)[[:blank:]]+([0-9]+)/\1.\3.\2/g' myfile.txt
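To see the difference between \.+ and .+ in isolation, the two forms can be compared on one sample line. This sketch uses sed -E with POSIX classes so that it should behave the same under gsed and the stock macOS sed; the variable names are made up for the example.

```shell
line='vib.h.p.a#3.synt 8'

# With (vib\.+) the pattern needs "vib" followed by one or more literal dots
# and then yet another dot, i.e. "vib..", so nothing matches and sed prints
# the line unchanged.
unchanged=$(printf '%s\n' "$line" | sed -E 's/(vib\.+)\.(.+)[[:blank:]]+([0-9]+)/\1.\3.\2/')

# With (vib.+) the group takes any characters and backtracks to the last dot
# that is still followed by a word, blanks, and a number.
changed=$(printf '%s\n' "$line" | sed -E 's/(vib.+)\.(.+)[[:blank:]]+([0-9]+)/\1.\3.\2/')

printf '%s\n' "$unchanged" "$changed"
# vib.h.p.a#3.synt 8
# vib.h.p.a#3.8.synt
```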
The main points:
- \.+ matches one or more dots, not any one or more chars, hence you need to remove the backslash
- \d and \s are not quite portable and thus it makes sense to replace \d with [0-9] and \s with a space or [[:blank:]]+ (since you use the -r option, the POSIX ERE syntax will treat + as a one or more occurrences quantifier).
Upvotes: 0