Reputation: 3022
I need advice on the best way to search with a regex in vim and extract any matches that are found.
I have a csv file that looks something like this:
Two fields:
id
description
0g98932,"long description sometimes containing numbers like 1234567, or 0000012345 and even BR00012345 but always containing text"
I need to search the description field on each row. If a number matching \d{10} exists in the second field, I want to pull it out.
Doing something like :% s/(\d{10})/^$1/g
gives me a
Pattern not found (\d{10}) error.
I've never learned how to grab and reference a match from a regex search in vim - so that's part of the problem.
The other part:
I would really like to either.
Upvotes: 2
Views: 3702
Reputation: 89073
The important thing to know about vim regexes is that different levels of escaping are required (as opposed to, say, regexes in Perl or Ruby).
From :help /\m
after:    \v       \m       \M       \V       matches
                   'magic'  'nomagic'
          $        $        $        \$       matches end-of-line
          .        .        \.       \.       matches any character
          *        *        \*       \*       any number of the previous atom
          ()       \(\)     \(\)     \(\)     grouping into an atom
          |        \|       \|       \|       separating alternatives
          \a       \a       \a       \a       alphabetic character
          \\       \\       \\       \\       literal backslash
          \.       \.       .        .        literal dot
          \{       {        {        {        literal '{'
          a        a        a        a        literal 'a'
The default setting is 'magic', so to make the regex you gave work, you'd have to use:
:%s/".*\(\d\{10}\).*"/\1/
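For comparison, here is a sketch of the same substitution written in the \v ("very magic") column of the table, which is closer to the Perl-style syntax used in the question. It assumes nothing beyond the command above; only the escaping changes.

```vim
" Same substitution as above, but with \v (very magic):
" parentheses and braces no longer need backslashes.
:%s/\v".*(\d{10}).*"/\1/
```

On the sample row from the question this should leave 0g98932,0000012345, because 1234567 and the digits in BR00012345 are both shorter than ten digits.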
If you want to delete everything other than the leading 7-character id and the matches (by which I assume you mean that you also want to delete lines without any match), use:
:v/^\([[:alnum:]]\{7}\),\s*".*\(\d\{10}\).*/d
:%s//\1,\2/
The :v/<pattern>/ command allows you to run a command on each line that doesn't match the given pattern, so this just deletes the non-matches. :s// reuses the prior pattern, so we don't have to specify it.
This transforms the following:
0g98932,"long description sometimes containing numbers like 0123456789"
0g98932,"long description no numbers"
0g98932,"long description no numbers"
0g98932,"long description sometimes containing numbers like 0123456789"
0g98932,"long description no numbers"
0g98932,"long description no numbers"
0g98932,"long description no numbers"
0g98932,"long description no numbers"
0g98932,"long description sometimes containing numbers like 0123456789"
0g98932,"long description no numbers"
0g98932,"long description no numbers"
0g98932,"long description sometimes containing numbers like 0123456789"
into this:
0g98932,0123456789
0g98932,0123456789
0g98932,0123456789
0g98932,0123456789
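Since :v deletes lines, it can be worth checking first which lines the pattern actually matches. The following is a minimal sketch of that, using the same pattern as above; it relies on :g being the counterpart of :v (it acts on matching lines) and on an empty pattern reusing the last search pattern.

```vim
" List the lines that match; these are the ones that will be kept.
:g/^\([[:alnum:]]\{7}\),\s*".*\(\d\{10}\).*/p
" Delete every line that does not match the pattern just used.
:v//d
" Rewrite the remaining lines as id,number.
:%s//\1,\2/
```

If the listing from the first command looks wrong, the pattern can be adjusted before anything is deleted, and u undoes the later steps if needed.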
Upvotes: 6
Reputation: 59844
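To capture a match, wrap the part you want in escaped parentheses:
\(pattern\)
To delete everything around it, match the surrounding text as well and replace the whole match with the captured group:
:%s/not_pattern\(pattern\)another_not_pattern/\1/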
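As an illustration, the template above could be filled in for the ten-digit case from the question roughly like this. The choice of ^.* and .*$ for the surrounding parts is an assumption made for the example, not something the template prescribes.

```vim
" not_pattern         -> ^.*      (anything before the number)
" pattern             -> \d\{10}  (ten digits, captured as \1)
" another_not_pattern -> .*$      (anything after the number)
:%s/^.*\(\d\{10}\).*$/\1/
```

Lines that contain a ten-digit run are reduced to just that number; lines without one are left unchanged.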
Upvotes: 3