Reputation: 805
I want to search for all occurrences of a word within a given file, including multiple occurrences on the same line as well as occurrences on multiple lines. For example:
ABCCG*CAT*AD*CAT*TT
DFGBBB*CAT*YYUAB
Manually searching for the word 'CAT', I found two when using /CAT, when in fact there are three occurrences of that word in the file.
What is the command to find all occurrences of a given word in a file irrespective of the fact that it may occur multiple times within a line?
Note: There are no * in the file. I have used it in the example above to denote the positions of the string CAT.
What if the multiple occurrences were to overlap on the same line? For example:
ABCCG*TNTNT*ADCATDD
DFGBBB*TNT*YYUAB
Searching for the word TNT using :%s/TNT//gn would still give me 2, when in fact there are three occurrences.
Is there a way to identify overlapping occurrences in the same line using Vim?
Upvotes: 4
Views: 2940
Reputation: 88336
To get a count of the total number of all matches of an item, including "overlapping" string cases, you actually need to use the %s command (long form: %substitute) and tell it three things:
- not to actually substitute anything, only count (the n flag; in this case, a mnemonic for "noop" I guess)
- to count every match on a line, not just the first one (the g flag for "global")
- to use non-greedy matching (\{-}; somewhat arcane but worth reading up on; see below)
Putting all that together, here's what it looks like:
:%s/[T]\{-}NT//gn
So, given the following text from the question:
ABCCG*TNTNT*ADCATDD
DFGBBB*TNT*YYUAB
…vim will then report this:
3 matches on 2 lines
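To see why a plain search reports 2 and the adjusted pattern reports 3, here is a rough sketch of the same counting done outside Vim with Python's re module. It is only a model of the behaviour, not how Vim works internally. The sample text is the question's example with the * markers removed, and `T*?NT` is an assumed Python spelling of `[T]\{-}NT`.

```python
import re

# The question's sample text, without the * position markers.
text = "ABCCGTNTNTADCATDD\nDFGBBBTNTYYUAB"

# Plain search: each match consumes its characters, so the second
# TNT inside TNTNT (which shares a T with the first) is skipped.
plain = len(re.findall(r"TNT", text))

# Zero-width lookahead: matches at every position where TNT begins,
# consuming nothing, so overlapping occurrences are all counted.
overlapping = len(re.findall(r"(?=TNT)", text))

# Assumed Python equivalent of [T]\{-}NT: optional leading T's, then NT.
answer_style = len(re.findall(r"T*?NT", text))

print(plain, overlapping, answer_style)  # 2 3 3
```

One caveat the sketch makes visible: the `T*?NT` form also matches a bare NT that has no T in front of it, so it can over-count on other inputs, while the lookahead form counts only full TNT occurrences. Vim has its own zero-width lookahead, \@= (see :help /\@=), if you want to try that route inside Vim.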
If/when you do actually want a count of just the number of matching lines, you can omit the g flag, and vim will use its default of reporting a count of just the number of lines that contain a match. And if you don't want to count "overlapping" strings, then omit the \{-} part.
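The difference between counting with and without g can be modelled roughly like this in Python, using the CAT example from the question (again without the * markers). This is only an illustration of "every match" versus "at most one match per line", not of Vim's implementation.

```python
import re

# The question's CAT example, one string per line of the file.
lines = ["ABCCGCATADCATTT", "DFGBBBCATYYUAB"]

# With g: every match on every line is counted.
total_matches = sum(len(re.findall(r"CAT", line)) for line in lines)

# Without g: at most one match per line, i.e. the number of matching lines.
matching_lines = sum(1 for line in lines if re.search(r"CAT", line))

print(total_matches, matching_lines)  # 3 2
```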
The vim docs actually have very good info about this stuff.
For more help on counting items in vim, see :help count-items:
Counting words, lines, etc. count-items
To count how often any pattern occurs in the current buffer use the substitute
command and add the 'n' flag to avoid the substitution. The reported number
of substitutions is the number of items. Examples:
    :%s/./&/gn          characters
    :%s/\i\+/&/gn       words
    :%s/^//n            lines
    :%s/the/&/gn        "the" anywhere
    :%s/\<the\>/&/gn    "the" as a word
You might want to reset 'hlsearch' or do ":nohlsearch".
Add the 'e' flag if you don't want an error when there are no matches.
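If it helps to check what each of those patterns counts, here is an approximate Python model of the five examples. The sample text is made up for illustration, and `\w+` is only a stand-in for Vim's `\i\+` (identifier characters), so results can differ on other text.

```python
import re

# Hypothetical two-line sample, chosen so "the" also appears inside words.
text = "the other day\nthe cat bathed there"

counts = {
    "characters": len(re.findall(r".", text)),        # . skips newlines
    "words": len(re.findall(r"\w+", text)),           # approximation of \i\+
    "lines": len(text.splitlines()),
    "the anywhere": len(re.findall(r"the", text)),    # also hits other, bathed, there
    "the as a word": len(re.findall(r"\bthe\b", text)),
}

print(counts)
```

On this sample, "the" anywhere is counted 5 times but only 2 times as a whole word, which is the distinction the last two examples in the help text draw.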
And for more help with doing "non-greedy" matching, see :help non-greedy:
non-greedy
If a "-" appears immediately after the "{", then a shortest match
first algorithm is used (see example below). In particular, "\{-}" is
the same as "*" but uses the shortest match first algorithm. BUT: A
match that starts earlier is preferred over a shorter match: "a\{-}b"
matches "aaab" in "xaaab".
    Example             matches
    ab\{2,3}c           "abbc" or "abbbc"
    a\{5}               "aaaaa"
    ab\{2,}c            "abbc", "abbbc", "abbbbc", etc.
    ab\{,3}c            "ac", "abc", "abbc" or "abbbc"
    a[bc]\{3}d          "abbbd", "abbcd", "acbcd", "acccd", etc.
    a\(bc\)\{1,2}d      "abcd" or "abcbcd"
    a[bc]\{-}[cd]       "abc" in "abcd"
    a[bc]*[cd]          "abcd" in "abcd"
The } may optionally be preceded with a backslash: \{n,m\}.
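The shortest-match rule, and the "a match that starts earlier is preferred" caveat, can be checked with Python's lazy quantifier `*?`, which behaves like `\{-}` for these particular cases. This is a sketch under that assumption, and the two regex dialects differ in other respects.

```python
import re

# Non-greedy: stop at the first character that satisfies [cd].
lazy = re.search(r"a[bc]*?[cd]", "abcd").group()

# Greedy: [bc]* takes as much as it can before [cd] matches.
greedy = re.search(r"a[bc]*[cd]", "abcd").group()

# An earlier start beats a shorter match: the match begins at the
# first a, so the result is "aaab" rather than just "b".
earliest = re.search(r"a*?b", "xaaab").group()

print(lazy, greedy, earliest)  # abc abcd aaab
```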
Upvotes: 4