nick2k3

Reputation: 1429

How to extract text matching a regex using Vim?

I would like to extract some data from a piece of text with Vim. The input looks like so:

72" title="(168,72)" onmouseover="posizione('(168,72)');" onmouseout="posizione('(-,-)');">>
72" title="(180,72)" onmouseover="posizione('(180,72)');" onmouseout="posizione('(-,-)');">>
72" title="(192,72)" onmouseover="posizione('(192,72)');" onmouseout="posizione('(-,-)');">>
72" title="(204,72)" onmouseover="posizione('(204,72)');" onmouseout="posizione('(-,-)');">>

The data I need to extract is contained in the title="(168,72)" portions of the input. In particular, I am interested in extracting coordinate pairs in parentheses.

I thought about using Vim to first delete everything before title=", but I am not really a regex guru, so I am asking you. If anyone has any hint, please let me know! :)

Upvotes: 7

Views: 4911

Answers (4)

Magnun Leno

Reputation: 2738

This can be done more simply, in a few keystrokes, with the :normal command:

:%normal df(f)D

This means:

  1. % - apply the command to every line in the file;
  2. normal - execute the keys that follow as if typed in normal mode;
  3. df( - delete from the cursor through the first opening parenthesis (parenthesis included);
  4. f) - move the cursor to the next );
  5. D - delete from the cursor to the end of the line (this also removes the ) under the cursor).

You can also restrict the command to a range, for example lines 5 to 10:

:5,10normal df(f)D
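For readers who want to check what the keystrokes leave behind, here is a rough Python sketch of the same three steps applied to one line. The function name df_f_D is made up for illustration, and the sketch assumes the cursor starts in the first column, as it does under :normal.

```python
def df_f_D(line: str) -> str:
    """Mimic the keys df( f) D on one line, cursor starting in column 0."""
    # df( : f searches to the right of the cursor, then d deletes through the match.
    open_idx = line.find("(", 1)
    if open_idx == -1:
        return line  # the motion fails, so nothing is changed on this line
    rest = line[open_idx + 1:]

    # f) : move to the next ")" to the right of the cursor (now in column 0).
    close_idx = rest.find(")", 1)
    if close_idx == -1:
        return rest  # f) fails, so D is never executed

    # D : delete from the cursor to the end of the line, ")" included.
    return rest[:close_idx]


sample = '''72" title="(168,72)" onmouseover="posizione('(168,72)');" onmouseout="posizione('(-,-)');">>'''
print(df_f_D(sample))
```

On the sample line this leaves the bare pair 168,72, without the surrounding parentheses.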

Upvotes: 4

Prince Goulash

Reputation: 15715

If you want an ad hoc solution for this one-off case, it might be quicker simply to select a visual block using CTRL-v. This will let you select an arbitrary rectangular column of text (in your case, the column containing title="(X,Y)"), which can then be copied as usual using y. This works here because the coordinates occupy the same columns on every line.

Upvotes: 3

You can match everything inside the parentheses of title="(...)" and discard everything else like this:

:%s,.*title="(\(.*\))".*,\1,
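One thing to watch in the pattern above is that the inner .* is greedy, so it runs to the last )" on the line. On the sample input there is only one, so the result is correct. The Python sketch below mirrors the same pattern with the re module (Python writes the group as (...) rather than \(...\)) and compares it with a stricter variant that uses [^)]*. The function names and the second test line are made up for illustration.

```python
import re

# Same shape as the Vim pattern: greedy .* inside the captured group.
GREEDY = re.compile(r'.*title="\((.*)\)".*')
# Stricter variant (an assumption, not from the answer): stop at the first ")".
STRICT = re.compile(r'.*title="\(([^)]*)\)".*')


def extract_greedy(line: str) -> str:
    return GREEDY.sub(r"\1", line)


def extract_strict(line: str) -> str:
    return STRICT.sub(r"\1", line)


sample = '''72" title="(168,72)" onmouseover="posizione('(168,72)');" onmouseout="posizione('(-,-)');">>'''
# Hypothetical line with a second )" later on, to show where the two differ.
tricky = 'title="(1,2)" alt="(x)"'

print(extract_greedy(sample))
print(extract_greedy(tricky))
print(extract_strict(tricky))
```

In Vim the stricter form would be written with [^)]* in place of the inner .* as well.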

Upvotes: 1

MrWednesday

Reputation: 419

This will replace each line with a tab-delimited list of coordinates per line:

:%s/.* title="(\(\d\+\),\(\d\+\))".*/\1\t\2
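To check the two capture groups outside Vim, the substitution can be approximated in Python as below. This is only a sketch: the function name to_tsv is made up, and Python's re syntax uses (\d+) where Vim uses \(\d\+\).

```python
import re

# Two groups, one per coordinate; the pattern only matches all-digit pairs.
PAIR = re.compile(r'.* title="\((\d+),(\d+)\)".*')


def to_tsv(line: str) -> str:
    """Replace a matching line with 'X<TAB>Y'; non-matching lines are returned unchanged."""
    return PAIR.sub(r"\1\t\2", line)


sample = '''72" title="(168,72)" onmouseover="posizione('(168,72)');" onmouseout="posizione('(-,-)');">>'''
print(to_tsv(sample))
```

Because the groups require digits, a line whose title holds something like (-,-) would be left untouched rather than mangled.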

Upvotes: 5
