An example:
case Foo:
...
break;
case Bar:
...
break;
case More: case Complex:
...
break;
...
I’d like to retrieve all the regex matches (the whole matching text or, even better, the part between \( and \)) of the regex case \([^:]*\): which should give (in a new buffer) something like
Foo
Bar
More
Complex
...
Another example of use case would be the extraction of some parts, for instance, URLs of images from an HTML file.
Is there a simple way to grab all regex matches and put them in a buffer in Vim?
Note: It’s similar to the question “How to extract text matching a regex using Vim?”, however I’m also interested in removing the lines that don’t match, preferably without a huge or complex regex.
There is a general way of collecting pattern matches throughout a piece
of text. The technique takes advantage of the substitute-with-an-expression
feature of the :substitute command (see :help sub-replace-\=). The key
idea is to use a substitution that enumerates all of the pattern matches
and evaluates an expression that stores them, without replacing anything.
First, let us consider saving the matches. In order to keep a sequence
of matching text fragments, it is convenient to use a list
(see :help List). However, it is not possible to modify a list
straightforwardly, using the :let command, since there is no way to
run Ex commands in expressions (including \= substitute expressions).
Yet, we can call one of the functions that modify a list in place. For
example, the add() function is designed to append a given item to the
specified list (see :help add()).
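As a minimal sketch of that point, the following snippet shows add() changing a list from inside an expression. The variable names t and r are illustrative only.

```vim
let t = []
" add() appends to t in place and returns the very same list
let r = add(t, 'Foo')
call add(t, 'Bar')
" Shows ['Foo', 'Bar']
echo t
" Shows 1: r and t refer to the same list instance
echo r is t
```

Because add() is an ordinary function call, it can appear anywhere an expression is allowed, including a \= replacement.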
Another problem is how to avoid text modifications while running
a substitution. One approach is to make the pattern always have
a zero-width match by prepending \ze or by appending \zs to it
(see :help /\zs, :help /\ze). A pattern modified in this way
matches an empty string immediately before or after an occurrence of
the original pattern in the text (such matches are called zero-width
matches in Vim; see :help /zero-width). Then, if the replacement text
is also empty, the substitution effectively changes nothing: it just
replaces a zero-width match with an empty string.
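The effect can be checked with a small experiment in a scratch buffer. This is only a sketch: the two sample lines and the variable names before and unchanged are made up for the illustration.

```vim
" Open a scratch window so that no real text is touched
new
call setline(1, ['case Foo:', 'case Bar:'])
let before = getline(1, '$')
" The trailing \zs makes every match zero-width; the replacement is empty
%s/case \w\+:\zs//g
" 1 means the buffer text is the same as before the substitution
let unchanged = getline(1, '$') == before
echo unchanged
bwipeout!
```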
Since the add() function, like most of the list-modifying functions,
returns a reference to the changed list, for our technique to work,
we need to somehow get an empty string from it. The simplest way is
to extract a sublist of zero length by specifying a range of indices
whose starting index is greater than its ending one.
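A quick sketch of the slicing trick, with illustrative variable names. The item is still stored, while the value of the whole expression is an empty list, which a \= replacement inserts as an empty string (a list result is joined with line breaks; see :help sub-replace-\=).

```vim
let t = []
" [1:0] asks for a range whose start is past its end, so the result is empty
let r = add(t, 'Foo')[1:0]
" Shows []
echo r
" Shows ['Foo']: the item was stored anyway
echo t
```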
Combining the aforementioned ideas, we obtain the following Ex command.
:let t=[] | %s/\<case\s\+\(\w\+\):\zs/\=add(t,submatch(1))[1:0]/g
After its execution, all matches of the first subgroup are accumulated
in the list referenced by the variable t, and can be used as is or
processed in some way. For instance, to paste the contents of the list
one by one on separate lines in Insert mode, type
Ctrl+R =t Enter
To do the same in Normal mode, simply use the :put command:
:pu=t
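To get the matches into a new buffer, as the question asks, the steps can be wrapped in a function. The following is a rough sketch rather than a polished plugin: the name CollectMatches is hypothetical, the pattern is assumed to contain no unescaped / and to define a first subgroup, and the view is saved and restored because :substitute moves the cursor.

```vim
" Hypothetical helper: gather group 1 of every match of a:pattern
" in the current buffer and show the results in a new window.
function! CollectMatches(pattern) abort
  let matches = []
  let view = winsaveview()
  " Zero-width substitution: stores each submatch, replaces nothing.
  " The e flag suppresses the error when the pattern is not found.
  execute '%s/' . a:pattern . '\zs/\=add(matches, submatch(1))[1:0]/ge'
  call winrestview(view)
  " Open an empty buffer and fill it with one match per line
  new
  call setline(1, matches)
  return matches
endfunction
```

With the example from the question in the current buffer, it could be called as

:call CollectMatches('\<case\s\+\(\w\+\):')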