Reputation: 3741
I want to find all the different parameters passed to the __() function in my project. So far, the best grep call I've constructed is this one:
find . -name "*.php" | xargs grep "__('.*')" -sioh
It successfully finds all calls to the __() function, but it has the following problems:
- When a line contains several calls, it returns everything from the first call to the last one as a single match.
- It returns the whole __() function call instead of only the parameter.
What I want is a list of all distinct parameters passed to the function, so I would like each parameter to be on its own line (no __( at the beginning and no ) at the end).
For an example line that looks like this:
/* Some code */ __('foo'); /* Some more code */ __('bar'); /* Even more code */
My command returns the following result:
__('foo'); /* Some more code */ __('bar')
What I would like to get is this (each parameter on its own line, without quotes):
foo
bar
Edited:
As it turns out, the first argument is not always a single-quoted string. Sometimes it's a variable (starting with a $ sign, as it's PHP in question, and optionally having array indexes, e.g. $a['b']).
And there are two more optional boolean arguments. But it's only the first argument I actually care about getting in the result, the other two are not important.
Upvotes: 3
Views: 187
Reputation: 439727
This answer assumes the following, in line with the OP's later clarification:
- __() calls in the input data have 1-3 arguments, not necessarily single-quoted.
- Only the 1st argument should be extracted.
- The 1st argument itself contains neither , nor ) characters.
Try the following, which should work on most platforms:
find . -name "*.php" -exec grep -sioh "__([^,)]*" {} + | cut -c 4-
- Using -exec with + ensures that as few invocations of grep as possible are performed (in most cases, just 1); {} is the placeholder for the matching filenames.
- The grep regex should be less greedy to ensure that multiple calls on a line are captured separately; furthermore, since it's now clear that only the 1st argument should be extracted, [^,)]* is used to capture only up to the next argument or the closing parenthesis. (Note that this could still fail if the 1st argument itself contains a comma or parenthesis.)
- The cut command removes the unwanted part of grep's output (it strips the __( prefix).
If your grep implementation supports -R (for recursive search) and --include (to restrict files searched to those matching a glob), you can use:
grep -R --include '*.php' -sioh "__([^,)]*" . | cut -c 4-
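The question also asks for distinct values without the surrounding quotes, which the commands above do not do by themselves. A minimal sketch of one way to finish the job, assuming a hypothetical scratch directory and sample file (tmpdir and sample.php are made up for illustration) and the same assumption that the 1st argument contains neither , nor ):

```shell
# Hypothetical sample data: a scratch directory with one PHP file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.php" <<'EOF'
<?php /* Some code */ __('foo'); /* Some more code */ __('bar', true); /* Even more code */
echo __($a['b'], false, true); echo __('foo');
EOF

# 1. grep -o prints each "__(" plus 1st argument on its own line.
# 2. cut drops the 3-character "__(" prefix.
# 3. sed removes one pair of surrounding single quotes, if present.
# 4. sort -u keeps each distinct value once (LC_ALL=C for a stable order).
result=$(find "$tmpdir" -name '*.php' -exec grep -sioh "__([^,)]*" {} + |
  cut -c 4- |
  sed "s/^'\(.*\)'\$/\1/" |
  LC_ALL=C sort -u)

printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Variables such as $a['b'] pass through unchanged because the sed expression only strips quotes that enclose the whole value.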
If your grep implementation additionally supports -P (PCREs: Perl-compatible regexes), use a modified version of anubhava's answer:
grep -R --include '*.php' -siohP "__\(\K[^,)]*"
Using -P makes it easier to make the regex more robust by appending a lookahead assertion ((?=...)) to ensure that the captured token is indeed followed by a literal , or ):
grep -R --include '*.php' -siohP "__\(\K[^,)]*(?=[,)])"
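To see what the lookahead changes, here is a small sketch on a made-up input line whose second call is cut off before any , or ) appears (as could happen when a call continues on the next line). The variable names are hypothetical, and the probe at the top is only there because -P is not available in every grep:

```shell
# Hypothetical input: the second call has no "," or ")" on this line.
line="__('foo'); __('baz'"

# -P is a GNU grep extension, so probe for it before relying on it.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
  pcre_supported=yes
  # Without the lookahead, the truncated call is reported too.
  without_lookahead=$(printf '%s\n' "$line" | grep -oP "__\(\K[^,)]*")
  # With the lookahead, only tokens followed by "," or ")" are reported.
  with_lookahead=$(printf '%s\n' "$line" | grep -oP "__\(\K[^,)]*(?=[,)])")
  printf 'without lookahead:\n%s\n' "$without_lookahead"
  printf 'with lookahead:\n%s\n' "$with_lookahead"
else
  pcre_supported=no
  echo "this grep does not support -P" >&2
fi
```

Where -P is supported, the first command reports both 'foo' and 'baz', while the second reports only 'foo'.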
Finally, note how grep with -P requires \( to match a literal (, whereas the non-P grep commands above use basic regular expressions, where ( and ) are not special and are treated as literals (there, you'd have to use \( and \) to make them special).
In grep implementations without -P, invoking grep as egrep or using -E activates support for extended regular expressions, which have more features and are closer in syntax to PCREs, but are not as powerful.
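The difference in how parentheses are treated can be checked directly. A minimal sketch on a made-up input line (the variable names are hypothetical), running the same search once as a basic and once as an extended regular expression:

```shell
# Hypothetical input line with two calls.
line="__('foo'); __('bar')"

# Basic regular expression: an unescaped ( is a literal character.
bre=$(printf '%s\n' "$line" | grep -o "__([^,)]*")

# Extended regular expression (-E): ( opens a group, so the literal
# parenthesis has to be written as \( here, as with -P.
ere=$(printf '%s\n' "$line" | grep -oE "__\([^,)]*")

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
```

Both commands should report the same two matches; only the escaping of the opening parenthesis differs.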
A note on portability:
- -P (support for PCREs == Perl-Compatible Regular Expressions) is a GNU grep extension (won't work in BSD grep).
- -o is an extension found in (at least) GNU grep and BSD grep.
- -R and --include are extensions found in (at least) GNU grep and BSD grep.
Upvotes: 4
Reputation: 81012
This isn't as good as anubhava's answer, but it is better than the original command and works with a grep that lacks PCRE support.
Using [^)]* instead of .* in the match stops each match at the end of its own function call instead of at the end of the last function call on the line.
$ grep -sioh "__('[^)]*')" *.php
__('foo')
__('bar')
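The output above still includes the __(' prefix and the ') suffix. One possible way to strip them without PCRE support is to pipe the matches through sed; this is only a sketch on the example line from the question, and it keeps this answer's limitation to calls with a single quoted argument (the variable names are hypothetical):

```shell
# The example line from the question.
line="/* Some code */ __('foo'); /* Some more code */ __('bar'); /* Even more code */"

# grep -o prints each call on its own line; sed then removes the
# leading "__('" and the trailing "')" from every match.
params=$(printf '%s\n' "$line" |
  grep -o "__('[^)]*')" |
  sed -e "s/^__('//" -e "s/')\$//")

printf '%s\n' "$params"
```

For the example line, this leaves foo and bar, each on its own line.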
Upvotes: 1
Reputation: 785781
Use this grep -P (PCRE) command:
grep -HoP '__\(\K[^)]*' *.php
file.php:'foo'
file.php:'bar'
It matches __\( and then \K resets the start of the match, so the text matched so far is left out of the output. [^)]* then matches everything up to the next ).
Upvotes: 2