Word extraction with regex string

Question

From this post, I am able to recognize the pattern object.* using the regex m/(?<=object\.)\w*/. However, since I am unfamiliar with Linux, I cannot use the sed or perl commands properly to extract the desired tokens, so I need your help. My best guess is grep -E -n object file.txt | perl -nle 'm/(?<=object\.)\w*/; print $1'.

Wiktor Stribiżew · Accepted Answer

You can use grep or sed:

grep -oP '(?<=object\.)\w+' file
sed -nE 's/.*object\.([[:alnum:]_]+).*/\1/p' file

See the online demo.

With grep -oP, the -P option enables PCRE syntax (needed for the lookbehind; it requires GNU grep built with PCRE support), and the -o option prints only the matched text, each match on its own line.
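
As a quick check of the grep behavior, here is a minimal sketch run against a made-up sample file. The temporary file and its three lines are assumptions for illustration only, not data from the question.

```shell
# Hypothetical sample input; the contents are invented for this demo.
sample=$(mktemp)
printf '%s\n' 'x = object.foo + object.bar' 'no match here' 'call(object.baz_1)' > "$sample"

# -P enables the PCRE lookbehind, -o prints each match on its own line.
grep_out=$(grep -oP '(?<=object\.)\w+' "$sample")
printf '%s\n' "$grep_out"
# Prints foo, bar and baz_1, one per line.

rm -f "$sample"
```

Note that both matches on the first line are reported, and the line without object. produces no output.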

The sed command is more complex and extracts only one match per line, namely the last one. The -n option suppresses the default line output, and -E sets the regex flavor to POSIX ERE. The pattern matches the whole line and captures the one or more alphanumeric or underscore chars that follow object. into Group 1. The replacement \1 then substitutes the full line with the Group 1 value, and the p flag prints only the lines where a substitution took place.
