Leon Filser

Reputation: 191

Bash extract string with grep

Let's say I have multiple strings in one file and I only want to extract a specific part of each one:

$plugin->component = 'mod_jitsi';
$plugin->component = 'local_hvp';
$plugin->component = 'test_bot';
$plugin->component = 'mod_bot';
$plugin->component = 'mod_moodle';

I want to filter this with grep so my output looks like this:

mod
local
test
mod
mod

Is there any way to do this with grep, or do I need to use awk or sed?

Thanks in advance!

Upvotes: 0

Views: 100

Answers (4)

Vishal

Reputation: 44

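# A quoted heredoc keeps the single quotes and the $ literal in the file.
cat > STRING <<'EOF'
$plugin->component = 'mod_jitsi';
$plugin->component = 'local_hvp';
$plugin->component = 'test_bot';
$plugin->component = 'mod_bot';
$plugin->component = 'mod_moodle';
EOF

# Split on the single quote, take the quoted value, then keep the part before the first underscore.
awk -F"'" '{ print $2 }' STRING | cut -d "_" -f1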

mod
local
test
mod
mod

Upvotes: 1

user5683823

Reputation:

PCRE is probably the best way, and you already have two answers that demonstrate how to use it.

There is, however, a more "elementary" way - using just BRE (basic regular expressions, used by plain grep). You just need to call it twice.

I assume that each input row has (at most) one substring consisting of a single quote, followed by zero or more non-single-quote, non-underscore characters, followed by an underscore, and you must extract the sequence of non-single-quote, non-underscore characters within this substring.

If the input strings are in a file my_file:

[mathguy@localhost ~/test]$ more my_file
$plugin->component = 'mod_jitsi';
$plugin->component = 'local_hvp';
$plugin->component = 'test_bot';
$plugin->component = 'mod_bot';
$plugin->component = 'mod_moodle';


[mathguy@localhost ~/test]$ grep -o "'[^'_]*_" my_file | grep -o "[^'_]*"
mod
local
test
mod
mod
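
To make the two passes easier to follow, here is a minimal sketch that shows what each grep call leaves behind. It assumes a grep that supports -o (GNU and BSD grep do) and uses a single sample line from the question; the variable names line and stage1 are only for illustration.

```shell
# Sample line taken from the question; the \$ keeps the dollar sign literal.
line="\$plugin->component = 'mod_jitsi';"

# Pass 1: keep the opening quote, the prefix, and the underscore.
stage1=$(printf '%s\n' "$line" | grep -o "'[^'_]*_")
printf '%s\n' "$stage1"

# Pass 2: keep only the run of characters that are neither quote nor underscore.
printf '%s\n' "$stage1" | grep -o "[^'_]*"
```

The first pass prints 'mod_ and the second prints mod, which is why the pattern has to be applied twice when no lookaround is available.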

Upvotes: 0

Freddy

Reputation: 4688

If your grep supports Perl-compatible regular expressions (PCRE):

grep -Po '\$plugin->component = '\''\K[^_]+' file

With sed:

sed -En 's/\$plugin->component = '\''([^_]+).*/\1/p' file

Upvotes: 0

Shawn

Reputation: 52334

Using GNU grep with PCRE:

grep -Po "(?<== ')[^_]*" input.txt

(?<== ') is a zero-width positive lookbehind assertion. It is not included in the matched text, but the = ' must appear immediately before the part of the regular expression that is included (which is everything from after the quote up to the first underscore).
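
As a rough illustration of what "zero-width" means here, the sketch below runs the same pattern with and without the lookbehind on one sample line from the question. It assumes GNU grep built with PCRE support (the -P option); the variable name line is only for illustration.

```shell
# Sample line taken from the question; the \$ keeps the dollar sign literal.
line="\$plugin->component = 'mod_jitsi';"

# Without the lookbehind, the "= '" prefix is part of the printed match.
printf '%s\n' "$line" | grep -Po "= '[^_]*"

# With the lookbehind, the prefix must still be present but is not printed.
printf '%s\n' "$line" | grep -Po "(?<== ')[^_]*"
```

The first command prints = 'mod, while the second prints only mod.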

Upvotes: 1
