user11549576

Reputation: 159

extract string based on parenthesis using sed

I want to extract the string that sits next to a parenthesized flag using sed. For example, here is a line from /proc/mdstat showing a failed drive, which is marked with (F):

cat /proc/mdstat | grep 'F'
md2 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1]

I want to extract the drive name nvme0n1p3, which has the suffix (F) next to it. I tried awk and cut, but the position of the failed drive changes, so a fixed field or column position does not work. Could anyone please help me with this? I find regular expressions complicated.

Upvotes: 2

Views: 60

Answers (2)

Wiktor Stribiżew

Reputation: 626754

You may use a single call to awk (note you do not need cat, just pass the file to awk directly):

awk -F'[][():[:space:]]+' '/\(F\)/{print $4}' /proc/mdstat

The -F'[][():[:space:]]+' option sets the field delimiter to a regex that matches one or more ], [, (, ), : or whitespace characters. The fields you get are therefore md2, active, raid1, nvme0n1p3, 0, F, nvme1n1p3 and 1, so your value is in field 4.

I also added a check for the parentheses around F, which should make the line match safer. Note that $4 is always the first device listed on the line, so this prints the right name only when the failed drive is listed first.

Online test:

s="some line 1
md2 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1]
last line"
awk -F'[][():[:space:]]+' '/\(F\)/{print $4}' <<< "$s"
# => nvme0n1p3
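Since the question says the failed drive can appear in any position, here is a minimal sketch of a position-independent variant. It uses awk's default whitespace splitting and checks every field for a trailing (F). The function name failed_devices and the sample line are hypothetical, made up for illustration.

```shell
# Hypothetical helper: print every device name tagged (F) in mdstat-style input.
# Reads the files given as arguments, or stdin when none are given.
failed_devices() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # A failed member looks like name[N](F); test for the literal "(F)" suffix.
            if ($i ~ /\(F\)$/) {
                name = $i
                sub(/\[.*$/, "", name)   # drop "[N](F)", leaving the device name
                print name
            }
        }
    }' "$@"
}

# Assumed sample line with the failed drive in second position.
sample='md2 : active raid1 nvme0n1p3[0] nvme1n1p3[1](F)'
printf '%s\n' "$sample" | failed_devices
# => nvme1n1p3
```

Against the real file this would be called as failed_devices /proc/mdstat. If several members are marked (F), each is printed on its own line.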

Upvotes: 1

Tim Biegeleisen

Reputation: 521168

Here is a solution using sed in extended regex mode (-E):

echo "md2 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1]" |
    sed -E 's/^.* (\S+)\[[[:digit:]]+\]\(F\).*$/\1/'

This outputs:

nvme0n1p3

Here is an explanation of the pattern being used:

^                     from the start of the input
    .*                consume anything, up until the last
    [ ]               single space, followed by
    (\S+)             series of non whitespace characters (captured)
    \[[[:digit:]]+\]  which are followed by '[0]', a number in square brackets
    \(F\)             followed by '(F)'
    .*                then consume the rest of the input
$                     end of input

Then we replace the whole line with \1, the text captured by the (\S+) group, which is the device name you want.
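When the input is all of /proc/mdstat rather than a single echoed line, the substitution above leaves non-matching lines unchanged and sed still prints them. A minimal sketch of one way around that is to combine -n with the p flag, so that only lines where the substitution happened are printed. The function name failed_device_sed and the sample text are hypothetical. The sketch also uses [^[:space:]] instead of \S, because \S is a GNU extension that other sed implementations may not support.

```shell
# Hypothetical helper: read mdstat-style text on stdin, print the (F) device name.
failed_device_sed() {
    # -n suppresses automatic printing; the trailing p prints only substituted lines.
    sed -nE 's/^.* ([^[:space:]]+)\[[[:digit:]]+\]\(F\).*$/\1/p'
}

# Assumed sample standing in for /proc/mdstat.
sample='Personalities : [raid1]
md2 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1]
unused devices: <none>'

printf '%s\n' "$sample" | failed_device_sed
# => nvme0n1p3
```

Because the leading .* is greedy, a line with several (F) members yields only the last one.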

Upvotes: 1
