Greg Kuhn

Reputation: 35

How to filter text with several parentheses in bash?

I have a bash script that creates a text file and then manipulates it with sed commands. However, on occasion there is a line which contains multiple parentheses.

For example:

fileInfo:    (2014) (b2b) (analog) (digital) (some-text)

This line could have as few as one set of (), but usually has at least two. In the end, I am only interested in extracting the last set of ():

fileInfo:    (some-text)

I can get it to work if there is a fixed number of (), but not when the number varies from file to file.

Until I encountered a file that had more than 2 sets of (), the following worked:

if grep -q "textInfo:   (.*) (.*)" "$TXT"; then
  SG=$(grep -E textInfo "$TXT" | sed "s/.*) (//" | sed "s/)$//")
else
  SG=$(grep -E textInfo "$TXT" | sed "s/.* (//" | sed "s/)$//")
fi

Upvotes: 0

Views: 299

Answers (5)

potong

Reputation: 58391

This might work for you (GNU sed):

sed 's/:.*(/:(/' file

This matches everything from the first : through the last ( and replaces it with :(, so only the last set of () remains after the colon.

N.B. .* is greedy and always aims for the longest match.
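To see the greedy match at work on lines with different numbers of groups, here is a minimal sketch. The function name last_group is hypothetical, and the replacement is a small variant of the command above that keeps one space after the colon:

```shell
# Hypothetical helper: keep only the last parenthesised group after the colon.
# Variant of the answer's command: the replacement is ": (" instead of ":(".
last_group() {
  sed 's/:.*(/: (/'
}

# .* is greedy, so the match always runs through the LAST "(" on the line,
# whether the line has several groups or only one.
printf '%s\n' 'fileInfo:    (2014) (b2b) (some-text)' | last_group
printf '%s\n' 'fileInfo:    (only)' | last_group
```

Both calls should print the label followed by a single group, regardless of how many groups the input line had.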

Upvotes: 1

Avinash Raj

Reputation: 174706

Try this GNU sed command:

sed -r 's/^([^ ]+)( )+.*\((.*)\)/\1\2(\3)/g' file

Example:

$ echo 'fileInfo: (2014) (b2b) (analog) (digital) (some-text)' | sed -r 's/^([^ ]+)( )+.*\((.*)\)/\1\2(\3)/g'
fileInfo: (some-text)
  • ^([^ ]+) - Matches one or more non-space characters at the start of the line and stores them in the first group. Matching stops at the first space.

  • ( )+ - Matches one or more space characters. Because the group itself is repeated, the second group holds only the last space matched.

  • .*\( - Matches any characters up to a literal (. Because .* is greedy, sed matches up to the last ( when a line contains more than one.

  • (.*)\) - Captures the characters inside the last pair of parentheses and stores them in the third group.

  • \1\2(\3) - Finally, using backreferences, sed replaces the matched text with these captured groups.
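The same greedy-match idea can fill a variable such as the question's SG with only the contents of the last parentheses. This is a sketch under assumptions: the temporary file stands in for "$TXT", the label is taken to be fileInfo: as in the sample line, and -E is used as the extended-regex flag (GNU sed also accepts -r):

```shell
# Hypothetical sample file standing in for "$TXT" from the question.
TXT=$(mktemp)
printf '%s\n' 'fileInfo:    (2014) (b2b) (analog) (digital) (some-text)' > "$TXT"

# -n plus the p flag prints only lines where the substitution succeeded.
# The greedy .* runs to the last "(", and [^()]* captures what is inside it.
SG=$(sed -nE 's/^fileInfo:.*\(([^()]*)\)[^()]*$/\1/p' "$TXT")

echo "$SG"
rm -f "$TXT"
```

Using [^()]* instead of .* inside the capture keeps the group from spanning more than one pair of parentheses.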

Upvotes: 1

jaypal singh

Reputation: 77095

Using sed:

$ sed -r 's/([^ ]+ +).*(\(.*)/\1 \2/' file
fileInfo:     (some-text)

Upvotes: 0

anubhava

Reputation: 785118

Using BASH regex:

s='fileInfo:    (2014) (b2b) (analog) (digital) (some-text)'
[[ "$s" =~ ^([^:]+:).*(\([^()]*\))[^()]*$ ]] && echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
fileInfo: (some-text)
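For use inside a script, the same regex can be wrapped in a function. This is only a sketch: the name last_paren is hypothetical, and it needs bash because it relies on [[ =~ ]] and BASH_REMATCH:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the label and the last (...) group of a line.
last_paren() {
  local line=$1
  # Keeping the regex in a variable avoids quoting problems inside [[ ]].
  local re='^([^:]+:).*(\([^()]*\))[^()]*$'
  if [[ $line =~ $re ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1   # the line has no label followed by a parenthesised group
  fi
}

last_paren 'fileInfo:    (2014) (b2b) (analog) (digital) (some-text)'
last_paren 'fileInfo:    (only-one)'
```

The non-zero return status lets a caller detect lines without any parentheses instead of silently getting an empty result.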

Upvotes: 1

keiv.fly

Reputation: 4015

Regular expressions can do this.

I am not an expert in sed, but this pattern should match the text in the last parentheses. You only need to add the other fixed text that you need.

sed -nE '/\(([^)]+)\)$/p'

Upvotes: 1
