Matthew Warman

Reputation: 3442

Print RegEx matches using SED in bash

I have an XML file that consists of a single line.

I am trying to extract the "finalNumber" attribute value from the file over a PuTTY session, rather than having to download a copy and search it in Notepad++.

I've built a regular expression that I've tested in an online tool, and tried using it in a sed command to duplicate grep functionality. The command runs but doesn't return anything.

RegEx:

(?<=finalNumber=")(.*?)(?=")

sed Command (returns nothing, expected 28, see file extract):

sed -n '/(?<=finalNumber=")(.*?)(?=")/p' file.xml

File Extract:

...argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""...

I feel like I am close (I could be wrong). Am I on the right lines, or is there a better way to achieve the output?

Upvotes: 26

Views: 54322

Answers (5)

SwiftMango

Reputation: 15284

Though you have already selected an answer, here is a way to do it in pure sed:

sed -n 's/^.*finalNumber="\([[:digit:]]\+\)".*$/\1/p' <test

Output:

28

This replaces the entire line with the matched number and prints it. (Because p prints the entire line, the substitution has to replace the entire line.)
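As a rough illustration of the substitution described above, here is a minimal sketch that runs it against a sample line. The `line` variable is hypothetical test data modelled on the file extract in the question, and the pattern uses `[0-9][0-9]*` instead of `\+` on the assumption that the sed in use may not be GNU sed (`\+` in a basic regular expression is a GNU extension).

```shell
# Hypothetical one-line sample modelled on the file extract in the question.
line='argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""'

# -n suppresses automatic printing; the trailing p prints the line only
# when the substitution succeeded, so non-matching lines produce no output.
# [0-9][0-9]* is the portable way to write "one or more digits" in a BRE.
result=$(printf '%s\n' "$line" | sed -n 's/.*finalNumber="\([0-9][0-9]*\)".*/\1/p')

echo "$result"
```

For the sample line this prints 28. The leading and trailing `.*` consume everything around the attribute, which is why only the captured group remains.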

Upvotes: 13

potong

Reputation: 58430

This might work for you (GNU sed):

sed -r 's/.*finalNumber="([^"]*)".*/\1/' file

Upvotes: 2

Perleone

Reputation: 4038

Nothing wrong with good old grep here.

grep -E -o 'finalNumber="[0-9]+"' file.xml | grep -E -o '[0-9]+'

Use -E for extended regular expressions, and -o to print only the matching part.
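To make the two stages of that pipeline visible, here is a small sketch that captures the output of each grep separately. The `line` variable is hypothetical test data modelled on the file extract in the question, and the sketch assumes a grep that supports `-o` (GNU and BSD grep do; it is not required by POSIX).

```shell
# Hypothetical one-line sample modelled on the file extract in the question.
line='argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""'

# Stage 1: keep only the attribute, including its name and quotes.
first=$(printf '%s\n' "$line" | grep -E -o 'finalNumber="[0-9]+"')

# Stage 2: from that short string, keep only the run of digits.
second=$(printf '%s\n' "$first" | grep -E -o '[0-9]+')

echo "$first"
echo "$second"
```

For the sample line the first stage yields finalNumber="28" and the second yields 28. Narrowing to the attribute first is what stops the second grep from picking up the digits in the date.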

Upvotes: 38

Alexey Shumkin

Reputation: 419

As I understand it, there is no need to use look-aheads here. Try this one:

sed -n '/finalNumber="[[:digit:]]\+"/p'

Upvotes: -1

choroba

Reputation: 241898

sed does not support lookaround assertions (look-ahead or look-behind). Perl does, though:

perl -ne 'print $1 if /(?<=finalNumber=")(.*?)(?=")/'
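To see why the sed command from the question prints nothing rather than failing with an error, the sketch below runs it against a sample line. The `line` variable is hypothetical test data modelled on the file extract in the question. In a basic regular expression the characters `(`, `)`, `?`, `<` and `=` are ordinary, so sed looks for that punctuation literally instead of treating it as a lookaround.

```shell
# Hypothetical one-line sample modelled on the file extract in the question.
line='argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""'

# The address below is read as a BRE: it only matches a line that contains
# the literal text (?<=finalNumber=") followed later by ?)(?=") and so on.
out=$(printf '%s\n' "$line" | sed -n '/(?<=finalNumber=")(.*?)(?=")/p')

# Report whether sed printed anything for the sample line.
if [ -z "$out" ]; then
  echo "no match"
else
  echo "$out"
fi
```

For the sample line this reports no match, which is consistent with the behaviour described in the question.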

Upvotes: 1
