Jé Queue

Reputation: 10637

Extract float from text line with sed?

I am running a sed substitution on a line to extract a particular floating-point value, but only the part to the right of the decimal point appears to be matched.

Text Line:

63.544: [GC 63.544: [DefNew: 575K->63K(576K), 0.0017902 secs]63.546: [Tenured: 1416K->1065K(1536K), 0.0492621 secs] 1922K->1065K(2112K), 0.0513331 secs]

If I issue s/^.*\([0-9]*\.[0-9]*\): \[Tenured:.*$/\1/, my output is:

.546

I want the 63.546 from the line. Why is the very first [0-9]* not matching the 63?

Upvotes: 3

Views: 9357

Answers (4)

Jeremy Stein

Reputation: 19661

Also match the ] before the number you want:

s/^.*]\([0-9]*\.[0-9]*\): \[Tenured:.*$/\1/

Per comment below, here is a more generic approach, matching a non-digit first:

s/^.*[^0-9]\([0-9]*\.[0-9]*\): \[Tenured:.*$/\1/
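To try the second pattern end to end, here is a minimal sketch. The function name extract_tenured_ts and the variable line are made up for illustration, and the -n flag with the trailing p is my addition, so that lines without a Tenured entry print nothing instead of passing through unchanged.

```shell
# Sample GC log line from the question.
line='63.544: [GC 63.544: [DefNew: 575K->63K(576K), 0.0017902 secs]63.546: [Tenured: 1416K->1065K(1536K), 0.0492621 secs] 1922K->1065K(2112K), 0.0513331 secs]'

# Hypothetical helper: reads log lines on stdin and prints the timestamp
# that immediately precedes ": [Tenured:".
extract_tenured_ts() {
    # [^0-9] forces the greedy .* to stop before the first digit of the number.
    sed -n 's/^.*[^0-9]\([0-9]*\.[0-9]*\): \[Tenured:.*$/\1/p'
}

printf '%s\n' "$line" | extract_tenured_ts
```

Because the pattern requires one non-digit character before the number, it would not match a timestamp sitting at the very start of a line.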

Upvotes: 4

jheddings

Reputation: 27563

As Stefano pointed out, the pattern is performing a greedy match at the beginning of your text input.

If you can use perl, this command works to match your line on standard input:

perl -ne 'print "$1\n" if /^.*?(\d+\.\d+):\s+\[Ten/'

Upvotes: 1

Stefano Borini

Reputation: 143785

My feeling is that your .* at the beginning is acting greedy, so it absorbs everything up to the dot, including the 63, and leaves the first [0-9]* to match an empty string, but I could be wrong.
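One way to check this is to put the two digit runs in separate groups and print each one in brackets. This is only a sketch; the names line and show_groups and the bracketed output format are chosen for illustration.

```shell
# Sample GC log line from the question.
line='63.544: [GC 63.544: [DefNew: 575K->63K(576K), 0.0017902 secs]63.546: [Tenured: 1416K->1065K(1536K), 0.0492621 secs] 1922K->1065K(2112K), 0.0513331 secs]'

# Hypothetical helper: same pattern as in the question, but with the integer
# part in group 1 and the fractional part in group 2.
show_groups() {
    sed 's/^.*\([0-9]*\)\.\([0-9]*\): \[Tenured:.*$/[\1][\2]/'
}

printf '%s\n' "$line" | show_groups
```

This prints [][546]: the first group comes back empty because .* has already consumed the 63, which matches the .546 seen in the question.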

My advice: don't use sed for this. I gave up trying. Perl is a better choice (I had started playing with it), but the awk solution beats mine. Go with that, unless you really love sed for some particular reason...

Upvotes: 4

ghostdog74

Reputation: 342363

Use awk instead of sed. Why bother creating a complex regex?

$ more file
63.544: [GC 63.544: [DefNew: 575K->63K(576K), 0.0017902 secs]63.546: [Tenured: 1416K->1065K(1536K), 0.0492621 secs] 1922K->1065K(2112K), 0.0513331 secs]

$ awk -vRS="]" -F":" '$1+0==$1{print $1}' file
63.544
63.546
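If only the Tenured timestamp is wanted, as in the question, the same record-splitting idea can be narrowed. A sketch, assuming the log format shown above; the function name tenured_ts is hypothetical, and splitting fields on ": " rather than ":" is my variation.

```shell
# Sample GC log line from the question.
line='63.544: [GC 63.544: [DefNew: 575K->63K(576K), 0.0017902 secs]63.546: [Tenured: 1416K->1065K(1536K), 0.0492621 secs] 1922K->1065K(2112K), 0.0513331 secs]'

# Hypothetical helper: reads log lines on stdin.
tenured_ts() {
    # Records end at "]" and fields are separated by ": ".
    # Print field 1 only when field 2 starts with "[Tenured".
    awk -v RS=']' -F': ' '$2 ~ /^\[Tenured/ { print $1 }'
}

printf '%s\n' "$line" | tenured_ts
```

On the sample line this prints 63.546 and skips the earlier 63.544 entries, since their second field is not a Tenured marker.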

Upvotes: 2
