gnome

Reputation: 593

sed: how to extract a part of a line containing colon

I want to extract the date part inside parentheses with sed:

# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)

I use this code:

sed -n 's:# Equ_time  =  [0-9]*.[0-9]* .\([0-9]*.[A-Z]*.[0-9]*.[0-9]*.[0-9]*.[0-9]*\).*:\1:p' file.txt

but it returns:

26-May-2015 07

I think the problem is related to the colon character in the time section. How should I change the command so that it returns the full date and time?

Upvotes: 1

Views: 1457

Answers (3)

ilkkachu

Reputation: 6527

Consider the marked parts in your input string and the pattern:

# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)
                                    ^^^

.\([0-9]*.[A-Z]*.[0-9]*.[0-9]*.[0-9]*.[0-9]*\)
          ^^^^^

The character class [A-Z] only matches uppercase letters, not lowercase ones (in the C locale, that is; in other locales it might match just about anything). So [A-Z]* matches only the M, and the a and y are consumed by the dots that follow. Since you didn't spell out the actual separators (- and :) and used [0-9]* (which can match nothing) instead of [0-9]+, the group runs out of pieces after 07, and the rest of the line is swallowed by the trailing .* outside the group.
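A quick way to check this is to run the original command next to a variant in which only [A-Z] is widened to [A-Za-z]. This is a minimal sketch, assuming a POSIX shell; the variable names are made up for illustration, the sample line is piped in instead of read from file.txt, and LC_ALL=C is set so that [A-Z] really means uppercase only.

```shell
# Sample line from the question.
line='# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)'

# Original pattern: [A-Z]* stops after "M", so two dots are used up on "a" and "y".
orig=$(printf '%s\n' "$line" |
  LC_ALL=C sed -n 's:# Equ_time  =  [0-9]*.[0-9]* .\([0-9]*.[A-Z]*.[0-9]*.[0-9]*.[0-9]*.[0-9]*\).*:\1:p')

# Same pattern with [A-Za-z]*: the dots are now free to match "-", " ", ":" and ":".
wide=$(printf '%s\n' "$line" |
  LC_ALL=C sed -n 's:# Equ_time  =  [0-9]*.[0-9]* .\([0-9]*.[A-Za-z]*.[0-9]*.[0-9]*.[0-9]*.[0-9]*\).*:\1:p')

printf 'original: %s\nwidened:  %s\n' "$orig" "$wide"
```

The widened variant still leans on dots for the separators, so it is better read as a diagnosis than as a fix.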

I would condense your sed to something like this, if you're not overly worried about matching the exact format:

sed -nEe 's/^# Equ_time *=[0-9. ]*\((.*)\).*/\1/p'

or, if you want the inner part matched strictly, piece by piece, then use this in place of the (.*) group:

([0-9]+-[A-Za-z]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+)

(Note that I used extended regexes there, hence the -E option.)
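Put together, the stricter variant could look like the following. This is only a sketch: it splices the inner pattern into the command above in place of (.*), pipes in the sample line from the question instead of reading a file, and assumes a sed that supports -E (GNU and BSD sed do). The variable names are illustrative.

```shell
line='# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)'

# Under -E, \( and \) are literal parentheses; the unescaped ( ) form the capture group.
stamp=$(printf '%s\n' "$line" |
  sed -nE 's/^# Equ_time *=[0-9. ]*\(([0-9]+-[A-Za-z]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+)\).*/\1/p')

printf '%s\n' "$stamp"
```

Because every separator is spelled out and every run uses +, a line whose timestamp is malformed prints nothing instead of a truncated match.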

Upvotes: 0

Ed Morton

Reputation: 203665

If this isn't all you need:

$ awk -F'[ (]' '{print $8}' file
26-May-2015

then edit your question to clarify your requirements and provide more representative sample input. Right now you have a very complicated approach that's unnecessary, since you could do a more robust match more simply with:

$ sed -n 's:# Equ_time  =  [0-9]*\.[0-9]* .\([^ ]*\).*:\1:p' file
26-May-2015

but that's probably not even necessary either and without better sample input and the associated output it's hard to guess at what the right approach is.
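If the time of day is wanted as well, one possible approach is to split on the parentheses themselves rather than on spaces. This is a sketch that assumes each line contains exactly one parenthesized group; the sample line is piped in rather than read from file, and the variable names are illustrative.

```shell
line='# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)'

# With "(" and ")" as field separators, $2 is whatever sits between them.
inside=$(printf '%s\n' "$line" | awk -F'[()]' '{print $2}')

printf '%s\n' "$inside"
```

Like the one-liner above, this prints a field for every input line, so other lines would need to be filtered out first if the file contains them.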

Upvotes: 2

gaganshera

Reputation: 2639

Change it to

sed -n 's:# Equ_time  =  [0-9]*.[0-9]* .\([0-9]*.[A-Z]*.[0-9]*.[0-9]*.[0-9]*.[0-9:]*\).*:\1:p' file.txt

Notice that I added a colon to the last bracket expression, making it [0-9:]*. The match stopped early because your regex had nothing left that could match the colons in the time.
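This command uses : both as the s delimiter and inside the bracket expression, so it may depend on how a given sed treats a delimiter character within brackets. A sketch that sidesteps the question by switching the delimiter to | (an arbitrary choice; any character not used in the pattern would do), with the sample line piped in and LC_ALL=C set so that [A-Z] means uppercase only:

```shell
line='# Equ_time  =  959240309.430000 (26-May-2015 07:38:29)'

# Same regex as above; only the s delimiter changes from ":" to "|".
fixed=$(printf '%s\n' "$line" |
  LC_ALL=C sed -n 's|# Equ_time  =  [0-9]*.[0-9]* .\([0-9]*.[A-Z]*.[0-9]*.[0-9]*.[0-9]*.[0-9:]*\).*|\1|p')

printf '%s\n' "$fixed"
```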

Upvotes: 2
