andrux

Reputation: 2922

Extract substring using regexp in plain bash

I'm trying to extract the time from a string using bash, and I'm having a hard time figuring it out.

My string is like this:

US/Central - 10:26 PM (CST)

And I want to extract the 10:26 part.

Does anybody know of a way of doing this only with bash - without using sed, awk, etc?

Like, in PHP I would use - not the best way, but it works - something like:

preg_match( "/(\d{2}\:\d{2}) PM \(CST\)/", "US/Central - 10:26 PM (CST)", $matches );

Thanks for any help, even if the answer uses sed or awk

Upvotes: 160

Views: 284885

Answers (6)

Gilles Quénot

Reputation: 185106

Using pure bash:

$ cat file.txt
US/Central - 10:26 PM (CST)
$ while read a b time x; do [[ $b == - ]] && echo $time; done < file.txt

with bash regex :

$ [[ "US/Central - 10:26 PM (CST)" =~ -[[:space:]]*([0-9]{2}:[0-9]{2}) ]] &&
    echo ${BASH_REMATCH[1]}
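
For use in a script, the same regex idea can be wrapped in a small function that reports failure when nothing matches. This is only a sketch: `extract_time` is a made-up name, and accepting one-digit hours (`{1,2}`) is my assumption, not something the question asked for.

```shell
# extract_time is a hypothetical helper, not part of the answer above.
extract_time() {
    local input=$1
    # Storing the pattern in a variable avoids quoting surprises on the right side of =~.
    # Assumption: hours may have one or two digits.
    local re='-[[:space:]]*([0-9]{1,2}:[0-9]{2})'
    if [[ $input =~ $re ]]; then
        # BASH_REMATCH[1] holds the first parenthesized group.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_time "US/Central - 10:26 PM (CST)"
```

The function returns a non-zero status on input without a dash followed by a time, so callers can write `if t=$(extract_time "$line"); then ...`.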

using grep and look-around advanced regex :

$ echo "US/Central - 10:26 PM (CST)" | grep -oP "\-\s+\K\d{2}:\d{2}"

using sed :

$ echo "US/Central - 10:26 PM (CST)" |
    sed 's/.*\- *\([0-9]\{2\}:[0-9]\{2\}\).*/\1/'

using Perl :

$ echo "US/Central - 10:26 PM (CST)" |
    perl -lne 'print $& if /\-\s+\K\d{2}:\d{2}/'

and last one using awk :

$ echo "US/Central - 10:26 PM (CST)" |
    awk '{for (i=1; i<=NF; i++){if ($i == "-"){print $(i+1);exit}}}'

Upvotes: 309

erwin

Reputation: 738

No need to open a pipe and spawn sed or awk to extract the 10:26 (time) part. Bash can easily handle this.

input="US/Central - 10:26 PM (CST)"
[[ $input =~ ([0-9]+:[0-9]+) ]]
echo ${BASH_REMATCH[1]}

Outputs:

10:26

If you're using zsh, it's the same, except the match result will be in $match[1] instead of $BASH_REMATCH[1]
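
If the AM/PM marker is needed too, the same approach extends to several capture groups. A rough sketch, assuming the marker always follows the minutes; the variable names `re`, `hour`, `minute` and `meridiem` are only illustrative:

```shell
input="US/Central - 10:26 PM (CST)"
# Three groups: hour, minute, and the AM/PM marker (assumed to follow the time).
re='([0-9]{1,2}):([0-9]{2})[[:space:]]*([AP]M)'
if [[ $input =~ $re ]]; then
    hour=${BASH_REMATCH[1]}
    minute=${BASH_REMATCH[2]}
    meridiem=${BASH_REMATCH[3]}
    echo "$hour:$minute $meridiem"
fi
```

Testing the result of `[[ ... =~ ... ]]` before reading `BASH_REMATCH` avoids printing stale or empty values when the input does not match.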

In 2023, I don't think the extra pipe to grep, sed, awk or perl is relevant, especially when the question is:

Does anybody know of a way of doing this only with bash - without using sed, awk, etc?

Upvotes: 1

Jimbro

Reputation: 1

foo="US/Central - 10:26 PM (CST)"

echo ${foo} | date +%H:%M

Upvotes: -2

LeChatDeNansen

Reputation: 133

If your string is

foo="US/Central - 10:26 PM (CST)"

then

echo "${foo}" | cut -d ' ' -f3

will do the job.
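
The same field-based split can also be done without spawning `cut`, which may matter since the question asked for bash only. A minimal sketch, assuming the fields are separated by whitespace as in the sample string; `fields` and `time_part` are illustrative names:

```shell
foo="US/Central - 10:26 PM (CST)"
# Split the string on whitespace into an array; -r keeps backslashes literal.
read -r -a fields <<< "$foo"
# Arrays are zero-indexed, so cut's third field is index 2.
time_part=${fields[2]}
echo "$time_part"
```

Like the `cut` version, this depends on the time always being the third whitespace-separated word.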

Upvotes: 6

jgshawkey

Reputation: 2122

    echo "US/Central - 10:26 PM (CST)" | sed -n "s/^.*-\s*\(\S*\).*$/\1/p"

-n      suppress printing
s       substitute
^.*     anything at the beginning
-       up until the dash
\s*     any space characters (any whitespace character)
\(      start capture group
\S*     any non-space characters
\)      end capture group
.*$     anything at the end
\1      substitute 1st capture group for everything on line
p       print it
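
Note that `\s` and `\S` are GNU extensions. Where GNU sed is not available, a variant with POSIX character classes should behave the same way on this input. This is a sketch of that substitution, not a change to the command above; `line` and `result` are illustrative names:

```shell
line="US/Central - 10:26 PM (CST)"
# [[:space:]] replaces \s and [^[:space:]] replaces \S.
result=$(printf '%s\n' "$line" | sed -n 's/^.*-[[:space:]]*\([^[:space:]]*\).*$/\1/p')
echo "$result"
```

Because `.*` is greedy, both versions anchor on the last dash in the line.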

Upvotes: 167

doubleDown

Reputation: 8398

Quick 'n dirty, regex-free, low-robustness chop-chop technique

string="US/Central - 10:26 PM (CST)"
etime="${string% [AP]M*}"
etime="${etime#* - }"

Upvotes: 41
