xavi

Reputation: 119

Bash regex: extract all text from the 2nd occurrence of a specific character until the end of the line

I have the following strings:

text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1

With a regular expression, I want to extract:

text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1

On regex101.com I tried the expression ([^:]+)(?::[^:]+){1}$ and it worked, but only for the first string.

When I try it in bash, it does not work at all:

echo "text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1" | sed -n "/([^:]+)(?::[^:]+){1}$/p"

Upvotes: 2

Views: 753

Answers (6)

RARE Kpop Manifesto

Reputation: 2805

There is absolutely no need for anything that requires regex backreferences, since the match is anchored at the start of the line anyway:

mawk ++NF OFS= FS='^[^:]*:[^:]*:'

text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1

Upvotes: 0

Shawn

Reputation: 52344

There's no reason to drag sed or other external programs into this; just use bash's built-in regular expression matching:

#!/usr/bin/env bash

strings=(text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
         text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1)

for s in "${strings[@]}"; do
    [[ $s =~ ^([^:]*:){2}(.*) ]] && printf "%s\n" "${BASH_REMATCH[2]}"
done

Heck, you don't need regular expressions in bash:

printf "%s\n" "${s#*:*:}"
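
As a rough illustration of why the parameter expansion above works, here is a minimal sketch. The function name after_second_colon is made up for this example and is not part of the answer.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print everything after the second ':' in $1.
after_second_colon() {
    local s=$1
    # ${s#pattern} removes the SHORTEST prefix that matches the glob pattern,
    # so '*:*:' stops at the second colon and leaves later colons alone.
    printf '%s\n' "${s#*:*:}"
}

after_second_colon 'text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1'
after_second_colon 'text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'
```

If the string contains fewer than two colons, the pattern does not match and the string is printed unchanged. Using ## instead of # would remove the longest matching prefix, which is not what is wanted here.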

Upvotes: 2

ufopilot

Reputation: 3975

awk

string='text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'

awk -vFS=: -vOFS=: '{$1=$2="";gsub(/^::/,"")}1' <<<"$string"
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1

Upvotes: 0

The fourth bird

Reputation: 163267

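Non-capturing groups (?:...) are not supported in sed, and in its default basic regular expression syntax you have to escape the grouping and quantifier characters as \( \) \{ \} and \+ (the last one is a GNU extension).

You can match, anchored at the start of the string, two repetitions of "one or more non-colon characters followed by a colon" and replace that match with an empty string.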

sed 's/^\([^:]\+:\)\{2\}//' file

Or using sed -E for extended regexp:

sed -E 's/^([^:]+:){2}//' file

Output

text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
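
To try the extended-regexp version without creating a file, something like the following sketch could be used. The function name strip_two_fields is an assumption made for illustration only.

```shell
#!/usr/bin/env bash

# Hypothetical helper: strip the first two colon-terminated fields from stdin.
strip_two_fields() {
    # -E enables extended regular expressions, so ( ) { } + need no backslashes.
    # The s/// command rewrites each line, whereas the /regex/p form used in
    # the question can only print matching lines whole.
    sed -E 's/^([^:]+:){2}//'
}

printf '%s\n' \
    'text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1' \
    'text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1' |
    strip_two_fields
```

Because the pattern uses +, a line whose first or second field is empty (for example one starting with a colon) is left unchanged; replacing + with * would allow empty fields.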

Upvotes: 4

anubhava

Reputation: 785058

It would be much easier with cut, without any regex:

cut -d: -f3- file

text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
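
A small sketch of the same cut call fed from a here-document, in case no input file is at hand. The wrapper name from_third_field is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: keep field 3 onward, using ':' as the delimiter.
from_third_field() {
    # -d: sets the field delimiter; -f3- selects field 3 through the last
    # field, and cut rejoins the selected fields with the same delimiter.
    cut -d: -f3-
}

from_third_field <<'EOF'
text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1
EOF
```

One thing to keep in mind: cut passes lines that contain no delimiter through unchanged unless -s is given, so input without any colon is printed as-is.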

Upvotes: 4

sseLtaH

Reputation: 11207

Using sed

$ sed 's|\([^:]*:\)\{2\}\(.*\)$|\2|' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1

or

$ sed 's|\([^:]*:\)\{2\}||' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1

Upvotes: 2
