Bash substring with regular expression

In a bash script, I'd like to extract a variable substring from a given string. For example, I'd like to extract file.txt from the string:

This is the file.txt from my folder.

I tried:

var="This is the file.txt from my folder."
var=${var##'This'}
...

but I'd like to do it in a cleaner way, using expr, sed, or awk.

Thanks

Edited:

I found another way (nevertheless, the answer with the sed command is the best one for me):

var='This is the file.txt from my folder.'
front='This is the '
back=' from my folder.'
var=${var##"$front"}
var=${var%"$back"}
echo "$var"

Upvotes: 8

Views: 33659

Answers (5)

RARE Kpop Manifesto

Reputation: 2915

Using gawk:

gawk '_<($_ = RT)' RS='[^ /\0]+[.][^\0/\n ]+'

file.txt

The uninitialized variable "_" serves two different (implicit) purposes here:

  • left of <, it evaluates to the empty string ""
  • in $_, it evaluates to numeric zero, so $_ is $0

Upvotes: 0

erwin

Reputation: 748

No need to use sed or awk. Since 2004, bash has had built-in regex matching with the =~ operator.

input="This is the file.txt from my folder."
[[ $input =~ ([[:alnum:]]+\.[[:alnum:]]+) ]]
echo "${BASH_REMATCH[0]}"

Output:

file.txt
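A slightly more defensive variant of the same idea, as a sketch: it checks whether the match succeeded before reading BASH_REMATCH, and reads the capture group at index 1 (index 0 is the whole match). The function name extract_filename and the extra `_-` in the character class are my own assumptions, not part of the answer above.

```shell
#!/usr/bin/env bash
# extract_filename is a hypothetical helper name used only for illustration.
extract_filename() {
    local input=$1
    # Storing the pattern in a variable avoids quoting surprises inside [[ ]].
    local re='([[:alnum:]_-]+\.[[:alnum:]]+)'
    if [[ $input =~ $re ]]; then
        # BASH_REMATCH[1] holds the text matched by the first ( ) group.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        # No match: signal failure instead of printing a stale value.
        return 1
    fi
}

extract_filename "This is the file.txt from my folder."
```

The return status lets the caller write something like `if name=$(extract_filename "$line"); then ...`.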

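If you're not comfortable writing regular expressions, it's easier to build them interactively with regex101. Keep in mind that bash's =~ uses POSIX extended regular expressions (ERE), not PCRE, so when testing with regex101's default PCRE flavor, avoid PCRE-only constructs such as \d or lookarounds.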


Upvotes: 0

Daniel S.

Reputation: 6650

The following solution uses sed with s/ (substitution) to remove the leading and trailing parts:

echo "This is the file.txt from my folder." | sed 's/^This is the \(.*\) from my folder\.$/\1/'

Output:

file.txt

The \( and \) enclose the part which we want to keep. This is called a group. Because it's the first (and only) group which we use in this expression, it's group 1. We later reference this group inside of the replacement string with \1.

The ^ and $ anchors make sure that the complete string is matched. This only matters in the special case where the filename itself contains either "from my folder." or "This is the".
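One thing to watch for with the substitution approach: when the line does not match, sed prints it unchanged. A sketch of a stricter variant, assuming your sed supports -E (GNU and BSD sed both do): -n suppresses the default output and the trailing p prints only lines where the substitution succeeded. The variable names are mine.

```shell
line='This is the file.txt from my folder.'

# -E: extended regex, so the group is written ( ) instead of \( \).
# -n plus the trailing p flag: print only when the substitution succeeded.
name=$(printf '%s\n' "$line" | sed -nE 's/^This is the (.*) from my folder\.$/\1/p')
echo "$name"

# A line that does not fit the pattern yields an empty result
# instead of the original text.
unmatched=$(printf '%s\n' 'unrelated text' | sed -nE 's/^This is the (.*) from my folder\.$/\1/p')
```

This way an empty $name tells you the input did not have the expected shape.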

Upvotes: 17

Dániel

Reputation: 58

If 'file.txt' is a fixed string, and won't change, then you can do it like this:

var="This is the file.txt from my folder"

Notice that you don't need to echo the string into the variable; you just put it on the right side of the '=' assignment.

echo "$var" | sed -e 's/^.*\(file\.txt\).*$/\1/'

Depending on your sed(1) version, you can drop the escaping of the parentheses if it has the -r (extended regexp) option, also spelled -E.

If 'file.txt' changes, then you can create a pattern on a best-effort basis, like:

echo "$var" | sed -e 's/^.* \([^ ]\+\.[^ ]\+\) .*$/\1/'
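Note that \+ in a basic regular expression is a GNU extension. If the script might run with a different sed, a sketch of the same best-effort pattern in portable form would spell "one or more non-space characters" as [^ ][^ ]* (the variable name `name` is just for illustration):

```shell
var="This is the file.txt from my folder"

# [^ ][^ ]* is the portable BRE spelling of "one or more non-space characters".
# The group must start after a space and be followed by a space,
# so it captures a whole word that contains a dot.
name=$(printf '%s\n' "$var" | sed 's/^.* \([^ ][^ ]*\.[^ ][^ ]*\) .*$/\1/')
echo "$name"
```

Like the original, this only works when the filename is surrounded by spaces, so a filename at the very start or end of the line would need a different pattern.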

Upvotes: 1

EverythingRightPlace

Reputation: 1197

You could try grep:

var=$(echo "This is the file.txt from my folder." | grep -o 'file\.txt')
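If the filename is not known in advance, grep -o can also be given a pattern rather than a fixed name. A rough sketch, assuming a grep that supports -o (GNU and BSD grep do) and assuming a "filename" means a run of non-space characters with a dot in the middle:

```shell
input="This is the file.txt from my folder."

# -E: extended regex; -o: print each match on its own line.
# head -n 1 keeps only the first match if there are several.
name=$(printf '%s\n' "$input" | grep -Eo '[^ ]+\.[^ ]+' | head -n 1)
echo "$name"
```

The trailing "folder." is not picked up because the pattern requires at least one character after the dot.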

Upvotes: 1
