Reputation: 87
Given this string of paths, separated by spaces:
path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1
I would like to get a string of paths, separated by spaces, only the ones starting with path/folderA/.
Output: path/folderA/fileA1 path/folderA/subFolderA/fileA2
Then remove any match of path/folderA/ from this string.
Final output: fileA1 subFolderA/fileA2
Could this be done with a single line?
Upvotes: 0
Views: 746
Reputation: 5975
With grep:
echo " $str" | grep -oP '(?<=\spath/folderA/)\S+' | xargs
-P enables Perl regexp syntax, so you can use (?<=pattern), which is a positive look-behind assertion. -o prints only the matched part, which here is \S+, a sequence of non-whitespace characters following that pattern (up to the next space, tab, newline, etc.).
grep prints each match on its own line, so you have to pipe to tr '\n' ' ', xargs, or similar to get one line.
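To illustrate the difference between the two joining options, here is a small sketch that stands in for grep's one-match-per-line output with printf (the matches function and variable names are made up for illustration). tr leaves a trailing space where the final newline was, while xargs does not:

```shell
# Stand-in for `grep -o` output: one match per line (hypothetical helper).
matches() { printf '%s\n' fileA1 subFolderA/fileA2; }

joined_tr=$(matches | tr '\n' ' ')   # the final newline becomes a trailing space
joined_xargs=$(matches | xargs)      # xargs runs echo by default; no trailing space

# Brackets make the trailing space visible.
printf '[%s]\n' "$joined_tr" "$joined_xargs"
```

Keep in mind that xargs treats quotes and backslashes in its input specially, so this assumes plain file names.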
Edit: to match only at the beginning of a path, I added \s (one whitespace character) to the look-behind and feed the input as " $str". This seemed the easier fix, because \b also matches next to a /, and (^|\s) makes grep fail with "grep: lookbehind assertion is not fixed length". So testing with this works:
> echo "$str"
path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/path/folderA/not
> echo " $str" | grep -owP '(?<=\spath/folderA/)\S+' | xargs
fileA1 subFolderA/fileA2
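For comparison, a quick sketch of what the extra \s buys you, assuming GNU grep built with -P support (the variable names are only for illustration). Without the whitespace in the look-behind, the nested path/path/folderA/ entry matches as well:

```shell
str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/path/folderA/not"

# No anchor: anything that follows "path/folderA/" matches, even mid-path.
unanchored=$(echo "$str" | grep -oP '(?<=path/folderA/)\S+' | xargs)

# Leading space on the input plus \s in the look-behind: only entries
# that start with "path/folderA/" match.
anchored=$(echo " $str" | grep -oP '(?<=\spath/folderA/)\S+' | xargs)

echo "unanchored: $unanchored"   # fileA1 subFolderA/fileA2 not
echo "anchored:   $anchored"     # fileA1 subFolderA/fileA2
```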
Upvotes: 1
Reputation: 84642
You can do it simply with awk, matching the last set of word characters in each field and outputting them, e.g.
awk '{for (i=1; i<=NF; i++) if ($i ~ /folderA/) { match($i,/\w+$/); print substr($i,RSTART,RLENGTH)}}' <<< "$path_str"
Example Use/Output
path_str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1"
awk '{for (i=1; i<=NF; i++) if ($i ~ /folderA/) { match($i,/\w+$/); print substr($i,RSTART,RLENGTH)}}' <<< "$path_str"
fileA1
fileA2
You can adjust the output format as desired. If you want the output all on one line, or if you want to use command substitution to capture the output in a new array, it's up to you.
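As one possible way to get everything on a single line, here is a sketch that has awk emit the separator itself. Two things are my own assumptions rather than part of the command above: the sep and one_line names, and the "[^/]+$" pattern, which I swapped in for \w+$ so that names containing dots or dashes are kept whole.

```shell
path_str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1"

one_line=$(printf '%s\n' "$path_str" | awk '{
    for (i = 1; i <= NF; i++)
        if ($i ~ /folderA/) {
            match($i, "[^/]+$")                              # last path component
            printf "%s%s", sep, substr($i, RSTART, RLENGTH)  # sep is empty the first time
            sep = " "
        }
}
END { print "" }')

echo "$one_line"   # fileA1 fileA2
```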
Using Bash Parameter Expansions
If you want to use parameter expansions with substring removal, you can use a simple loop and the expansion ${var##*/} to remove everything up to the final '/' from each path component, e.g.
path_str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1"
for i in $path_str; do
    [[ $i =~ folderA ]] && echo "${i##*/}"
done
fileA1
fileA2
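The loop above strips everything up to the last '/', so the subFolderA/ part is dropped. If you need to keep it, as in the output the question asks for, a variant along these lines should work. It is only a sketch: prefix, result and stripped are names I made up, and the unquoted $path_str relies on word splitting, so it assumes the paths contain no spaces or glob characters.

```shell
path_str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1"
prefix="path/folderA/"   # hypothetical variable holding the prefix to match and strip
result=""

for p in $path_str; do
    case $p in
        "$prefix"*)                                # keep only entries starting with the prefix
            stripped=${p#"$prefix"}                # remove the prefix, keep the rest
            result="${result:+$result }$stripped"  # space-join without a leading space
            ;;
    esac
done

echo "$result"   # fileA1 subFolderA/fileA2
```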
For your case the parameter expansion is likely the most efficient, as it is built into your shell and avoids spawning a separate process. However, if you had hundreds of thousands of components, I'd probably let awk handle it.
The POSIX-compliant parameter expansions with substring removal are:
${var#pattern}     Strip shortest match of pattern from front of $var
${var##pattern}    Strip longest match of pattern from front of $var
${var%pattern}     Strip shortest match of pattern from back of $var
${var%%pattern}    Strip longest match of pattern from back of $var
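To make the four forms concrete, here they are applied to one of the paths from the question (the variable names are just for illustration):

```shell
var="path/folderA/subFolderA/fileA2"

front_short=${var#*/}    # folderA/subFolderA/fileA2  (up to the first '/')
front_long=${var##*/}    # fileA2                     (up to the last '/')
back_short=${var%/*}     # path/folderA/subFolderA    (from the last '/')
back_long=${var%%/*}     # path                       (from the first '/')

printf '%s\n' "$front_short" "$front_long" "$back_short" "$back_long"
```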
Bash provides many more parameter expansions in addition to those specified by POSIX, including everything from substring replacement to character case conversion.
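A few of those Bash-only expansions, as a rough sketch (the case-conversion forms need Bash 4 or later; the variable names are made up):

```shell
var="path/folderA/fileA1"

replaced=${var/folderA/folderB}   # substring replacement: path/folderB/fileA1
upper=${var^^}                    # all uppercase:         PATH/FOLDERA/FILEA1
lower=${var,,}                    # all lowercase:         path/foldera/filea1

printf '%s\n' "$replaced" "$upper" "$lower"
```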
Let me know if you have further questions.
Upvotes: 0
Reputation: 84443
If you're starting with a string you run the risk that embedded spaces, newlines, or other problematic characters can throw things off. That's why it's usually better to work with globs or null-terminated values.
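As a sketch of the null-terminated approach (Bash-specific; the printf call merely stands in for whatever produces your paths, e.g. find -print0, and prefix/result are made-up names), note that a path with an embedded space survives intact:

```shell
prefix="path/folderA/"   # hypothetical variable holding the prefix to match and strip
result=()

# Read NUL-delimited paths so embedded spaces or newlines cannot split an entry.
while IFS= read -r -d '' p; do
    if [[ $p == "$prefix"* ]]; then
        result+=("${p#"$prefix"}")
    fi
done < <(printf '%s\0' "path/folderA/fileA1" "path/folderA/sub folder/fileA2" "path/folderB/fileB1")

printf '%s\n' "${result[@]}"
```

This keeps each result as its own array element instead of flattening everything back into one space-separated string.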
That said, you can use various builtins and expansions to get the results you want from your given example. Note that you must escape your forward slashes properly or store them in a quoted string to avoid interfering with the expansion syntax. For example:
path_str="path/folderA/fileA1 path/folderA/subFolderA/fileA2 path/folderB/fileB1"
match_str="path/folderA/"
read -ra paths <<< "$path_str"
for i in "${!paths[@]}"; do
    [[ ! "${paths[i]}" =~ ^$match_str ]] && unset 'paths[i]'
done
echo "${paths[@]/#$match_str}"
This will print:
fileA1 subFolderA/fileA2
Upvotes: 2