Reputation: 1078
I'm sorry if this question has been asked before; I just didn't know how to word it as a search query.
I have a set of folders that look like this:
Brain - Amygdala/ Brain - Spinal cord (cervical c-1)/ Skin - Sun Exposed (Lower leg)/
Brain - Caudate (basal ganglia)/ Lung/ Whole Blood/
I also have a set of files that look like this:
Brain_Amygdala.v7.covariates_output.txt Skin_Not_Sun_Exposed_Suprapubic.v7.covariates_output.txt
Brain_Caudate_basal_ganglia.v7.covariates_output.txt Skin_Sun_Exposed_Lower_leg.v7.covariates_output.txt
Brain_Spinal_cord_cervical_c-1.v7.covariates_output.txt Whole_Blood.v7.covariates_output.txt
As you can see, the file names do not exactly match the directory names. For example, Brain_Amygdala.v7.covariates_output.txt is not identical to Brain - Amygdala/. Even if we were to excise the tissue name from the covariates file, Brain_Amygdala is formatted differently from its corresponding folder.
The same goes for Whole Blood/. It is different from Whole_Blood.v7.covariates_output.txt, even if you were to isolate the tissue name Whole_Blood from the covariates file.
What I want to do, however, is move each of these tissue files to its corresponding folder. The covariate files are named after the tissue, up to the first dot (.) in the file name, with the words separated by underscores (_). My idea was to extract the words leading up to the first . of the file name so that I can easily move the file to its corresponding folder.
e.g.
Brain_Amygdala.v7.covariates_output.txt
-> Brain*Amygdala
[mv]-> Brain*Amygdala/
a) I'm not sure how to isolate the first words of a file name, leading up to the first . in the name.
b) if I were to do that, I don't know how to insert a wildcard in between each word and match that to the corresponding folder.
However, I am completely open to other ways of doing something like this.
Upvotes: 2
Views: 112
Reputation: 46823
Not a full answer, but it should address some of your concerns:
a) To isolate the first word of a string, leading up to the first ., use parameter expansion:
string=Brain_Amygdala.v7.covariates_output.txt
until_dot=${string%%.*}
echo "$until_dot"
will output Brain_Amygdala (which we saved in the variable until_dot).
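For comparison, here is a small sketch of how the doubled %% differs from a single %, using the same example file name. The variable names longest and shortest are only for illustration.

```bash
string=Brain_Amygdala.v7.covariates_output.txt

# %%.* removes the LONGEST suffix matching ".*",
# i.e. everything from the first dot onwards
longest=${string%%.*}

# %.* removes the SHORTEST suffix matching ".*",
# i.e. only the last extension
shortest=${string%.*}

echo "$longest"    # Brain_Amygdala
echo "$shortest"   # Brain_Amygdala.v7.covariates_output
```

Since the tissue name ends at the first dot, %% is the form you want here.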
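b) You may want to use the ${parameter/pattern/string} parameter expansion. Doubling the first slash, as in ${parameter//pattern/string}, replaces every match instead of only the first one: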
# Replace all non-alphabetic characters by the glob *
glob_pattern=${until_dot//[^[:alpha:]]/*}
echo "$glob_pattern"
will output (with the same variables as above) Brain*Amygdala
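To see what this produces for some of the other file names in the question, steps a) and b) can be wrapped in a small function. This is only a sketch: to_glob is a hypothetical helper name, not something bash provides.

```bash
# to_glob: hypothetical helper that turns a covariates file name
# into a glob pattern for the tissue directory
to_glob() {
    local name=${1%%.*}                       # tissue part, up to the first dot
    printf '%s\n' "${name//[^[:alpha:]]/*}"   # every non-letter becomes *
}

to_glob Brain_Amygdala.v7.covariates_output.txt                   # Brain*Amygdala
to_glob Whole_Blood.v7.covariates_output.txt                      # Whole*Blood
to_glob Brain_Spinal_cord_cervical_c-1.v7.covariates_output.txt   # Brain*Spinal*cord*cervical*c**
```

Note that digits and hyphens are replaced too, so c-1 becomes c**. In an ordinary glob, two adjacent stars match the same things as one.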
c) To use all of this: it's probably a good idea to determine the possible targets first, and do some basic checks:
# Use nullglob so that a non-matching glob expands to nothing
shopt -s nullglob
# DO NOT USE QUOTES IN THE FOLLOWING EXPANSION:
# the variable is actually a glob!
# The trailing / restricts the matches to directories.
# Could also do dirs=( $glob_pattern*/ ) to allow trailing characters such as ")"
dirs=( $glob_pattern/ )
# Now check how many matches there are:
if ((${#dirs[@]} == 0)); then
    echo >&2 "No matches for $glob_pattern"
elif ((${#dirs[@]} > 1)); then
    echo >&2 "More than one match for $glob_pattern: ${dirs[*]}"
else
    echo "All good!"
    # Remove the echo to actually perform the move
    echo mv "$string" "${dirs[0]}"
fi
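Putting a), b) and c) together, a loop over all the covariates files could look roughly like the sketch below. It makes a few assumptions: it is run from the directory that contains both the files and the tissue folders, every file ends in .v7.covariates_output.txt as in the question, and the pattern gets a trailing * so that folder names ending in ")" can still match. The function name plan_moves is hypothetical.

```bash
# Non-matching globs expand to nothing instead of staying literal
shopt -s nullglob

# plan_moves: hypothetical name; prints the mv commands as a dry run
plan_moves() {
    local file tissue glob_pattern dirs
    for file in *.v7.covariates_output.txt; do
        tissue=${file%%.*}                       # text before the first dot
        glob_pattern=${tissue//[^[:alpha:]]/*}   # non-letters become *
        # Unquoted on purpose: the variable is a glob.
        # The trailing / restricts the matches to directories.
        dirs=( $glob_pattern*/ )
        if (( ${#dirs[@]} == 1 )); then
            # Remove the echo to actually perform the move
            echo mv "$file" "${dirs[0]}"
        else
            echo >&2 "Skipping $file: ${#dirs[@]} matches for $glob_pattern*/"
        fi
    done
}

plan_moves
```

Files with zero or several candidate folders are only reported on standard error, so you can handle those by hand. The echoed lines are not shell-quoted, so read them as a preview rather than pasting them back into the shell.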
I don't know how well your data will actually conform to these patterns, but I hope this answers some of your questions! To learn more about parameter expansions, do read (and experiment with) the reference I linked.
Upvotes: 2