Reputation: 25
A list of more than 100,000 file names (without the .ABC and .DEF extensions!) needs to be copied. At the moment I'm using a while loop combined with the find command in the /opt/project/ directory to generate the full path of each file so I can copy them later.
while read -r LINE; do find /opt/project/TOP3RST_0_/ -name "$LINE*"; done < TOP3RST_0_file.list > PATH_TOP3RST_0_file.list
This process is far too slow. I wonder if I can use awk, sed or something else to build the full path from the file list. Checking whether each file actually exists would be a bonus.
From this:
BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002
BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002
BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002
BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002
BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002
The expected output paths should look like this:
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002.DEF
Lastly, I need to calculate the time gap between the two timestamps in a file name:
BT_SUPR_TOP3RST_0__20200716T003457_20200716T004736_0002.ABC
20200716T003457 = 2020-07-16 00:34:57
20200716T004736 = 2020-07-16 00:47:36
I reckon something like datediff can calculate the gap?
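One possible way to get the gap is to convert both stamps to epoch seconds and subtract. This is only a sketch: it assumes GNU date (for the -d option) and treats the stamps as UTC, and the function names to_epoch and gap_seconds are made up for illustration.

```shell
# Assumption: GNU date is available (BSD/macOS date has no -d option).
# to_epoch and gap_seconds are hypothetical helper names.

# Convert a compact stamp such as 20200716T003457 to epoch seconds (UTC).
to_epoch() {
    iso=$(printf '%s\n' "$1" |
        sed 's/^\(....\)\(..\)\(..\)T\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/')
    date -u -d "$iso" +%s
}

# Print the gap in seconds between the two stamps embedded in a file name.
gap_seconds() {
    gs_name=${1%.*}               # strip a trailing extension, if any
    gs_stamps=${gs_name#*__}      # keep what follows the first "__"
    gs_start=${gs_stamps%%_*}     # first stamp
    gs_rest=${gs_stamps#*_}
    gs_end=${gs_rest%%_*}         # second stamp
    echo $(( $(to_epoch "$gs_end") - $(to_epoch "$gs_start") ))
}
```

With the two stamps spelled out above, gap_seconds BT_SUPR_TOP3RST_0__20200716T003457_20200716T004736_0002.ABC prints 759, i.e. 12 minutes 39 seconds.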
Upvotes: 0
Views: 130
Reputation: 140970
The following sed line may help you get started:
$ sed 's@.*__\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*@/opt/project/TOP3RST_0_/\1/\2/\3/&/&@; s/.*/&.ABC\n&.DEF/' <<<'BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002'
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.DEF
The whole line is matched, with the year, month and day saved in backreferences, and the path is then built from them. A second s command outputs two lines, one per suffix (\n in the replacement is a GNU sed feature). To learn regexes I recommend the regex crosswords available online. This sed introduction is great, though only the s command is used here. FAQ: & stands for the whole matched text, and the s command accepts any character as its delimiter (@ here, so the slashes in the path need no escaping).
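To run this over the whole list and also check that each file exists, something along these lines might work. It is a sketch rather than a tested solution for your tree: build_paths and check_paths are names I made up, ROOT is assumed to contain no @ or & characters, and the suffixes are appended in a shell loop instead of with \n so that it does not depend on GNU sed.

```shell
# ROOT is the directory from the question; override it if yours differs.
ROOT=${ROOT:-/opt/project/TOP3RST_0_}

# Read base names on stdin, write two full paths per name on stdout.
# Lines that do not contain "__" followed by 8 digits are skipped.
build_paths() {
    sed -n "s@^.*__\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*\$@$ROOT/\1/\2/\3/&/&@p" |
    while IFS= read -r base; do
        printf '%s\n' "$base.ABC" "$base.DEF"
    done
}

# Read paths on stdin; existing ones go to stdout, missing ones to stderr.
check_paths() {
    while IFS= read -r path; do
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        else
            printf 'missing: %s\n' "$path" >&2
        fi
    done
}
```

It could then be used as build_paths < TOP3RST_0_file.list | check_paths > PATH_TOP3RST_0_file.list 2> missing.list. Since sed runs once for the whole list and nothing walks the directory tree, this should be much faster than one find per name.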
Upvotes: 1