Reputation: 3
This is my script:
SourceFile='/root/Document/Source/'
FND=$(find $SourceFile. -regextype posix-regex -iregex "^.*/ABCDEF_555_[0-9]{5}\.txt$")
echo $FND
I've tried using awk but haven't gotten perfect results.
File names:
ABCDEF_555_12345.txt
ABCDEF_555_54321.txt
ABCDEF_555_11223.txt
BEFORE
File Content from ABCDEF_555_12345.txt:
no|name|address|pos_code
1|rick|ABC|12342
2|rock|ABC|12342
3|Robert|DEF|54321
File Content from ABCDEF_555_54321.txt:
no|id|name|city
1|0101|RIZKI|JKT
2|0102|LALA|SMG
3|0302|ROY|YGY
I want to append a column containing the file name to every row from the 2nd onward, and append a name_file column header to the first row. The change should be written back to the original files.
AFTER
file: ABCDEF_555_12345.txt
no|name|address|pos_code|name_file
1|rick|ABC|12342|ABCDEF_555_12345.txt
2|rock|ABC|12342|ABCDEF_555_12345.txt
3|Robert|DEF|54321|ABCDEF_555_12345.txt
file: ABCDEF_555_54321.txt
no|id|name|city|name_file
1|0101|RIZKI|JKT|ABCDEF_555_54321.txt
2|0102|LALA|SMG|ABCDEF_555_54321.txt
3|0302|ROY|YGY|ABCDEF_555_54321.txt
Please point me toward a solution. Thanks :)
Upvotes: 0
Views: 6152
Reputation: 3079
The best solution is to use awk.
If it's the first line (NR == 1), print the line and append |name_file.
For all other lines, print the line and append the filename using the FILENAME variable:
awk 'NR == 1 {print $0 "|name_file"; next;}{print $0 "|" FILENAME;}' foo.txt
You can also use it with multiple files, merging them into a single output with one header:
find . -iname "*.txt" -print0 | xargs -0 awk '
NR == 1 {print $0 "|name_file"; next;}
FNR == 1 {next;} # Skip header of next files
{print $0 "|" FILENAME;}'
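Neither command above changes the files themselves, which the question also asks for. A minimal sketch of one way to do that with any awk is to write each result to a temporary file and move it over the original. The demo directory and sample files are created here purely for illustration, and a shell glob stands in for the find expression from the question:

```shell
#!/bin/sh
# Sketch only: "$demo" and its files are hypothetical stand-ins for the real source directory.
set -eu
demo=$(mktemp -d)
printf '%s\n' 'no|name|address|pos_code' '1|rick|ABC|12342' '2|rock|ABC|12342' \
    > "$demo/ABCDEF_555_12345.txt"
printf '%s\n' 'no|id|name|city' '1|0101|RIZKI|JKT' \
    > "$demo/ABCDEF_555_54321.txt"

for f in "$demo"/ABCDEF_555_[0-9][0-9][0-9][0-9][0-9].txt; do
    [ -f "$f" ] || continue              # skip if the glob matched nothing
    tmp=$(mktemp "$f.XXXXXX")            # temporary file next to the original
    # One awk run per file, so NR == 1 is always that file's header line.
    awk -v fname="$(basename "$f")" '
        NR == 1 { print $0 "|name_file"; next }
        { print $0 "|" fname }
    ' "$f" > "$tmp" && mv "$tmp" "$f"    # replace the original only if awk succeeded
done

cat "$demo/ABCDEF_555_12345.txt"
```

Passing the base name in with -v keeps directory components out of the new column, whereas FILENAME holds whatever path awk was given. Note that mv replaces the file, so the result carries the permissions of the temporary file rather than those of the original.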
My first solution used the paste command.
Paste allows you to concatenate files horizontally (compared to cat, which concatenates vertically).
To achieve this with paste:
Concatenate the first line of the file (head -n1 foo.txt) with the column header (echo "name_file"). The paste command accepts the -d flag to define the separator between columns.
Concatenate the remaining lines (tail -n+2 foo.txt) with as many copies of foo.txt as required (use a for loop, computing the number of lines to fill).
The solution looks like this:
paste -d'|' <(head -n1 foo.txt) <(echo "name_file")
paste -d'|' <(tail -n+2 foo.txt) <(for i in $(seq $(tail -n+2 foo.txt | wc -l)); do echo "foo.txt"; done)
no|name|address|pos_code|name_file
1|rick|ABC|12342|foo.txt
2|rock|ABC|12342|foo.txt
3|Robert|DEF|54321|foo.txt
However, the awk solution should be preferred because it is clearer (a single call, fewer process substitutions and so on) and faster, as these timings on a 100004-line file show:
$ wc -l foo.txt
100004 foo.txt
$ time ./awk.sh >/dev/null
./awk.sh > /dev/null 0,03s user 0,01s system 98% cpu 0,041 total
$ time ./paste.sh >/dev/null
./paste.sh > /dev/null 0,38s user 0,33s system 154% cpu 0,459 total
Upvotes: 3
Reputation: 4688
Using find and GNU awk:
My find implementation doesn't have -regextype posix-regex, so I used posix-extended instead, but since you got the correct results it should be fine.
srcdir='/root/Document/Source/'
find "$srcdir" -regextype posix-regex -iregex ".*/ABCDEF_555_[0-9]{5}\.txt$" \
  -exec awk -i inplace -v fname="{}" '
BEGIN{ OFS=FS="|"; sub(/.*\//, "", fname) } # set field separators / extract filename
{ $(NF+1)=NR==1 ? "name_file" : fname; print } # add header field / filename, print line
' {} \;
The pathname found by find is passed to awk in the variable fname. In the BEGIN block, the filename is extracted from the path.
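To see the filename extraction in isolation, here is a small sketch that applies the same sub() call to a sample path. The path is the directory from the question combined with one of its file names, used only as a string, so nothing needs to exist on disk:

```shell
# The greedy .* matches up to the last slash, so sub() strips the whole directory part.
awk -v fname='/root/Document/Source/ABCDEF_555_12345.txt' '
    BEGIN { sub(/.*\//, "", fname); print fname }
'
# prints: ABCDEF_555_12345.txt
```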
The files are modified in place, so make sure you back up your files before running this.
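One possible way to take that backup is sketched below, assuming a plain copy next to each file is acceptable. The .bak suffix and the temporary demo directory are illustrative choices, and a portable -name glob replaces the -regextype/-iregex test from the command above (so, unlike -iregex, it is case-sensitive):

```shell
#!/bin/sh
# Sketch only: "$srcdir" here is a throwaway stand-in for /root/Document/Source/.
set -eu
srcdir=$(mktemp -d)
printf '%s\n' 'no|id|name|city' '1|0101|RIZKI|JKT' > "$srcdir/ABCDEF_555_54321.txt"

# Copy every matching file to <name>.bak, preserving mode and timestamps (-p).
find "$srcdir" -type f -name 'ABCDEF_555_[0-9][0-9][0-9][0-9][0-9].txt' \
    -exec sh -c 'cp -p "$1" "$1.bak"' sh {} \;

ls "$srcdir"
```

Because the copies end in .bak, they do not match the pattern and are left alone when the in-place command is run afterwards.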
Upvotes: 0