Reputation: 175
My intention is to read multiple files and record the data from each in a corresponding array. The arrays will then be processed further. For example:
#!/bin/sh
for i in 0 1 2 3 4 5
do
    set -A data$i          # ksh syntax; bash would use declare -a instead
done
i=2
t=data$i
# read data from file into data2, data3, ...
read_file(){
    ifile="/tmp/proc/test/2"   # should loop over all files in the folder (2, 3, 4, ...) and save to data2, data3, ...
    j=1
    for i in $( awk -F ',' '{ print $1; }' "${ifile}" )
    do
        t[$j]=$i
        echo "${t[$j]}"
        j=$((j+1))
    done
    echo "done read_file"
}
# intention to process data2, data3, ..., but not sure how to do it
process_data(){
    input=$1
    #echo ${!$1}
    #for line in "${!input[@]}"
    #do
    #    echo "test: "
    #    eval "echo \$$line"
    #    echo ${!line}
    #done
    #}
    #process $t
}
Is it possible to do this with a shell script (bash is fine as well)? Are there other ways to do it? Any suggestions would be appreciated. Thanks.
A little update about my target: I want to read the files in a folder, mark all duplicates, and count how often each duplicate occurs across files. Then, for each file in the folder, give a summary: how many unique items, how many duplicate items, and how many times each duplicate occurs. (Duplicates should only be counted between files, not within a file.)
My original script reads all files in the folder, uses awk to find all duplicates and their occurrence counts (saved to a file, all_dup), then compares each file against all_dup (with awk again) to produce each_file_dup and each_file_uniq.
I'm thinking of speeding it up by writing the data into arrays instead of intermediate files, but I'm not sure whether that actually works. The files vary from hundreds to thousands of lines (long ints). Thanks for any suggestions.
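The awk step is roughly like this (a simplified sketch; the file glob and the all_dup name just follow the description above). It counts, for each value, how many distinct files contain it, so any count above 1 marks a cross-file duplicate:
# sketch only: count in how many different files each value appears;
# values appearing in more than one file are the cross-file duplicates
awk -F ',' '
    !(($1, FILENAME) in seen) { seen[$1, FILENAME] = 1; nfiles[$1]++ }
    END { for (v in nfiles) if (nfiles[v] > 1) print v, nfiles[v] }
' /tmp/proc/test/* > all_dup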
Upvotes: 0
Views: 62
Reputation: 15273
Reading several files into one array is trivial.
arrayName=( $( cat file1 file2 $someGlob ) )
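Note that the unquoted expansion word-splits on all whitespace. If bash 4+ is available and you want exactly one element per line, mapfile is an alternative:
mapfile -t arrayName < <(cat file1 file2)   # -t trims the trailing newline from each line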
If the file names are also valid as array names, you can use dynamic evaluation.
$: grep . ?
a:1
a:2
a:3
b:4
b:5
b:6
c:7
c:8
c:9
$: lst=(?)
$: set -x; for f in ${lst[@]}; do eval "$f=(\$(<$f))"; eval : "$f: \${${f}[@]}"; done; set +x
+ for f in ${lst[@]}
+ eval 'a=($(<a))'
++ a=($(<a))
+ eval : 'a: ${a[@]}'
++ : a: 1 2 3
+ for f in ${lst[@]}
+ eval 'b=($(<b))'
++ b=($(<b))
+ eval : 'b: ${b[@]}'
++ : b: 4 5 6
+ for f in ${lst[@]}
+ eval 'c=($(<c))'
++ c=($(<c))
+ eval : 'c: ${c[@]}'
++ : c: 7 8 9
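On bash 4.3 or newer, namerefs can do the same job without the eval quoting; a minimal sketch of the equivalent loop:
for f in "${lst[@]}"
do  declare -n arr="$f"   # arr now refers to the array named by $f
    arr=( $(<"$f") )      # the assignment lands on a, b, c in turn
done
unset -n arr              # drop the reference, leaving the arrays intact
echo "${b[1]}"            # -> 5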
Personally, I prefer a sourced temp file.
$: for f in ${lst[@]}; do echo "$f=(\$(<$f))">tmpfile; . tmpfile; done;
$: echo ${b[1]}
5
If the files all have variable-valid names in /tmp/proc/test/, then you might use something like this -
flist=( /tmp/proc/test/* )
for f in "${flist[@]}"
do  echo "${f//[/]/_}=(\$(<$f))" >| tmpfile   # turn each / into _ so the path becomes a valid name
    . tmpfile
done
This will use the name _tmp_proc_test_2 as the array for the file /tmp/proc/test/2.
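To read an element of such a dynamically named array back without another eval, bash's indirect expansion accepts a subscript inside the name, e.g. -
$: name='_tmp_proc_test_2[0]'
$: echo "${!name}"    # first value read from /tmp/proc/test/2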
Upvotes: 2