Reputation: 125
I have a directory on my computer that contains an entire database I found online for my research. The database consists of thousands of files, so I've been looking into file I/O to process them. A programmer friend suggested using bash/awk. Here is my code:
#!/usr/bin/env awk
ls -l|awk'
BEGIN {print "Now running"}
{if(NR == 17 / $1 >= 0.4 / $1 <= 2.5)
{print $1 > wavelengths.txt;
print $2 > reflectance.txt;
print $3 > standardDev.txt;}}END{print "done"}'
When I put this into my console, I'm already in the directory of the files I need to access. The data I need begins on line 17 of EVERY file. The data looks like this:
some number some number some number
some number some number some number
. . .
. . .
. . .
I want to access the data when the first column has a value of 0.4 (or approximately) and get the information up until the first column has a value of approximately 2.5. The first column represents wavelengths. I want to verify they are all the same for each file later, so I copy them into a file. The second column represents reflectance and I want this to be a separate file because later I'll take this information and build a data matrix from it. And the third column is the standard deviation of the reflectance.
The problem I am having now is that when I run this code, I get the following error: No such file or directory
Please, if anyone can tell me why I might be getting this error, or can guide me as to how to write the code for what I am trying to do... I will be so grateful.
Upvotes: 3
Views: 1405
Reputation: 54402
Excellent attempt, but this is because you should never parse the output of ls. Still, you were probably looking for ls -1, not ls -l. awk can also accept a glob of files. For example, in the desired directory, you can run:
awk -f /path/to/script.awk *
Contents of script.awk:
BEGIN {
print "Now running"
}
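# FNR is the line number within the current file; NR keeps counting across files.
# The data starts on line 17 of every file, so match from there onward.
FNR >= 17 && $1 >= 0.4 && $1 <= 2.5 {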
print $1 > "wavelengths.txt"
print $2 > "reflectance.txt"
print $3 > "standardDev.txt"
}
END {
print "Done"
}
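One detail worth checking when many files are passed at once: NR counts records across all input files, while FNR restarts at 1 for each file. The following is a minimal sketch of a per-file filter that starts at line 17. The file names a.dat and b.dat and their contents are made up for the demonstration; it is not the asker's real data.

```shell
#!/bin/sh
# Hypothetical demo data: two files, each with 16 header lines and 3 data rows.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for name in a b; do
    i=1
    while [ "$i" -le 16 ]; do
        echo "header line $i"
        i=$((i + 1))
    done > "$name.dat"
    printf '%s\n' '0.35 0.10 0.01' '0.40 0.20 0.02' '2.60 0.30 0.03' >> "$name.dat"
done

# FNR restarts at 1 for every input file, so "FNR >= 17" skips the
# 16 header lines of each file rather than only those of the first one.
result=$(awk 'FNR >= 17 && $1 >= 0.4 && $1 <= 2.5 { print FILENAME, $1, $2, $3 }' a.dat b.dat)
printf '%s\n' "$result"
```

Only the row whose first column lies between 0.4 and 2.5 should be printed for each file, prefixed with the file name so it is easy to see which file a value came from.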
Upvotes: 3
Reputation: 203532
The main problem is that you need to quote the output file names, since they are strings, not variables. Use:
print $1 > "wavelengths.txt"
instead of:
print $1 > wavelengths.txt
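To make the string-versus-variable distinction concrete, here is a small sketch, assuming a hypothetical three-column input file sample.dat. A quoted name after > is a string literal, while an unquoted name is looked up as an awk variable; the variable name out below is invented for the example and set with -v.

```shell
#!/bin/sh
# Hypothetical input: three whitespace-separated columns.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '0.40 0.20 0.02' '0.41 0.21 0.03' > sample.dat

# "wavelengths.txt" is a string literal naming the output file.
# out is an awk variable (name chosen for this example) holding a file name.
awk -v out="reflectance.txt" '{
    print $1 > "wavelengths.txt"
    print $2 > out
}' sample.dat

cat wavelengths.txt reflectance.txt
```

Within a single awk run, > truncates the file when it is first opened and later prints to the same name are appended, so each output file ends up with one line per matching input row.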
Upvotes: 3