Delete lines in a file based on first row

Question

I am trying to process a whole series of text files (actually .out files, but they behave like space-delimited text files). I want to delete certain lines from each file, based on the value in one of the columns named in the first (header) row.

So for example:

ID VAR1 VAR2
1 8 9
2 4 1
3 3 2

I want to delete all the lines with VAR1 < 0.5.

I found a way to do this manually in Excel, but with 350+ files this is going to be a long night. There must be a more efficient way to do this. I have already worked on this set of files in the terminal (OS X).

malfunctioning · Accepted Answer

This is a typical job for awk, the venerable language for file manipulation.

What awk does is test each line in a file against a condition and run an action on the lines that match. It also makes it easy to split a line into columns. In this case, you want to test whether the second column is less than 0.5 and, if so, not print that line. Otherwise, print the line (in effect, this removes the lines for which the variable is less than 0.5).

Your variable is in column 2, which in awk is referred to as $2. Each full line is referred to by the variable $0.
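As a quick illustration of these field variables, here is a minimal sketch that feeds the sample data from the question to awk and prints only the second column. The variable name cols is just a placeholder for this example.

```shell
# Feed the sample rows to awk and collect the second column ($2) of each line.
cols=$(printf 'ID VAR1 VAR2\n1 8 9\n2 4 1\n3 3 2\n' | awk '{ print $2 }')

# Show the extracted column, one value per line (header name first).
echo "$cols"
```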

So you would do something like this:

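# Keep the header row (first line of each file) and every line
# whose second column is at least 0.5; drop everything else.
FNR == 1 || $2 >= 0.5 { print $0 }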

Or something like that; I haven't used awk for a while. The code above is an awk script. Run it on your file and redirect the output to a new file, which will then contain only the lines where the variable is at least 0.5.
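For the 350+ files, something along these lines might work as a starting point. It is only a sketch: the function name filter_out_files and the output directory name are made up for this example, and it assumes the values use a decimal point (0.5, not 0,5) and that the first line of every file is the header.

```shell
# filter_out_files SRC_DIR DEST_DIR   (hypothetical helper, not a standard tool)
# Writes a filtered copy of every .out file in SRC_DIR into DEST_DIR.
filter_out_files() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*.out; do
        # If no .out file matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        # Keep the header (FNR == 1) and lines whose second column is >= 0.5.
        awk 'FNR == 1 || $2 >= 0.5' "$f" > "$dest/$(basename "$f")"
    done
}
```

Calling filter_out_files . filtered from the directory that holds the files would put the results in a new "filtered" subdirectory. Writing to a separate directory leaves the originals untouched, so you can compare a few files before deleting anything.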
