Reputation: 27

unix delete rows from multiple files using input from another file

I have multiple (1086) .dat files, and each file has 5 columns and 6384 lines. I also have a single file named "info.txt" which contains 2 columns and 6883 lines. The first column gives the line numbers (to delete in the .dat files) and the 2nd column gives a number.

etc... I need to read info.txt and find every line number whose value in the 2nd column is less than 300 (so lines 2 and 3 in the above example). Then I need to feed these line numbers to sed, awk or grep and delete those lines from each .dat file. (So I will delete the 2nd and 3rd rows of every .dat file in the above example.)

A more general form of the question would be (I suppose): how to read numbers from a file as input, then use them as the row numbers to delete from multiple files.

I am using bash but ksh help is also fine.

Upvotes: 0

Answers (4)

Vijay

Reputation: 67319

This should create new .dat files named after the originals with _new.dat appended (e.g. foo.dat becomes foo.dat_new.dat), but I haven't tested it:

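awk 'FNR==NR { if ($2 < 300) a[$1]; next }                        # info.txt: remember lines to delete
     FNR==1  { if (out) close(out); out = FILENAME "_new.dat" }   # new input file: switch output file
     !(FNR in a) { print > out }' info.txt *.dat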
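To try the idea on throwaway data before touching the real files, something like the following sketch may help. The sample file names (a.dat, b.dat) and values are made up for illustration, and the awk program is a variant of the approach above that keeps the pattern and its action on one line and closes each output file when the next input file starts.

```shell
# Throwaway sandbox; all file names and values below are hypothetical samples.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' '1 500' '2 120' '3 299' '4 300' > info.txt
printf '%s\n' 'a1' 'a2' 'a3' 'a4' > a.dat
printf '%s\n' 'b1' 'b2' 'b3' 'b4' > b.dat

# First pass (info.txt): remember line numbers whose 2nd column is < 300.
# Later passes (.dat files): copy every line whose number was not remembered.
awk 'FNR==NR { if ($2 < 300) del[$1]; next }
     FNR==1  { if (out) close(out); out = FILENAME "_new.dat" }
     !(FNR in del) { print > out }' info.txt a.dat b.dat

# Inspect the results: lines 2 and 3 should be gone from both copies.
cat a.dat_new.dat b.dat_new.dat
```

The original .dat files are left untouched, so the output can be compared against them before anything is replaced.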

Upvotes: 0

potong

Reputation: 58578

This might work for you (GNU sed):

sed -rn 's/^(\S+)\s*([1-9]|[1-9][0-9]|[12][0-9][0-9])$/\1d/p' info.txt | 
sed -i -f - *.dat

This builds a script of the lines to delete from the info.txt file and then applies it to the .dat files.

N.B. the regexp matches integers from 1 to 299 only, as per the OP's request; a value of 0 or a non-integer value would not be selected.
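To check what would be deleted before modifying any file, the first half of the pipeline can be run on its own. A small sketch with made-up sample values (the file name info_sample.txt is hypothetical); it uses the same regular expression and assumes GNU sed for -r, \S and \s:

```shell
# Hypothetical sample of info.txt: line number, then value.
printf '%s\n' '1 500' '2 120' '3 7' '4 300' > info_sample.txt

# Capture the generated sed script instead of piping it into the second sed.
script=$(sed -rn 's/^(\S+)\s*([1-9]|[1-9][0-9]|[12][0-9][0-9])$/\1d/p' info_sample.txt)

# Each output line is one sed command of the form "<line number>d".
printf '%s\n' "$script"
```

For this sample only rows 2 and 3 have values in the 1 to 299 range, so the script consists of the two commands 2d and 3d.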

Upvotes: 1

tripleee

Reputation: 189937

sed -i "$(awk '$2 < 300 { print $1 "d" }' info.txt)" *.dat

The Awk script creates a simple sed script to delete the selected lines; the script is run on all the *.dat files.

(If your sed lacks the -i option, you will need to write to a temporary file in a loop. On OSX and some *BSD you need -i "" with an empty argument.)
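If sed has no -i option, the temporary-file loop mentioned above might look like this sketch. The sample data and the .tmp suffix are arbitrary choices made for illustration:

```shell
# Throwaway sandbox with hypothetical sample data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' '1 500' '2 120' '3 299' '4 300' > info.txt
printf '%s\n' 'a1' 'a2' 'a3' 'a4' > a.dat
printf '%s\n' 'b1' 'b2' 'b3' 'b4' > b.dat

# Build the sed script once: one "<line>d" command per row with value < 300.
script=$(awk '$2 < 300 { print $1 "d" }' info.txt)

# Replacement for sed -i: write to a temp file, then move it over the original.
for f in *.dat; do
  sed "$script" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat a.dat b.dat
```

The mv only runs if sed succeeded, so a failing sed leaves the original file in place.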

Upvotes: 1

NeronLeVelu

Reputation: 10039

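# create action list: one "<line>b" sed command per line to delete
ActionReq=""
while read -r LineRef Index
 do
   if [ "${Index}" -lt 300 ]
    then
      ActionReq="${ActionReq}${LineRef}b
"
    fi
 done < info.txt

# apply action on files (adjust the glob to select your .dat files)
for EachFile in *.dat
 do
   # with -n, "b" skips the final "p", so the listed lines are not printed
   sed -i -n -e "${ActionReq}p" "${EachFile}"
 done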

(Not tested, no Linux here.) sed on its own is limited when it comes to the numeric test on the second value (the 300 threshold); awk is better suited to that comparison. I use sed in the second loop to avoid reading and writing each file once for every line to delete. I think the second loop could be avoided by giving the list of files directly to sed instead of going file by file.
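The last point can be sketched as follows. With GNU sed, -i treats each file on the command line separately and line numbers restart for every file, so one sed invocation can replace the loop. This assumes GNU sed; the sample files and values are made up for illustration.

```shell
# Throwaway sandbox with hypothetical sample data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' '1 500' '2 120' '3 299' '4 300' > info.txt
printf '%s\n' 'a1' 'a2' 'a3' 'a4' > a.dat
printf '%s\n' 'b1' 'b2' 'b3' 'b4' > b.dat

# Build one "<line>b" command per line to delete ("b" skips the final "p").
ActionReq=""
while read -r LineRef Index; do
  if [ "$Index" -lt 300 ]; then
    ActionReq="${ActionReq}${LineRef}b
"
  fi
done < info.txt

# One sed call for all files; GNU sed -i restarts line numbers per file.
sed -i -n -e "${ActionReq}p" *.dat

cat a.dat b.dat
```

Reading info.txt through a redirection rather than a pipe keeps the while loop in the current shell, so ActionReq is still set when sed runs.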

Upvotes: 0
