jnorth

Reputation: 115

Search a file line by line and use the matched values as variables in a downstream command

The file I am working with looks like this (cut and pasted from two columns):

1077551 c1ccc(cc1)n2c(nnc2SCC(=O)NC3CCCCC3)c4ccccc4O
1364513 CCn1c(nnc1SCC(=O)Nc2cccc(c2C)C)c3ccccc3N
1364529 CCn1c(nnc1SCC(=O)Nc2ccccc2Cl)c3ccccc3N
2270998 CC(C)(C)c1cc(c(c2c1nc(o2)c3ccccc3O)O)C(C)(C)C
2357441 C[C@@H]1CCc2c(sc3c2c(nc(n3)SCC(=O)Nc4ccccc4)N)C1

and my current code is:

file=./testin
while IFS= read -r line; do
    var1=$(grep -P '\d{6,8}');
    var2=$(grep -i -P '[A-Z].*');
    obabel -:"$var2" -o mol -O ./${var1%.*}.mol
done < "$file"

The idea is to match the number at the start of each line and store it as var1, then match the string of characters that follows (I'm not sure how to do this reliably, given that it can end in either a letter or a digit) and assign it to var2. Then $var1 and $var2 are passed to the "obabel" command, with the output file named after var1.

Upvotes: 1

Views: 39

Answers (1)

randomir

Reputation: 18697

Note that the read built-in (specified by POSIX) reads a line from standard input and splits it into fields itself; the field delimiters are the characters in IFS.
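As a small illustration of that splitting (a sketch only; the sample strings and the variable names first, second, rest, a and b are made up for the demo):

```sh
#!/bin/sh
# Default IFS (space, tab, newline): fields are split on runs of whitespace,
# and the last variable receives whatever is left of the line.
read -r first second rest <<'EOF'
1077551   c1ccc(cc1)O extra words
EOF
printf '%s|%s|%s\n' "$first" "$second" "$rest"

# IFS can be changed for a single read, here to split on commas instead.
IFS=, read -r a b <<'EOF'
1077551,c1ccc(cc1)O
EOF
printf '%s|%s\n' "$a" "$b"
```

The -r option keeps backslashes literal, which matters for SMILES strings, since they can contain backslashes.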

Assuming your columns are whitespace-separated (meaning you don't have to change the IFS) and you want to read the first field into var1 and the second field into var2, you can do it simply with:

#!/bin/sh

file=./testin
while read -r var1 var2 rest; do
    # var1 is field 1 (the ID), var2 is field 2 (the SMILES string);
    # rest collects any remaining fields so they don't end up in var2
    obabel -:"$var2" -o mol -O "./${var1%.*}.mol"
done <"$file"
done <"$file"
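To check the field splitting before running obabel for real, a dry run that only prints each command can help. This is a minimal sketch: plan_conversions is a hypothetical helper name, and the printed command simply mirrors the obabel invocation above.

```sh
#!/bin/sh
# Hypothetical helper: read "ID SMILES" lines from standard input and print
# the obabel command that would be run for each one, without executing it.
plan_conversions() {
    while read -r id smiles rest; do
        # Skip blank lines
        [ -n "$id" ] || continue
        printf 'obabel -:"%s" -o mol -O ./%s.mol\n' "$smiles" "$id"
    done
}

# Example with one line from the question's input file
printf '%s\n' '1364529 CCn1c(nnc1SCC(=O)Nc2ccccc2Cl)c3ccccc3N' | plan_conversions
```

Running it as plan_conversions <./testin would list one command per line of the file; once the output looks right, the printf can be swapped for the actual obabel call.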

Upvotes: 1
