Reputation: 1572
I really don't know whether awk is the appropriate tool for this task; maybe something in Python would be better. Anyway, I thought I would ask here first whether it is feasible. Here we go:
Data:
###offspr84 175177 200172 312312 310326 338342 252240 226210 113129 223264
male28 197175 172200 308312 310338 262338 256252 190226 113129 223219
female13 197177 172172 312308 318326 342350 240248 210218 129113 267247
offspr85 181177 192160 320312 290362 358330 238238 214178 133129 263223
male65 197181 176192 320268 322286 358330 238244 206214 137133 267263
female17 181177 160172 280312 362346 350326 230238 126178 129129 223167
So basically I need to print the first field ($1) together with each field that an offspring shares with one of its parents: $9 in the first record, and $2 and $6 in the second record.
Output file:
offspr84 113129
male28 113129
offspr85 181177
female17 181177
offspr85 358330
male65 358330
Any hint on how I could accomplish that?
Thanks!
Upvotes: 0
Views: 571
Reputation: 246764
awk '
    /^offspr/ {
        for (i=1; i<=NF; i++) {
            offspr[i] = $i
        }
        next
    }
    {
        for (i=2; i<=NF; i++) {
            if ($i == offspr[i]) {
                print offspr[1] " " offspr[i]
                print $1 " " $i
                print ""
                break
            }
        }
    }
'
Upvotes: 0
Reputation: 134
This code will produce the output you want. It may not be the best way, but it seems to work as expected.
#data = [
#'offspr84 175177 200172 312312 310326 338342 252240 226210 113129 223264',
#'male28 197175 172200 308312 310338 262338 256252 190226 113129 223219',
#'female13 197177 172172 312308 318326 342350 240248 210218 129113 267247']
data = [
    'offspr85 181177 192160 320312 290362 358330 238238 214178 133129 263223',
    'male65 197181 176192 320268 322286 358330 238244 206214 137133 267263',
    'female17 181177 160172 280312 362346 350326 230238 126178 129129 223167' ]
# Split each line into a list of fields (Python 3 print syntax is used below).
for i, line in enumerate(data):
    data[i] = line.split(' ')
# data[0] is the offspring; data[1] and data[2] are the parents.
# Note: this checks whether the value occurs anywhere in the parent line,
# not necessarily in the same column.
for item in data[0]:
    if data[1].count(item) > 0:
        print(data[0][0], item)
        print(data[1][0], item)
    if data[2].count(item) > 0:
        print(data[0][0], item)
        print(data[2][0], item)
Update:
With a nested list, to process both records at once:
datas = [[
    'offspr85 181177 192160 320312 290362 358330 238238 214178 133129 263223',
    'male65 197181 176192 320268 322286 358330 238244 206214 137133 267263',
    'female17 181177 160172 280312 362346 350326 230238 126178 129129 223167' ],
    [
    'offspr84 175177 200172 312312 310326 338342 252240 226210 113129 223264',
    'male28 197175 172200 308312 310338 262338 256252 190226 113129 223219',
    'female13 197177 172172 312308 318326 342350 240248 210218 129113 267247']
]
# Split every line of every record into a list of fields.
for data in datas:
    for i, line in enumerate(data):
        data[i] = line.split(' ')
# For each record, report offspring values that also occur in a parent line.
for data in datas:
    for item in data[0]:
        if data[1].count(item) > 0:
            print(data[0][0], item)
            print(data[1][0], item)
        if data[2].count(item) > 0:
            print(data[0][0], item)
            print(data[2][0], item)
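A possible variation on the code above, offered only as a sketch: it compares fields column by column (so a value only counts as a match when it sits in the same position in the offspring and the parent, which is what the expected output in the question suggests) and it handles any number of records in one pass. The function name `matching_fields` is hypothetical, and the code assumes that every record starts with a line whose name begins with `offspr` and that leading `#` characters can simply be ignored.

```python
def matching_fields(lines):
    """Yield (name, value) pairs for every field that a parent shares
    with the preceding offspring line in the same column."""
    offspring = None
    for line in lines:
        # Assumption: leading '#' characters are separators, not data.
        fields = line.lstrip('#').split()
        if not fields:
            continue  # skip blank or separator-only lines
        if fields[0].startswith('offspr'):
            offspring = fields  # start of a new record
            continue
        if offspring is None:
            continue  # parent line before any offspring line: ignore
        # Compare column by column, skipping the name in field 0.
        for child_value, parent_value in zip(offspring[1:], fields[1:]):
            if child_value == parent_value:
                yield offspring[0], child_value
                yield fields[0], parent_value


sample = [
    '###offspr84 175177 200172 312312 310326 338342 252240 226210 113129 223264',
    'male28 197175 172200 308312 310338 262338 256252 190226 113129 223219',
    'female13 197177 172172 312308 318326 342350 240248 210218 129113 267247',
    'offspr85 181177 192160 320312 290362 358330 238238 214178 133129 263223',
    'male65 197181 176192 320268 322286 358330 238244 206214 137133 267263',
    'female17 181177 160172 280312 362346 350326 230238 126178 129129 223167',
]

for name, value in matching_fields(sample):
    print(name, value)
```

To run it on a file instead of the in-memory sample, an open file object can be passed in place of `sample`, since the function only iterates over lines. Matches are reported in file order, so for the second record the male65 pair is printed before the female17 pair.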
Upvotes: 1
Reputation: 195039
Try this awk code:
awk '/###/ { i++; next }
i == 1 {
    if ($0 ~ /offspr84/) {
        a = $9; n = $1; next
    }
    if ($9 == a) { print n, a; print $1, $9 }
}
i == 2 {
    if ($0 ~ /offspr85/) {
        m = $1; p = $2; q = $6; next
    }
    if ($2 == p) { print m, p; print $1, p }
    if ($6 == q) { print m, q; print $1, q }
}' yourFile
Upvotes: 0
Reputation: 113
I'm not entirely sure how you want the matching to work, but assuming the same pattern is applied to all fields, you can easily do this by looping over the fields, e.g.:
{
    for (i=2; i<=NF; i++) {
        if (match($i, "some regexp")) {
            # the comma inserts the output field separator between name and value
            print $1, $i
        }
    }
}
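Since the question also mentions Python, the same field loop could be sketched there as well. This is only an illustration: the helper name `fields_matching` and the pattern `^358` are made up for the example, and the real pattern would depend on what kind of match is wanted.

```python
import re


def fields_matching(line, pattern):
    """Return (name, value) pairs for every field after the first
    that matches the given regular expression."""
    name, *values = line.split()
    return [(name, value) for value in values if re.search(pattern, value)]


line = 'male65 197181 176192 320268 322286 358330 238244 206214 137133 267263'

# Hypothetical pattern: fields that start with "358".
for name, value in fields_matching(line, r'^358'):
    print(name, value)
```

As in the awk version, the first field is treated as the record name and is never tested against the pattern.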
Upvotes: 0