captcha

Reputation: 3756

Search for patterns from one file in another file

I have file1 with numbers in four columns, each number 16 digits long:

5150782703810496 7071783126930570 9776701040412294 4414890272061604
6426318539518073 5261792065926013 6303463168130122 8332314317620078
7506133106243886 2242241367012197 8275982207923757 7263931623813806
8882831187329643 3184441663826305 1416431572523093 0697142167966828

In file2 I have 16 search patterns (sorted here by chance), one for each number in file1:

0412294
062438
118732964
157252
17831269305
23813806
24224136701
3323143
381049
441489027206160
441663826305
5926013
66828
68130
82207923
8539518073

I am looking for a way to find the row and column in file1 of each pattern from file2. Desired result in file3:

1,1=381049
1,2=17831269305
1,3=0412294
1,4=441489027206160
2,1=8539518073
2,2=5926013
2,3=68130
2,4=3323143
3,1=062438
3,2=24224136701
3,3=82207923
3,4=23813806
4,1=118732964
4,2=441663826305
4,3=157252
4,4=66828

I tried grep -f file2 file1, which finds the row but not the column. I'm on Windows and would prefer awk, grep or sed; unfortunately I cannot use Perl or Bash. How can I achieve this? Thank you!

Upvotes: 3

Views: 182

Answers (4)

potong

Reputation: 58371

This might work for you (GNU sed):

sed 's|.*|s/(.*=).*(&).*/\\1\\2/p|' file2 |
sed -nrf - <(sed = file1 | sed -r 'N;s/^(.*)\n(\S+)\s(\S+)\s(\S+)\s(\S+)/\1,1=\2\n\1,2=\3\n\1,3=\4\n\1,4=\5/') >file3

Transform file1 into a stream with one number per line, each prefixed with its row and column. From file2, generate a sed script and run it against that stream. However, since you are on Windows and the <( ... ) process substitution is a shell feature, I guess you will need to write each intermediate result to a separate file and run this in three steps.
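The three-step variant could look roughly like the sketch below. It assumes GNU sed; the function name locate_with_sed and the intermediate files patterns.sed and cells.txt are made-up names for illustration, not part of the original answer. Quoting is written for a POSIX shell and would need adapting for cmd.exe.

```shell
# Sketch only: same pipeline as above, split into three steps with
# intermediate files instead of process substitution. Requires GNU sed.
# locate_with_sed, patterns.sed and cells.txt are hypothetical names.
locate_with_sed() {
    numbers=$1    # e.g. file1
    patterns=$2   # e.g. file2
    result=$3     # e.g. file3

    # Step 1: turn every pattern into a sed command of the form
    #   s/(.*=).*(381049).*/\1\2/p
    sed 's|.*|s/(.*=).*(&).*/\\1\\2/p|' "$patterns" > patterns.sed

    # Step 2: "sed =" emits the line number before each line; the second
    # sed joins the two and writes one "row,column=number" entry per line.
    sed = "$numbers" |
        sed -r 'N;s/^(.*)\n(\S+)\s(\S+)\s(\S+)\s(\S+)/\1,1=\2\n\1,2=\3\n\1,3=\4\n\1,4=\5/' > cells.txt

    # Step 3: run the generated script; -n plus the p flag prints only
    # the entries in which a pattern was found.
    sed -nrf patterns.sed cells.txt > "$result"
}
```

It would be called as locate_with_sed file1 file2 file3. Like the one-liner, it assumes exactly four whitespace-separated columns per line.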

Upvotes: 4

jaypal singh

Reputation: 77085

The output is not sorted, because awk visits array keys in a for (x in a) loop in an unspecified order. Birei's solution does exactly what you need.

awk '
NR==FNR {
    for (i=1;i<=NF;i++) {
        a[$i]=NR","i 
    }
    next
} 
{ 
    b[$1] 
} 
END {
    for (x in a) { 
        for (y in b) {
            if (index(x,y)>0) {
                print a[x]"="y
            }
        }
    }
}' file1 file2

Output:

4,4=66828
4,1=118732964
3,3=82207923
4,3=157252
4,2=441663826305
2,4=3323143
1,1=381049
2,1=8539518073
3,2=24224136701
1,3=0412294
2,3=68130
1,2=17831269305
2,2=5926013
1,4=441489027206160
3,1=062438
3,4=23813806
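If sorted output is wanted from this approach, one possible variation is to index the numbers by row and column and walk those indices in order in the END block. This is only a sketch: the wrapper function locate_sorted and the array names cell and pat are illustrative and not part of the answer above.

```shell
# Sketch only: a variation that prints matches in row, column order.
# locate_sorted, cell and pat are hypothetical names.
locate_sorted() {
    awk '
    NR == FNR {                      # first file: the grid of numbers
        for (i = 1; i <= NF; i++)
            cell[NR, i] = $i         # remember each number by row and column
        rows = NR
        if (NF > cols) cols = NF
        next
    }
    NF { pat[$1] }                   # second file: one pattern per non-empty line
    END {
        # Walk rows and columns in numeric order; only the order in which
        # several patterns matching the same cell appear is unspecified.
        for (r = 1; r <= rows; r++)
            for (c = 1; c <= cols; c++)
                if ((r, c) in cell)
                    for (p in pat)
                        if (index(cell[r, c], p) > 0)
                            print r "," c "=" p
    }' "$1" "$2"
}
```

A call such as locate_sorted file1 file2 > file3 takes the files in the same order as the command above.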

Upvotes: 4

Birei

Reputation: 36252

An awk solution.

Content of script.awk:

FNR == NR {
    patterns[ $1 ] = 1 
    next
}

{
    for ( i = 1; i <= NF; i++ ) { 
        for ( p in patterns ) { 
            if ( index( $i, p ) > 0 ) { 
                printf "%d,%d=%s\n", FNR, i, p
                delete patterns[ p ] 
                break
            }   
        }   
    }   
}

Run it like:

awk -f script.awk file2 file1

That yields:

1,1=381049
1,2=17831269305
1,3=0412294
1,4=441489027206160
2,1=8539518073
2,2=5926013
2,3=68130
2,4=3323143
3,1=062438
3,2=24224136701
3,3=82207923
3,4=23813806
4,1=118732964
4,2=441663826305
4,3=157252
4,4=66828

Upvotes: 3

cforbish

Reputation: 8819

You can create a bash script (you did not rule bash out) like:

IFS=$'\n'
lnum=0
for line in $(cat file1); do
    lnum=$(( lnum + 1 ))
    cnum=0
    IFS=' '
    for entry in $line; do
        cnum=$(( cnum + 1 ))
        IFS=$'\n'
        for pattern in $(cat file2); do
            if [[ $entry =~ ^.*${pattern}.*$ ]]; then
                echo "${lnum},${cnum}=${pattern}"
                break
            fi
        done
    done
done

Upvotes: 1
