Caren
Caren

Reputation:

Matching up MySQL tables using Perl

I have 2 MYSQL tables viz. main_table and query1. main_table contains the columns position and chr whilst query1 contains position, chr and symbol. The table query1 is derived by querying the main_table. I am wanting to match up both these tables using Perl such that the output would have the entire list of positions from the main_table in the first column and 2nd column would be symbols corresponding to that position. There could be no symbols at all or just one symbol or multiple symbols for each positions.

I am not very certain how to write the code up for this, currently I have

#!/usr/bin/perl

use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');
my $main = $dbh->prepare("select chr, position from main_table");        
my $q1   = $dbs->prepare("select position, symbol, chrom from query1");

$main->execute();
$q1->execute();    

while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

The following are examples of the of the main_table and query1. The desired output is what I am expecting and I worked it out using the VLOOKUP function in excel.

main_table              
CHR Position        
chr1    229830537       
chr1    229723373           
chr1    229723385           
chr1    229723393           
chr1    229723420           
chr1    229829627       
chr1    229723430           
chr1    229829926       
chr1    229723483           
chr1    229723490           
chr1    229723499           
chr1    229723501           
chr1    229830343       
chr1    229723534           
chr1    229723540           
chr1    230039934       
chr1    229723576           
chr1    229830537       
chr1    229830469           
chr1    229725982           
chr1    229726209       
chr1    229966154       
chr1    229726439           
chr1    229726726           
chr1    229726755           
chr1    229726973       
chr1    229967564       
chr1    229727249           
chr1    229727408           
chr1    229727612           
chr1    229728018           
chr1    229728050           
chr1    229728435                           
chr1    229728513                           
chr1    229966327                           

Query1              
symbol  CHR Position        
C1  chr1    229829230       
C1  chr1    229829278           
C1  chr1    229829442       
C1  chr1    229829627       
C1  chr1    229829653       
C1  chr1    229829683       
C1  chr1    229829810           
C1  chr1    229829926       
C1  chr1    229829961           
C1  chr1    229830085           
C1  chr1    229830086           
C1  chr1    229830087           
C1  chr1    229830088       
C1  chr1    229830141           
C1  chr1    229830343       
C1  chr1    229830469       
C1  chr1    229830534       
C1  chr1    229830537       
C2  chr1    230039932       
C2  chr1    230039934           
C2  chr1    230039939       
C2  chr1    230039944       
457 chr1    229966154           
457 chr1    229966327       
457 chr1    229966500           
457 chr1    229966552           
457 chr1    229966748       
457 chr1    229966998           
457 chr1    229967327           
457 chr1    229967564           
457 chr1    229967594           
457 chr1    229829627       



Desired Output          
Position    symbol      
229830537   C1      
229723373           
229723385           
229723393           
229723420           
229829627   C1, 457     
229723430           
229829926   C1      
229723483           
229723490           
229723499           
229723501           
229830343   C1      
229723534           
229723540           
230039934   c2      
229723576           
229830537   C1      
229830469           
229725982           
229726209           
229966154   457     
229726439           
229726726           
229726755           
229726973           
229967564   457     
229727249           
229727408           
229727612           
229728018           
229728050           
229728435           
229728513           
229966327           

Thanks in advance

Caren

Upvotes: 0

Views: 263

Answers (4)

Chas. Owens
Chas. Owens

Reputation: 64919

Weegee has the right answer, you can specify the location of a table like this: ipaddress.database.table. If you are on the same machine you can drop the ipaddress portion, and if you are in the same database you can drop the database portion. So your code should wind up looking like:

#!/usr/bin/perl

use strict;
use warnings;

use DBI;

my $dbh = DBI->connect(
    'DBI:mysql:disc1pathway',
    "home",
    "home",
    {
        ChopBlanks       => 1,
        AutoCommit       => 1,
        PrintError       => 0,
        RaiseError       => 1,
        FetchHashKeyName => 'NAME_lc',
    }
) or die "could not connect to database: ", DBI->errstr;

my $sth = $dbh->prepare("
    SELECT
        disc1pathway.main_table.chr,
        disc1pathway.main.position,
        results.query1.symbol,
        results.query1.chrom
    FROM disc1pathway.main_table, results.query1
    JOIN results.query1 ON (
        disc1pathway.main_table.position = results.query1.position
    )
");        

$sth->execute;

while (my $col = $sth->fetchrow_hashref) {
    print join(" ", @{$col}{qw/chr position symbol chrom/}), "\n";        
}

$sth->finish;

$dbh->disconnect;

Upvotes: 0

Caren
Caren

Reputation:

use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');

my $main = $dbh->prepare("select chr, position from main_table");
$main->execute();

my $q1 = $dbs->prepare("select position, symbol, chrom from query1");
$q1->execute();


while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

=====================================================================================

The above code just lists the position and the symbol, but does not match them up. I cant seem to get my head around on how to match them up. Any suggestions.

Thanks. Caren

Upvotes: 0

Weegee
Weegee

Reputation: 2273

It sounds like you need to do a join operation in your SQL query, but you'll need some kind of relationship in order for this to work properly. You might be able to figure out what you need using the MySQL reference manual's section on JOIN syntax.

On the Perl side you'll need to write the logic for your output. I would recommend making a hash, using the "position" as your key and then any symbols as values. Fill the hash first, then do your output. It would simplify your process for outputting your query the way you would like.

Upvotes: 1

David Webb
David Webb

Reputation: 193696

IF you have all the data already and you're just wondering how to output it in columns you should look at sprintf and printf which allow you to format output strings.

Upvotes: 0

Related Questions