Pink

Reputation: 161

Perl: forming all pair combinations of strings

I have a file with around 25000 records; each record has more than 13 entries, which are drug names. I want to form all possible pair combinations of these entries. E.g., if a line has three entries A, B, C, I should form the combinations 1) A B 2) A C 3) B C. Below is code I got from the internet; it works only if a single line is assigned to an array:

use Math::Combinatorics;

my @n = qw(a b c);
my $combinat = Math::Combinatorics->new(
  count => 2,
  data  => [@n],
);

while ( my @combo = $combinat->next_combination ) {
  print join( ' ', @combo ) . "\n";
}

The code I am using doesn't produce any output:

open IN, "drugs.txt" or die "Cannot open the drug file";
open OUT, ">Combination.txt";

use Math::Combinatorics;

while (<IN>) {
  chomp $_;
  @Drugs = split /\t/, $_;
  @n = $Drugs[1];

  my $combinat = Math::Combinatorics->new(
    count => 2,
    data  => [@n],
  );

  while ( my @combo = $combinat->next_combination ) {

    print join( ' ', @combo ) . "\n";
  }
  print "\n";
}

Can you please suggest a solution to this problem?

Upvotes: 2

Views: 339

Answers (3)

Greg Bacon

Reputation: 139471

All pairs from an array are straightforward to compute. Using drugs A, B, and C as in your question, you might think of them as forming a square matrix.

AA  AB  AC
BA  BB  BC
CA  CB  CC

You probably do not want the “diagonal” pairs AA, BB, and CC. Note that the remaining elements are symmetrical. For example, element (0,1) is AB and (1,0) is BA. Here again, I assume these are the same and that you do not want duplicates.

To borrow a term from linear algebra, you want the upper triangle. Doing it this way eliminates duplicates by construction, assuming that each drug name on a given line is unique. An algorithm for this is below.

  1. Select in turn each drug q on the line. For each of these, perform steps 2 and 3.
  2. For each drug r that follows q on the line, starting with the one immediately after q, perform step 3.
  3. Record the pair (q, r).
  4. The recorded list is the list of all unique pairs.

In Perl, this looks like

#! /usr/bin/env perl

use strict;
use warnings;

sub pairs {
  my @a = @_;

  my @pairs;
  foreach my $i (0 .. $#a) {
    foreach my $j ($i+1 .. $#a) {
      push @pairs, [ @a[$i,$j] ];
    }
  }

  wantarray ? @pairs : \@pairs;
}

my $line = "Perlix\tScalaris\tHashagra\tNextium";
for (pairs split /\t/, $line) {
  print "@$_\n";
}

Output:

Perlix Scalaris
Perlix Hashagra
Perlix Nextium
Scalaris Hashagra
Scalaris Nextium
Hashagra Nextium
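
To apply this to a whole file, one possibility is to call the same routine once per line. The sketch below is only an illustration: it repeats pairs() so that it runs on its own, and it reads from an in-memory string standing in for drugs.txt. The sample records (including the made-up name Arrayol) and the variable names $data, $in, and @lines_out are assumptions, not part of the question.

```perl
#! /usr/bin/env perl

use strict;
use warnings;

# Same upper-triangle idea as pairs() above, repeated so this sketch is self-contained.
sub pairs {
  my @a = @_;

  my @pairs;
  foreach my $i (0 .. $#a) {
    foreach my $j ($i+1 .. $#a) {
      push @pairs, [ @a[$i,$j] ];
    }
  }

  return @pairs;
}

# Assumption: an in-memory stand-in for drugs.txt with two tab-separated records.
my $data = "Perlix\tScalaris\tHashagra\nNextium\tArrayol\n";
open my $in, '<', \$data or die "Cannot open input: $!";

my @lines_out;
while (my $line = <$in>) {
  chomp $line;
  my @drugs = split /\t/, $line;          # every field on the line, not just one
  push @lines_out, "@$_" for pairs(@drugs);
  push @lines_out, "";                    # blank line between records, as in the question
}
close $in;

print "$_\n" for @lines_out;
```

For the real file, you would presumably open drugs.txt for reading and Combination.txt for writing, and print to the output handle. Note that the code in the question opens OUT but prints to standard output, so Combination.txt would stay empty either way.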

Upvotes: 1

David W.

Reputation: 107040

I've answered something like this before for someone else. They had a question on how to combine a list of letters into all possible words.

Take a look at How Can I Generate a List of Words from a group of Letters Using Perl. In it, you'll see an example of using Math::Combinatorics from my answer and the correct answer that ikegami had. (He did something rather interesting with regular expressions.)

I'm sure one of these will lead you to the answer you need. Maybe when I have more time, I'll flesh out an answer specifically for your question. I hope this link helps.

Upvotes: 0

Cebjyre

Reputation: 6622

You're setting @n to be an array containing only the second value of the @Drugs array. Try just using data => \@Drugs in the Math::Combinatorics constructor.

Also, use strict; use warnings; blahblahblah.
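
To see the difference, here is a minimal sketch using core Perl only. The sample record "A\tB\tC" is an assumption standing in for one line of drugs.txt. It shows that @n ends up with a single element, so there is nothing to pair, while @Drugs holds every field on the line.

```perl
use strict;
use warnings;

# Hypothetical tab-separated record, standing in for one line of drugs.txt.
my @Drugs = split /\t/, "A\tB\tC";

# What the question does: copies only the second field ("B") into @n.
my @n = $Drugs[1];

# Compare how many elements each array would hand to the constructor.
printf "elements in \@n: %d, elements in \@Drugs: %d\n",
  scalar(@n), scalar(@Drugs);
```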

Upvotes: 1
