Reputation: 49
I have a bunch of orders for NETFLIX in my brokerage account. I inadvertently entered two duplicate GTC Sell orders on 1/5 and 1/6. How do I detect them with a Perl script?
Buy NFLX 50 @ 315.00 Reg-Acct Fake
Buy NFLX 50 @ 317.50 Reg-Acct OPEN 01/13/15
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/05/15
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/06/15
Sell NFLX 50 @ 362.00 Reg-Acct OPEN 11/25/14
...
Sell NFLX 50 @ 345.00 IRA-Acct OPEN 09/15/14
I want the script to spit out just these two lines, judged by fields[0] through fields[6] being identical.
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/05/15
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/06/15
I would prefer a simple script (i.e. no one-liner, no hash) as I am new to Perl.
Thanks, Larry
Upvotes: 0
Views: 102
Reputation: 48599
I would prefer a simple script (no hash)
Ugh. I missed the "no hash" requirement. Unfortunately, simple and no hash are opposing goals, and no hash also means not efficient: without a hash the script has to rescan the whole list of orders seen so far for every new line, which gets slow on large files. See the code at the bottom for how you should really do it. In the meantime, you'll need parallel arrays:
use strict;
use warnings;
use 5.016;
use Data::Dumper;

my @orders;
my @counts;

my $fname = 'data3.txt';
open my $ORDERSFILE, '<', $fname
    or die "Couldn't open $fname: $!";

LINE:
while (my $line = <$ORDERSFILE>) {
    my @pieces = split ' ', $line;
    my $date   = pop @pieces;
    my $order  = join ' ', @pieces;

    if (not @orders) {  #then length of @orders is 0
        $orders[0] = $order;
        $counts[0] = 1;
        next LINE;
    }

    for my $i (0 .. $#orders) {
        if ($orders[$i] eq $order) {
            $counts[$i]++;
            next LINE;
        }
    }

    #If execution reaches here, then the order wasn't found in the array...
    my $i = $#counts + 1;
    $orders[$i] = $order;
    $counts[$i] = 1;
}

say Dumper(\@orders);
say Dumper(\@counts);

for my $i (0 .. $#counts) {
    if ($counts[$i] > 1) {
        say "($counts[$i]) $orders[$i]";
    }
}
--output:--
$VAR1 = [
          'Buy NFLX 50 @ 315.00 Reg-Acct',
          'Buy NFLX 50 @ 317.50 Reg-Acct OPEN',
          'Sell NFLX 50 @ 345.00 Reg-Acct OPEN',
          'Sell NFLX 50 @ 362.00 Reg-Acct OPEN',
          'Sell NFLX 50 @ 345.00 IRA-Acct OPEN'
        ];

$VAR1 = [
          1,
          1,
          2,
          1,
          1
        ];

(2) Sell NFLX 50 @ 345.00 Reg-Acct OPEN
Here are some better solutions:
use strict;
use warnings;
use 5.016;
use Data::Dumper;

my %dates_for;  #A key will be an order; a value will be a reference to an array of dates.

while (my $line = <DATA>) {
    my @pieces = split ' ', $line;
    my $date   = pop @pieces;
    my $order  = join ' ', @pieces;

    push @{$dates_for{$order}}, $date;  #autovivification (see explanation below)
}

say Dumper(\%dates_for);

for my $order (keys %dates_for) {
    my @dates     = @{$dates_for{$order}};
    my $dup_count = @dates;

    if ($dup_count > 1) {
        say "($dup_count) $order";
        say " $_" for @dates;
    }
}
__DATA__
Buy NFLX 50 @ 315.00 Reg-Acct Fake
Buy NFLX 50 @ 317.50 Reg-Acct OPEN 01/13/15
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/05/15
Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/06/15
Sell NFLX 50 @ 362.00 Reg-Acct OPEN 11/25/14
Sell NFLX 50 @ 345.00 IRA-Acct OPEN 09/15/14
--output:--
$VAR1 = {
          'Sell NFLX 50 @ 345.00 IRA-Acct OPEN' => [
              '09/15/14'
          ],
          'Sell NFLX 50 @ 345.00 Reg-Acct OPEN' => [
              '01/05/15',
              '01/06/15'
          ],
          'Buy NFLX 50 @ 317.50 Reg-Acct OPEN' => [
              '01/13/15'
          ],
          'Buy NFLX 50 @ 315.00 Reg-Acct' => [
              'Fake'
          ],
          'Sell NFLX 50 @ 362.00 Reg-Acct OPEN' => [
              '11/25/14'
          ]
        };

(2) Sell NFLX 50 @ 345.00 Reg-Acct OPEN
 01/05/15
 01/06/15
When an undefined variable is dereferenced, it gets silently upgraded to an array or hash reference (depending of the type of the dereferencing). This behaviour is called autovivification and usually does what you mean (e.g. when you store a value)....
http://search.cpan.org/~vpit/autovivification-0.14/lib/autovivification.pm
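As a quick illustration of what that means here, this is a minimal sketch of my own (using a shortened order string, not the post's data): the first push against a key autovivifies the inner array reference, and later pushes reuse it.

use strict;
use warnings;
use Data::Dumper;

my %dates_for;                                      # starts out completely empty
# $dates_for{'Sell NFLX 50'} is undef, but dereferencing it as an array
# inside push autovivifies an array reference on the spot:
push @{ $dates_for{'Sell NFLX 50'} }, '01/05/15';
push @{ $dates_for{'Sell NFLX 50'} }, '01/06/15';   # reuses that same array ref
print Dumper(\%dates_for);                          # shows 'Sell NFLX 50' => [ '01/05/15', '01/06/15' ]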
For fixed-width columns, it's more efficient to use unpack():
use strict;
use warnings;
use 5.016;
use Data::Dumper;

my $fname = 'data3.txt';
open my $ORDERSFILE, '<', $fname
    or die "Couldn't open $fname: $!";

my %dates_for;

while (my $line = <$ORDERSFILE>) {
    my ($order, $date) = unpack 'A41 @55 A*', $line;  #see explanation below
    push @{$dates_for{$order}}, $date;
}
close $ORDERSFILE;

say Dumper(\%dates_for);

for my $order (keys %dates_for) {
    my @dates = @{$dates_for{$order}};

    if (@dates > 1) {
        my $dup_count = @dates;
        say "($dup_count) $order";
        say " $_" for @dates;
    }
}
--output:--
$VAR1 = {
          ' Buy NFLX 50 @ 317.50 Reg-Acct OPEN' => [
              '01/13/15'
          ],
          'Sell NFLX 50 @ 362.00 Reg-Acct OPEN' => [
              '11/25/14'
          ],
          'Sell NFLX 50 @ 345.00 Reg-Acct OPEN' => [
              '01/05/15',
              '01/06/15'
          ],
          ' Buy NFLX 50 @ 315.00 Reg-Acct Fake' => [
              ''
          ],
          'Sell NFLX 50 @ 345.00 IRA-Acct OPEN' => [
              '09/15/14'
          ]
        };

(2) Sell NFLX 50 @ 345.00 Reg-Acct OPEN
 01/05/15
 01/06/15
A41 @55 A*   =>   extract 41 characters (A),
                  skip to position 55 (@55),
                  extract the remaining characters (A*)
You can skip to any position you want, forwards and backwards, which means you can extract pieces in any order you want.
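For instance, here is a small sketch of my own (the offsets fit this one sample string, not necessarily the fixed-width layout of data3.txt) that grabs the date first and then jumps backwards to the start of the line for the order:

use strict;
use warnings;
use 5.016;

my $line = 'Sell NFLX 50 @ 345.00 Reg-Acct OPEN 01/05/15';

# @36 jumps forward to the date and A8 takes 8 characters,
# then @0 jumps back to the start of the line and A35 takes the order part
my ($date, $order) = unpack '@36 A8 @0 A35', $line;

say $date;    # 01/05/15
say $order;   # Sell NFLX 50 @ 345.00 Reg-Acct OPEN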
Upvotes: 0
Reputation: 98388
I know you said no one-liner, but in case you just meant no Perl one-liners:
sort filename|rev|uniq -D -f 1|rev
Here sort puts the duplicate orders (which differ only in the trailing date) next to each other, rev reverses each line so the date becomes its first field, uniq -D -f 1 prints every group of adjacent duplicate lines while skipping that first field, and the final rev restores the original character order.
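For completeness, a rough Perl equivalent of the same idea, strictly as a sketch of my own (it is both a one-liner and hash-based, which the question ruled out): strip the trailing date field to form a key, collect the full lines per key, and print any group with more than one line.

perl -ne '(my $k = $_) =~ s/\s+\S+\s*$//; push @{$h{$k}}, $_; END { for (keys %h) { print @{$h{$_}} if @{$h{$_}} > 1 } }' filename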
Upvotes: 1