stack0114106

Reputation: 8711

Find the nearest key less than or equal to an input value when the records are unsorted

I have the input file below and am trying to find the nearest key such that the input value is greater than or equal to the key. My code works when the input file is sorted.

Input file:

10,Line1
20,Line2
30,Line3
40,Line4
50,Line5
55,Line6
70,Line7
75,Line8
90,Line9
95,Line10
99,Line11

Code that I tried:

$ awk -F,  -v inp=85 ' NR==1 { dp=0 } {  dt=($1-inp); d=sqrt(dt*dt); if(d<=dp && inp >= $1 ) { rec=$0 } dp=d } END { print rec } ' source.txt
75,Line8

$ awk -F,  -v inp=55 ' NR==1 { dp=0 } {  dt=($1-inp); d=sqrt(dt*dt); if(d<=dp && inp >= $1 ) { rec=$0 } dp=d } END { print rec } ' source.txt
55,Line6

It works fine when source.txt is sorted on the key column, i.e. the first column, but it gives incorrect results when the file is not sorted:

$ shuf source.txt | awk -F,  -v inp=85 ' NR==1 { dp=0 } {  dt=($1-inp); d=sqrt(dt*dt); if(d<=dp && inp >= $1 ) { rec=$0 } dp=d } END { print rec } ' 
50,Line5   # Wrong

Can this be fixed for an unsorted file?

Solutions using any unix tools are welcome!

Upvotes: 2

Views: 163

Answers (4)

zdim

Reputation: 66883

With Perl

perl -0777 -wnE' $in = shift // 85;
    %h = split /(?:\s*,\s*|\s*\n\s*)/; 
    END { --$in while not exists $h{$in}; say "$in, $h{$in}" }
' data.txt 57

Notes

  • read the whole file into a string ("slurp"), by -0777

  • populate a hash with file data; I strip surrounding spaces in the process

  • count down from input-value and check for such a key, until we get to one that exists

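  • the input is presumed to be an integer and in range (no smaller than the smallest key); otherwise the countdown would never reach an existing key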

The nearest key is the first one that exists as input "clicks" down toward it an integer at a time.

The invocation above (for 57) prints the line:   55, Line6.
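For readers more comfortable with awk, the countdown idea in the notes above can be sketched roughly as follows. This is only an illustration of the same approach, not the answer's code: the function name nearest_countdown is made up here, keys are assumed to be non-negative integers, and a lower-bound check is added so the loop stops when the input is below every key.

```shell
# nearest_countdown INPUT [FILE]   (hypothetical helper name)
# Load every record into an awk array keyed by the first field, then step
# the integer input down by one until it lands on a key that exists.
nearest_countdown() {
    awk -F, -v inp="$1" '
        $1 != "" {
            k = $1 + 0                      # force a numeric key
            rec[k] = $0                     # remember the whole record
            if (!seen++ || k < lo) lo = k   # track the smallest key
        }
        END {
            if (!seen) exit 1               # no data at all
            n = int(inp)                    # non-integer input is truncated
            while (n >= lo && !(n in rec)) n--
            if (n < lo) exit 1              # input is below every key
            print rec[n]
        }
    ' "${2:--}"                             # read stdin when FILE is omitted
}
```

With the question's data, nearest_countdown 57 source.txt prints the record 55,Line6. The function returns a non-zero status and prints nothing when no key is less than or equal to the input.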


A version that does check the range of input and allows non-integer input

perl -MList::Util=min -0777 -wnE' $in = int( shift // 85 );
    %h = split /(?:\s*,\s*|\s*\n\s*)/; 
    die "Input $in out of range\n" if $in < min keys %h;
    END { --$in while not exists $h{$in}; say "$in, $h{$in}" }
' data.txt 57

Upvotes: 3

anubhava

Reputation: 785058

You may use this awk:

awk -F, -v n=85 'n>=$1 && (max=="" || $1>max){max=$1; rec=$0} END{print rec}' file

75,Line8

Run this again with a different value:

awk -F, -v n=55 'n>=$1 && (max=="" || $1>max){max=$1; rec=$0} END{print rec}' file

55,Line6
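The one-liner keeps a running best candidate, so the order of the records does not matter. A commented, unpacked sketch of the same idea is given here for reference; the function name nearest_le and the found flag are introduced only for this illustration, and the explicit "+ 0" coercions and the non-zero exit status on no match are additions, not part of the command above.

```shell
# nearest_le INPUT [FILE]   (hypothetical helper name)
# Single pass: among records whose key is <= INPUT, keep the one with the
# largest key. Works on unsorted input because every record is compared
# against the best candidate seen so far.
nearest_le() {
    awk -F, -v n="$1" '
        # "found" marks that at least one candidate exists; after the
        # first candidate, only a strictly larger key replaces it.
        ($1 + 0) <= (n + 0) && (!found || ($1 + 0) > max) {
            max = $1 + 0
            rec = $0
            found = 1
        }
        END {
            if (!found) exit 1   # every key is greater than the input
            print rec
        }
    ' "${2:--}"                  # read stdin when FILE is omitted
}
```

For example, shuf file | nearest_le 85 prints 75,Line8 for the question's data regardless of the shuffled order.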

Upvotes: 4

Polar Bear

Reputation: 6798

Code for unsorted lines

use strict;
use warnings;

my $target = shift
    or die "Please enter a value";

my @lines = <DATA>;
my $line;
my %data;

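# Index each line by its key (the first comma-separated field)
$data{ (split ',', $_)[0] } = $_ for @lines;

# Sort keys numerically; the default sort would compare them as strings
foreach my $key ( sort { $a <=> $b } keys %data ) {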
    last if $key > $target;

    $line = $data{$key};
}

print $line;

__DATA__
10,Line1
20,Line2
30,Line3
40,Line4
50,Line5
55,Line6
70,Line7
75,Line8
90,Line9
95,Line10
99,Line11
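The same sort-then-scan idea can also be sketched with standard Unix tools, which the question explicitly welcomes. This is a rough equivalent under stated assumptions rather than a translation of the Perl above: the function name nearest_sorted is hypothetical, and the keys are sorted in descending numeric order so that the first key not exceeding the input is the answer.

```shell
# nearest_sorted INPUT [FILE]   (hypothetical helper name)
# Sort records by the first field, numerically and in descending order,
# then print the first record whose key is <= INPUT and stop.
nearest_sorted() {
    sort -t, -k1,1nr "${2:--}" |
    awk -F, -v n="$1" '
        ($1 + 0) <= (n + 0) { print; found = 1; exit }
        END { exit !found }      # non-zero status when nothing matched
    '
}
```

Sorting costs more than a single pass over the file, so this form is mainly useful when the sorted output is needed anyway.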

Upvotes: 0

Polar Bear

Reputation: 6798

The following code works when the input is already sorted by key; it stops reading at the first key greater than the target:

use strict;
use warnings;

my $target = shift
    or die "Please enter a value";

my $line;

while( <DATA> ) {
    my @data = split ',';

    last if $data[0] > $target;

    $line = $_;
}

print $line;

__DATA__
10,Line1
20,Line2
30,Line3
40,Line4
50,Line5
55,Line6
70,Line7
75,Line8
90,Line9
95,Line10
99,Line11

Upvotes: 2
