Reputation: 51
How can I compare rows and output the "id" and "name" whose start and end dates overlap, i.e. rows with the same id whose date ranges intersect?
In the example below, the date range in row 2 overlaps with row 1, and similarly the date range in row 3 overlaps with row 4.
For example:
row id name start end
1 AA123 temp1 2020-01-10 2020-04-10
2 AA123 temp1 2020-02-20 2020-03-20
3 AA700 temp4 2019-01-01 2019-02-28
4 AA700 temp4 2018-12-01 2019-04-20
5 BB120 temp5 2021-01-10 2021-02-01
Expected Output:
id name
AA123 temp1
AA700 temp4
Appreciate your help.
Upvotes: 0
Views: 48
Reputation: 67467
$ awk -v OFS='\t' '{k=$2} NR==1 || k in s && !(e[k]<$4 || $5<s[k]) {print k,$3}
{s[k]=$4; e[k]=$5}' file
id name
AA123 temp1
AA700 temp4
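The one-liner above remembers only the most recent range per id, so each row is compared with the previous row for that id. If an id can have three or more rows, or should be printed only once, a variant along these lines might help. It is a rough sketch, not tested beyond the sample data: the function name `overlapping_ids` is made up for illustration, and it assumes whitespace-separated columns, a header on line 1, and ISO dates (YYYY-MM-DD), which sort correctly as plain strings.

```shell
# Hypothetical helper: print each id/name that has overlapping date ranges.
# Assumed input columns: row id name start end (header on line 1).
overlapping_ids() {
  awk -v OFS='\t' '
    NR == 1 { print $2, $3; next }   # header row: print "id" and "name"
    {
      k = $2
      # Compare this range with every earlier range stored for the same id.
      # Two ranges overlap unless one ends before the other starts.
      for (i = 1; i <= n[k]; i++)
        if (!(e[k, i] < $4 || $5 < s[k, i]) && !seen[k]++)
          print k, $3
      # Remember this range for later rows.
      n[k]++
      s[k, n[k]] = $4
      e[k, n[k]] = $5
    }
  ' "$1"
}
```

Call it as `overlapping_ids file`. Ranges that merely touch (one ends on the day the next starts) count as overlapping here, same as in the one-liner.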
Upvotes: 1