Charles

Reputation: 377

How can I join tables using information from different rows?

I have two similar tables that I would like to join. See reproducible example below.

WHAT NEEDS TO BE DONE

See the comments in the code: concatenating the values '2021-01-01' (column: Date), 'hat' (column: content), 'cat' (column: content) and 'A' (column: Tote) in first_table gives a unique key that can be joined with exactly the same data in second_table. The result would be the first of the 4 unique events (see desired_result: '#first tote'). The real tables have a few million rows.

Reproducible example:

CREATE OR REPLACE TABLE
`first_table` (
  `Date` string NOT NULL,
  `TotearrivalTimestamp` string  NOT NULL,
  `Tote` string NOT NULL,
  `content` string NOT NULL,
  `location` string NOT NULL,
);
INSERT INTO `first_table` (`Date`, `TotearrivalTimestamp`, `Tote`, `content`, `location`) VALUES
  ('2021-01-01', '13:00','A','hat','1'), #first tote
  ('2021-01-01', '13:00','A','cat','1'), #first tote
  ('2021-01-01', '14:00', 'B', 'toy', '1'),
  ('2021-01-01', '14:00', 'B', 'cat', '1'),
  ('2021-01-01', '15:00', 'A', 'toy', '1'),
  ('2021-01-01', '13:00', 'A', 'toy', '1'),
  ('2021-01-02', '13:00', 'A', 'hat', '1'),
  ('2021-01-02', '13:00', 'A', 'cat', '1');
  
CREATE OR REPLACE TABLE
`second_table` (
  `Date` string NOT NULL,
  `ToteendingTimestamp` string  NOT NULL,
  `Tote` string NOT NULL,
  `content` string NOT NULL,
  `location` string NOT NULL,
);
INSERT INTO `second_table` (`Date`, `ToteendingTimestamp`, `Tote`, `content`, `location`) VALUES
('2021-01-01', '20:00', 'B', 'cat', '2'),
('2021-01-01', '19:00', 'A', 'cat', '1'), #first tote
('2021-01-01', '19:00', 'A', 'hat', '1'), #first tote
('2021-01-01', '20:00', 'B', 'toy', '2'),
('2021-01-01', '14:00', 'A', 'toy', '1'),
('2021-01-02', '14:00', 'A', 'hat', '1'),
('2021-01-02', '14:00', 'A', 'cat', '1'),
('2021-01-01', '16:00', 'A', 'toy', '1');

CREATE OR REPLACE TABLE
`desired_result` (
  `Date` string NOT NULL,
  `Tote` string NOT NULL,
  `TotearrivalTimestamp` string  NOT NULL,
  `ToteendingTimestamp` string  NOT NULL,
  `location_first_table` string NOT NULL,
  `location_second_table` string NOT NULL,
 );
INSERT INTO `desired_result` (`Date`, `Tote`, `TotearrivalTimestamp`, `ToteendingTimestamp`, `location_first_table`, `location_second_table`) VALUES

('2021-01-01', 'A', '13:00', '19:00', '1', '1'), #first tote
('2021-01-01', 'B', '14:00', '20:00', '1', '1'),
('2021-01-01', 'A', '15:00', '16:00', '1', '2'),
('2021-01-02', 'A', '13:00', '14:00', '1', '1');


#### this does not give what I want####
select first.date as Date, first.tote, first.totearrivaltimestamp, second.toteendingtimestamp, first.location as location_first_table, second.location as location_second_table
from `first_table` first 
inner join `second_table` second 
on first.tote = second.tote 
and first.content = second.content;

Upvotes: 1

Views: 82

Answers (2)

Rory

Reputation: 169

I was able to (mostly) reproduce the 'desired_result' table with the SQL below. I believe there are a few typos in the 'insert into' statements, but I think this meets the intent.

Query:

select
  first_table.date as Date,
  first_table.tote,
  first_table.totearrivaltimestamp,
  second_table.toteendingtimestamp,
  first_table.location as location_first_table,
  second_table.location as location_second_table
from first_table
inner join second_table
  on first_table.Date = second_table.Date
  and first_table.tote = second_table.tote
group by first_table.Date, first_table.TotearrivalTimestamp, first_table.tote;

result:

2021-01-01|A|13:00|19:00|1|1
2021-01-01|B|14:00|20:00|1|2
2021-01-01|A|15:00|19:00|1|1
2021-01-02|A|13:00|14:00|1|1

This result assumes the dates in your first table will always match for the totes/timestamps. The GROUP BY clause then merges the duplicate rows. The second table's information is matched on the date and tote of the first table and appended to each line item.
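
If the same tote can arrive more than once on a date (as tote A does in the sample data), date and tote alone no longer identify a single event. A minimal sketch of the idea described in the question, matching on the full set of contents instead, might look like the following. It is not a tested BigQuery solution: the SQL is run through Python's sqlite3 module purely so it can be executed locally, it uses only a subset of the question's rows (the A 13:00 and B 14:00 totes), and it assumes that a content value appears at most once per tote event and that location is constant within an event. The CTE names f, s and m are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table (
  Date TEXT, TotearrivalTimestamp TEXT, Tote TEXT, content TEXT, location TEXT);
CREATE TABLE second_table (
  Date TEXT, ToteendingTimestamp TEXT, Tote TEXT, content TEXT, location TEXT);

-- Subset of the question's sample rows.
INSERT INTO first_table VALUES
  ('2021-01-01', '13:00', 'A', 'hat', '1'),
  ('2021-01-01', '13:00', 'A', 'cat', '1'),
  ('2021-01-01', '14:00', 'B', 'toy', '1'),
  ('2021-01-01', '14:00', 'B', 'cat', '1');
INSERT INTO second_table VALUES
  ('2021-01-01', '20:00', 'B', 'cat', '2'),
  ('2021-01-01', '19:00', 'A', 'cat', '1'),
  ('2021-01-01', '19:00', 'A', 'hat', '1'),
  ('2021-01-01', '20:00', 'B', 'toy', '2');
""")

QUERY = """
WITH f AS (  -- one row per arrival event, with its number of content rows
  SELECT Date, Tote, TotearrivalTimestamp,
         COUNT(*) AS n, MIN(location) AS loc
  FROM first_table
  GROUP BY Date, Tote, TotearrivalTimestamp
),
s AS (       -- one row per ending event, with its number of content rows
  SELECT Date, Tote, ToteendingTimestamp,
         COUNT(*) AS n, MIN(location) AS loc
  FROM second_table
  GROUP BY Date, Tote, ToteendingTimestamp
),
m AS (       -- number of content rows shared by each arrival/ending pair
  SELECT a.Date, a.Tote, a.TotearrivalTimestamp, b.ToteendingTimestamp,
         COUNT(*) AS n
  FROM first_table a
  JOIN second_table b
    ON a.Date = b.Date AND a.Tote = b.Tote AND a.content = b.content
  GROUP BY a.Date, a.Tote, a.TotearrivalTimestamp, b.ToteendingTimestamp
)
SELECT m.Date, m.Tote, m.TotearrivalTimestamp, m.ToteendingTimestamp,
       f.loc AS location_first_table, s.loc AS location_second_table
FROM m
JOIN f ON f.Date = m.Date AND f.Tote = m.Tote
      AND f.TotearrivalTimestamp = m.TotearrivalTimestamp
JOIN s ON s.Date = m.Date AND s.Tote = m.Tote
      AND s.ToteendingTimestamp = m.ToteendingTimestamp
-- keep a pair only when every content row on both sides was matched
WHERE m.n = f.n AND m.n = s.n
ORDER BY m.Date, m.TotearrivalTimestamp, m.Tote
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print("|".join(row))
```

On this subset the query pairs tote A (13:00) with the 19:00 ending and tote B (14:00) with the 20:00 ending, one row per event. A pair is kept only when the number of shared content rows equals the row count of both events, which is the same as saying the two content sets are identical.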

Upvotes: 1

SCCJS

Reputation: 96

This should work. I think your issue might be with how some of your tables are quoted.

# identifiers are quoted with backticks; table names are the ones from the question
select f.`Date`
, f.tote
, f.totearrivaltimestamp
, s.toteendingtimestamp
, f.location as location_first_table
, s.location as location_second_table
from `first_table` f
inner join `second_table` s on f.`Date` = s.`Date`
and f.tote = s.tote
and f.content = s.content;

Upvotes: 1
