Reputation: 1140
Can I join two tables using custom condition rather than equal?
Eg, I have two tables in BigQuery. Table A has 3 columns start_range
, end_range
and grade
. Table B has data coming from Storage using cloud functions and has a particular column "marks". I am creating a flow in DataPrep such that join these tables which should satisfy the condition:
SELECT data.grade FROM data INNER JOIN student_data ON student_data.marks BETWEEN data.start_range AND data.end_range
and then paste the data to a new table. But I could not find any provision for custom condition. Only two columns can be matched for join. Any idea?
Upvotes: 1
Views: 418
Reputation: 1672
I think that your best chance would be to implement your solution in BigQuery:
SELECT
grade, marks
FROM
`project.dataset.student_data` a
JOIN
`project.dataset.data` b
ON
a.marks > b.start_range AND a.marks < b.end_range
If you insist on Dataprep, I don't believe there's any way to join on a range. If I understood your case correctly, what you have is something like this:
...and you want to get As and Bs instead of the integer marks in student_data
. It's probably not the most elegant solution, but you can use a new formula with a case, something like this:
CASE([(marks >= 0) && (marks < 50),
'D',
(marks >= 50) && (marks < 100),
'C',
(marks >= 100) && (marks < 150),
'B',
(marks >= 150) && (marks <= 200),
'A'])
You'd just need to click "New step" on the recipe, chose "New formula" and then add it under "Formula". It should look like so:
Then you can even use "Lookup" to join any other useful data you might have in the data
table using "grade" as the lookup key.
Upvotes: 1