Reputation: 7
Thanks in advance! this is little long question but hopefully it is clear and concise.
I have two different query bringing two data sets.
both datasets are bringing data in same grain for similar area.
first data set has multiple snapshots per month of a season , so it have three different metrics value for each season for same grain for a season.
Dataset 1 ( country , season , month , metric is column names ) (1. A , autumn, 9/1, 1 is the record for column)
country season month metric
where as the second data set has one snapshots (just one month out of season) for a season, so it will have just one snapshot per season.
Dataset 2
country season month metric
my current approach of full outer join is giving me the following output
country season month metric1 metric2
A, autumn, 9/1, 1, 4
A, autumn, 10/1, 2,
A, autumn , 11/1, 3 ,
**Notice the metric2 for 10/1 and 11/1 are blank or zero if use nvl but what we need is to populate same value 4 for these two months as well how can we achieve this
or FINAL OUTPUT should look something like this
country season month metric1 metric2
A, autumn, 9/1, 1, 4
A, autumn, 10/1, 2, 4
A, autumn , 11/1, 3 , 4
Upvotes: 0
Views: 51
Reputation: 1270873
Use a left join
and a window function. Here is one method:
SELECT d1.country, d1.season, d1.month, d1.metric AS metric1,
COALESCE(d2.metric,
LAG(d2.metric IGNORE NULLS) OVER (PARTITION BY d1.country ORDER BY d1.month)
) AS metric2
FROM dataset1 d1 LEFT JOIN
dataset2 d2
ON d2.country = d1.country AND d2.season = d1.season;
You don't specify the exact logic for imputing the value, so this takes the most recent non-NULL
value.
Upvotes: 0
Reputation: 1624
Change the join condition.
Instead of just using
Dataset1.country = Dataset2.country
and Dataset1.month = Dataset2.month
Use the conditions in join as -
Dataset1.country = Dataset2.country
and Dataset1.month >= Dataset2.month
Now I don’t know about the data type of month. I have put the logic but to actually compare you may need to do type casting as required for proper comparison using >=
Upvotes: 0
Reputation: 1096
I'm not familiar with Snowflake at all but assuming it uses fairly standard SQL, this should be as simple as:
SELECT a.country, a.season, a.month, a.metric AS metric1, b.metric AS metric2
FROM dataset1 a
INNER JOIN dataset2 b
ON a.country = b.country AND a.season = b.season
i.e. It shouldn't need a full outer join and should need to match only on country and season. P.S. It would help if you showed the SQL you've tried as part of your question.
Upvotes: 1