Reputation: 12148
I have two tables in my database - gl_weather_locations and gl_weather_data.
I need to select location name, latitude and longitude from the locations table, then get only the most recent set of records for the corresponding lat/long in the data table - there can be multiple sets, as the database is populated from a web service that's updated every 2 hours. We need to keep archival data, so we can't keep only the most recent set of records, we need to keep all of them. But I'm a little confused as to how to limit my query to only the most recent set (there are multiples of 3 data records for each location - one for wind speed, one for wind direction, and one for wave height - we need to get the most recent of all three.)
What I have so far is:
SELECT l.location_name, l.location_lat, l.location_long, d.weather_type,
d.weather_value, d.weather_unit
FROM gl_weather_locations l
LEFT JOIN gl_weather_data d ON l.location_lat = d.weather_lat
(In our case, we're working with a limited number of locations and it's absolutely confirmed that we do not have more than one location with the exact same latitude, so we can safely do the join on only the latitude, rather than needing to also compare longitude.)
I'm confused as to how I can make sure I only get the most recent set of records for each latitude. If it helps, the weather_time field is identical for all of the records in a set (i.e. we have 10 locations, 3 records per location per run of the web service, so we have 30 records with identical values in weather_time.) Could I somehow use that in an order by or group by or something?
Upvotes: 2
Views: 249
Reputation: 48179
The 2nd FROM (via SQL-Select) should get you on a per lat/long, the maximum respective time per each category. Then, the overall query links the locations to that (MaxTimes) alias, then re-joins to the weather data based on whatever the max time matching the respective classification unit being tested for. If you get any NULL values, you can just wrap those columns as NVL( whatever_column, '' ) as ResultColumn
select WLoc.ID,
WLoc.Location_Name,
WLoc.Location_Lat,
WLoc.Location_Long,
IFNULL( MaxTimes.WindDirTime, '' ) WindDirTime,
IFNULL( MaxTimes.WindSpdTime, '' ) WindSpdTime,
IFNULL( MaxTimes.WaveHeightTime, '' ) WaveHeightTime,
IFNULL( ByWindDir.Weather_Unit, '' ) WindDirection,
IFNULL( ByWindDir.Weather_Value, '' ) WindDirValue,
IFNULL( ByWindSpd.Weather_Unit, '' ) SpeedUnit,
IFNULL( ByWindSpd.Weather_Value, '' ) WindSpeed,
IFNULL( ByWaveHeight.Weather_Unit, '' ) WaveUnit,
IFNULL( ByWaveHeight.Weather_Value, '' ) WaveHeight
from
gl_weather_locations WLoc
left join (select weather_lat,
weather_long,
max( case when weather_type = "Wind Direction" then weather_time end ) WindDirTime,
max( case when weather_type = "Wind Speed" then weather_time end ) WindSpdTime,
max( case when weather_type = "Wave Height" then weather_time end ) WaveHeightTime
from
gl_weather_data
group by
1, 2 ) MaxTimes
on WLoc.Location_Lat = MaxTimes.Weather_Lat
and WLoc.Location_Long = MaxTimes.Weather_Long
left join gl_weather_data ByWindDir
on MaxTimes.weather_lat = ByWindDir.weather_lat
and MaxTimes.weather_long = ByWindDir.weather_long
and MaxTimes.WindDirTime = ByWindDir.weather_time
and ByWindDir.weather_type = "Wind Direction"
left join gl_weather_data ByWindSpd
on MaxTimes.weather_lat = ByWindSpd.weather_lat
and MaxTimes.weather_long = ByWindSpd.weather_long
and MaxTimes.WindSpdTime = ByWindSpd.weather_time
and ByWindSpd.weather_type = "Wind Speed"
left join gl_weather_data ByWaveHeight
on MaxTimes.weather_lat = ByWaveHeight.weather_lat
and MaxTimes.weather_long = ByWaveHeight.weather_long
and MaxTimes.WaveHeightTime = ByWaveHeight.weather_time
and ByWaveHeight.weather_type = "Wave Height"
The first FROM table is your list of master locations... by doing a LEFT JOIN, you are telling the SQL, no matter if I have a time-stamp match on the right side, show me all locations.
Next, is the PreQuery for MaxTimes. Think of this sub-query as pre-gathering the maximum time per lat/long, PER respective TYPE of measurement reported via the MAX( case when... ) As FinalColumnName. So, on each record, it will only qualify for ONE of these case entries, and if found, get the max of that column's time.
So, now, you have a list of locations, and left joined to a subquery MaxTimes joined on their respective LAT/LONG. Now, the MaxTimes query is doing THREE left joins back to the measurements data, for each measurement respectively, so I've aliased them on what their purpose was for .. hence ByWindDir, ByWindSpd and ByWaveHeight. The left join, like the others is based on the Lat/Long, its proper type ("Wind Direction", "Wind Speed", or "Wave Height") respectively, AND it has the same matching time OF that category.
The nice thing of this query is that if you have your times to a single lat/long location that are off, even by seconds (ie: Wind Direction is 11:58:02 and Wind Speed is 11:58:05 and Wave Height = 11:58:54, it will still respect each unique measurement's max time.
Finally, the field list... to tack on all for one row across, I'm getting the obvious details... Since each table was joined to represent its own "stuff", I have the column names represent the CONTEXT... Hope this helps your understanding of this query... Take one part at a time and find what links to what.
Upvotes: 2
Reputation: 1
try:
SELECT l.location_name, l.location_lat, l.location_long,
d.weather_type, d.weather_value, d.weather_unit FROM gl_weather_locations l
LEFT JOIN gl_weather_data d on l.location_lat = d.weather_lat
**order by weather_time desc limit 3**
Upvotes: 0