Reputation: 665
Could someone throw me ideas about how to resolve the following issue:
I have pictures, videos and tags in my database. Tags can be associated to pictures and videos. I have to query database for tag IDs and count that are modified before 7 days or earlier and sort them by count of associations they have to picture and video assets.
So the idea is that at the end I can output the tags that were associated to pictures or videos the most during last 7 days. I wrote down the DB structure with the fields I have
VIDEO
ID
PICTURE
ID
PICTURE_ATTRMAPPING
CS_OWNERID (ID of picture)
CS_ATTRID ( will store ID of the tag attribute (picture_tag or video_tag))
ASSETVALUE ( will store ID OF THE associated tag)
VIDEO_ATTRMAPPING
CS_OWNERID (ID of video)
CS_ATTRID ( will store ID of the tag attribute (picture_tag or video_tag))
ASSETVALUE ( will store ID OF THE associated tag)
TAG
ID
UPDATEDATE
ATTRIBUTES (picture_tag attribute ID is stored here)
ID
NAME
So we can see that tags, videos, picture and attributes are stored in separate tables. We can tell that tag is referenced by video or picture (in other words, video/picture has a tag association) if VIDEO/PICTURE_ATTRMAPPING table has record with video or picture tag attribute id in CS_ATTRID column and id of the tag in ASSETVALUE column.
I am assuming it will be query with sub-queries, so I started to break this task in sub-tasks and figure out how to obtain all the information that's needed.
I will definitely have to get the ID of the tag attributes for video and picture objects:
SELECT id FROM ATTRIBUTES WHERE NAME = 'picture_tag' OR NAME = 'video_tag'
Also an example of how I can query for tags that are n days old:
SELECT id FROM TAG WHERE updateddate BETWEEN TO_DATE('2013-08-20 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND CURRENT_DATE
This probably isn't that complex as it seems, however, I am not sure about how the whole query should look like and with what to start. Could someone throw an idea or sample?
I am using oracle, but am familiar with mysql too, so samples from any of the DBMSs would be great. Let me know if I made myself clear enough.
Upvotes: 0
Views: 116
Reputation: 7928
all tag ids assigned to the pictures:
SELECT ASSETVALUE
FROM PICTURE_ATTRMAPPING
JOIN ATTRIBUTES
ON ( ATTRIBUTES.ID = PICTURE_ATTRMAPPING.CS_ATTRID
AND ATTRIBUTES.NAME = 'picture_tag'
)
all tag ids assigned to the videos:
SELECT ASSETVALUE
FROM VIDEO_ATTRMAPPING
JOIN ATTRIBUTES
ON ( ATTRIBUTES.ID = PICTURE_ATTRMAPPING.CS_ATTRID
AND ATTRIBUTES.NAME = 'video_tag'
)
use OUTER JOIN to count pictures and videos wich are tagged
SELECT
TAG.ID
, SUM(CASE WHEN PT.ASSETVALUE is not NULL THEN 1 ELSE 0 END) as tagged_picture_number
, SUM(CASE WHEN VT.ASSETVALUE is not NULL THEN 1 ELSE 0 END) as tagged_vieos_number
FROM
TAG
LEFT OUTER JOIN
(
SELECT ASSETVALUE
FROM PICTURE_ATTRMAPPING
JOIN ATTRIBUTES
ON ( ATTRIBUTES.ID = PICTURE_ATTRMAPPING.CS_ATTRID AND ATTRIBUTES.NAME = 'picture_tag')
) PT
ON ( PT.ASSETVALUE = TAG.ID)
LEFT OUTER JOIN
(
SELECT ASSETVALUE
FROM VIDEO_ATTRMAPPING
JOIN ATTRIBUTES
ON ( ATTRIBUTES.ID = PICTURE_ATTRMAPPING.CS_ATTRID AND ATTRIBUTES.NAME = 'video_tag')
) VT
ON ( PT.ASSETVALUE = TAG.ID)
WHERE
TAG.UPDATEDATE <= TRUNC(SYSDATE, 'DD') - 7
group by TAG.ID
order by tagged_picture_number + tagged_vieos_number DESC
;
Upvotes: 0
Reputation: 52853
The tags that were updated in the last 7 days can be written as:
select id from tags where updatedate >= sysdate - 7
From what you're saying the ATTRIBUTES table only has two values in it; so you can ignore it. Even if it has more your tables PICTURE_ATTRMAPPING and VIDEO_ATTRMAPPING ensure that the attributes are enforced when you join to them so this doesn't matter either.
If you want a tag to be associated to a picture or a video you just need to enforce that it exists in one of the *_ARRTMAPPING tables:
select *
from tags t
left outer join picture_attrmappings pa
on t.id = pa.assetvalue
left outer join video_attrmappings va
on t.id = va.assetvalue
where t.updatedate >= sysdate - 7
and ( pa.assetvalue is not null
or va.assetvalue is not null
)
You then want the tag IDs that have been modified most; so you need to ORDER BY the count:
select t.id
from tags t
left outer join picture_attrmappings pa
on t.id = pa.assetvalue
left outer join video_attrmappings va
on t.id = va.assetvalue
where t.updatedate >= sysdate - 7
and ( pa.assetvalue is not null
or va.assetvalue is not null
)
group by t.id
order by count(*) desc
On a slightly different note this is quite a strange schema (assuming you've left nothing out). I would expect PICTURE_ATTRMAPPINGS to be a junction table between TAGS and PICTURE. The TAGS table should store a unique list of tags but this doesn't seem to be what it does. The UPDATEDATE should then be in either PICTURE_ARRTMAPPINGS as you can then know when each tag was last updated or in PICTURE so you know when all tags were last updated (or both).
I suspect you're missing some of the schema but I see no method, with what you've provided, of using either PICTURE or VIDEO.
Upvotes: 1