Reputation: 45
What I want to accomplish: I want to join 2 tables on urls. But one table has url formats that are a bit different from the other. As they are the only common denominator between the two tables, (no unique id's to join on) I have to use wildcards to match the url's.
For example one table has formats like these: 'www.url.com', 'url.com.', 'abc.def.url.com' while another table has format 'url.com' So in order to match I need to do something like
LEFT JOIN t1
ON t1.url = '%.'||t2.url OR t1.url= t2.url||'.' OR etc...
Is this the right way to approach this?
I'm using PostgreSQL.
Upvotes: 1
Views: 84
Reputation: 1099
You have a very strict constraint as you are joining on URLs. url.com should match url.com and www.url.com should match url.com but www.dummyurl.com shouldnt match www.myurl.com. Your original query is fine and is the right way to approach the problem.
To avoid overmatching, would recommend using this
select * from t1 inner join t2
on t1.url like '%.'||t2.url or t1.url = t2.url
I have used inner join to avoid NULL matching Here's a demo
This assumes that t2.url will be at the end of t1.uel after a period or both strictly match. Try running this and see if you get your anticipated results
Upvotes: 1
Reputation: 1780
Just a guess, can you use CONTAINS()?
LEFT JOIN t1
ON CONTAINS(t1.url, t2.url)
Edit: Well, seems that PostGRE doesn't support CONTAINS.
Try using position() function. It returns 0 if the substring is not found
ON position(t2.url in t1.url) <> 0
Upvotes: 0