Fatma S
Fatma S

Reputation: 185

Using SQL join and subquery to query two tables in R

I'm a beginner.

I have two .txt files and I'm using R with sqldf pakage to query them

The first table (venues.txt) look like this:

userID,venueID,year,month,date,hour
1302,47,2012,2,24,11
45,132,2012,2,24,11
24844,86,2012,2,24,11
896,248,2012,2,24,11
5020,29,2012,2,24,11

The second table (friends.txt) look like this:

userID,friendID
1,5
1,9
1,50
1,102
1,300

I want to query the venues (venueID) that a user (say userID=1) visited WITH one or more of his friends (friendID)

Note: both userID,friendID of friends table could be linked to userID in venues table

the query results should look like this:

venueID  friendID
47       5
47       9
29       102
86       102

I can do this using many separate queries and then join them in a table but my dataset is very larg. Is there any easier way to do this?

I was able to query all venues that have been visited by a user or his friends:

sqldf("select userID, venueID from data
       where userID=1 OR userID IN (select friendID from freind where userID=1)")

Many thanks.

Upvotes: 4

Views: 2026

Answers (1)

Ali
Ali

Reputation: 949

I am a Java pl/sql developer so here is my shot to answer to:"a list of venues that were visited by at least two friends" using only join and assuming data from venues.txt is called venues and friends.txt file is called friend in the FROM clause. Basically, I am assuming that those files are tables.

SELECT v1.venueID, f.friendID

FROM venues v1 
INNER JOIN friends f ON v1.userID = f.userID 
INNER JOIN venues v2 ON v2.userID = f.friendID

WHERE
   v1.venueID = v2.venueID

and if you want to add more conditions i.e."at least two friends visited together, so having the same year, month, date, hour" then just add them to the filter(WHERE clause). So the query would look like this:

SELECT v1.venueID, f.friendID

FROM venues v1 
INNER JOIN friends f ON v1.userID = f.userID 
INNER JOIN venues v2 ON v2.userID = f.friendID

WHERE
   v1.venueID = v2.venueID
   v1.year = v2.year
   v1.month = v2.month
   v1.date = v2.date
   v1.hour = v2.hour

You might need to use DISTINCT in the SELECT statement if there are more than 2 friends at the venue(or optionally at the same time).

Upvotes: 1

Related Questions