Alytas
Alytas

Reputation: 61

How to delete rows in one table, based on the values of another table

I have a problem. I have two different tables. One that contains user information, and their ID numbers and another table that also has ID numbers and a second column that says whether the User is an admin or not.

Table 1

UserID AssigID Score      Date       Time

14532   23956   52      11.11.2017    401
12254   23956   22      18.11.2017    256
12644   23956   74      17.12.2012    365
11257   23957   45      10.10.2012    102
12667   23958   25      10.11.2012    109

Table 2

UserID  Admin

14532   TRUE
12254   FALSE
12644   FALSE
11257   FALSE
12667   FALSE

As in table 2 userID 14532 is an admin, I would like to delete the data of this user from the table1. How to delete in table 1 the row where user is an Admin, based on the data from the Table 2?

As

Upvotes: 1

Views: 974

Answers (1)

Jaap
Jaap

Reputation: 83215

A possible approach with base R:

tab1[tab1$UserID %in% tab2$UserID[!tab2$Admin],]

which gives:

  UserID AssigID Score           TimeStamp TimeOnTask
2  12254   23956    22 2017-11-18 13:16:00        256
3  12644   23956    74 2012-12-17 13:18:00        365
4  11257   23957    45 2012-10-10 13:29:00        102
5  12667   23958    25 2012-11-10 13:40:00        109

What this does:

  • tab2$UserID[!tab2$Admin] gives a vector of user ID's that are not an Admin. The !tab2$Admin part makes sure only the ID's that are not an Admin are selected.
  • with tab1$UserID %in% ... you select only the user ID's from tab1 that are in the vector from the first step. This returns a logical vector with which you subsequently subset tab1

Used data:

tab1 <- structure(list(UserID = c(14532L, 12254L, 12644L, 11257L, 12667L),
                       AssigID = c(23956L, 23956L, 23956L, 23957L, 23958L),
                       Score = c(52L, 22L, 74L, 45L, 25L),
                       TimeStamp = structure(c(1510402260, 1511007360, 1355746680, 1349868540, 1352551200), class = c("POSIXct", "POSIXt"), tzone = ""),
                       TimeOnTask = c(401L, 256L, 365L, 102L, 109L)),
                  .Names = c("UserID", "AssigID", "Score", "TimeStamp", "TimeOnTask"), row.names = c(NA, -5L), class = "data.frame")
tab2 <- structure(list(UserID = c(14532L, 12254L, 12644L, 11257L, 12667L),
                       Admin = c(TRUE, FALSE, FALSE, FALSE, FALSE)),
                  .Names = c("UserID", "Admin"), class = "data.frame", row.names = c(NA, -5L))

Upvotes: 2

Related Questions