user1907822

Reputation: 21

Compare more than 2 lists in R

I have been learning the R programming language for a month and am having some difficulty with lists and data frames. I couldn't figure out how to find the intersection of more than two lists. I created four lists, each containing a name, gender, age, three favorite movies, support for the UN, and the birthdays (day and month) of immediate family members:

x <- list("Corinna Neubach", "female", 24, list("Film1", "Film2", "Film3"), TRUE, list("31.05", "19.12"))
z <- list("Yasmin Ritschl", "female", 21, list("Film6", "Film7", "Film8"), TRUE, list("20.03", "10.12"))
a <- list("Stefan Braun", "male", 23, list("Film6", "Film7", "Film8"), TRUE, list("25.06", "15.12"))
y <- list("Melissa Okay", "female", 23, list("Film3", "Film4", "Film5"), TRUE, list("31.05", "10.12"))

I would like to check whether any birthdays or names are shared among the four lists. First I wrote some code using Reduce, but it doesn't give the result I want. Then I tried intersect, but I think there should be a simpler way to do this:

intersect(x[[6]],y[[6]])
intersect(x[1],y[1])
intersect(x[[6]],z[[6]])
intersect(x[1],z[1])
intersect(y[[6]],z[[6]])
intersect(y[1],z[1])
intersect(x[[6]],z[[6]])
intersect(x[1],z[1])
intersect(a[[6]],x[[6]])
intersect(a[1],x[1])
intersect(a[[6]],z[[6]])
intersect(a[1],z[1])
intersect(a[[6]],y[[6]])
intersect(a[1],y[1])

Upvotes: 1

Views: 497

Answers (1)

cbeleites

Reputation: 14093

First of all, I don't think single-person lists are the appropriate data structure for your task. They all have the same structure, which is an indicator that a data.frame would be appropriate.

While data.frames can contain lists as elements, your data suggests translating the lists into the tables of a normalized relational database. You can map that to 2 or 3 data.frames in R:

  • The person data.
  • The 3 favourite films: if they are ordered (1st, 2nd, 3rd choice), you can use data.frame columns of the person table for that. If not, pull them into an extra data.frame with columns person and film.
  • The birth dates of the relatives: I guess it is a coincidence that your example data gives exactly 2 of them for each person, so pull them into another data.frame.
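As a rough illustration of the layout described above, the four lists from the question could be mapped to three data.frames as sketched below. The object names and the column names (supports_un, birthday, and so on) are made up for this example, and the films are treated as unordered, which is an assumption.

```r
# Person table: one row per person (values copied from the question).
# Column names are illustrative, not prescribed by anything in R.
persons <- data.frame(
  person      = c("Corinna Neubach", "Yasmin Ritschl", "Stefan Braun", "Melissa Okay"),
  gender      = c("female", "female", "male", "female"),
  age         = c(24, 21, 23, 23),
  supports_un = c(TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Favourite films in long format: one row per (person, film) pair.
# Assumes the three films are not ranked.
films <- data.frame(
  person = rep(persons$person, each = 3),
  film   = c("Film1", "Film2", "Film3",
             "Film6", "Film7", "Film8",
             "Film6", "Film7", "Film8",
             "Film3", "Film4", "Film5"),
  stringsAsFactors = FALSE
)

# Relatives' birthdays in long format: any number of rows per person.
birthdays <- data.frame(
  person   = rep(persons$person, each = 2),
  birthday = c("31.05", "19.12",
               "20.03", "10.12",
               "25.06", "15.12",
               "31.05", "10.12"),
  stringsAsFactors = FALSE
)
```

With the long format, a person with one or five relatives simply contributes one or five rows to birthdays, so nothing depends on every person having exactly two entries.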

For hunting duplicates, have a look at ?table.
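A minimal sketch of how table could be used for this, assuming the birthdays have already been put into a long data.frame with the hypothetical columns person and birthday (data copied from the question):

```r
# Relatives' birthdays, one row per (person, birthday) pair.
# The column names are assumptions made for this example.
birthdays <- data.frame(
  person   = rep(c("Corinna Neubach", "Yasmin Ritschl",
                   "Stefan Braun", "Melissa Okay"), each = 2),
  birthday = c("31.05", "19.12",
               "20.03", "10.12",
               "25.06", "15.12",
               "31.05", "10.12"),
  stringsAsFactors = FALSE
)

# Drop repeated (person, birthday) rows first, so that a date listed twice
# for the same person is not mistaken for a date shared between persons.
counts <- table(unique(birthdays)$birthday)

# Birthdays that occur for more than one person
shared <- names(counts[counts > 1])
shared
# [1] "10.12" "31.05"

# Who shares them
birthdays[birthdays$birthday %in% shared, ]
```

The same pattern applied to a name column would reveal shared names; in the example data every name occurs only once.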


Edit: regarding the requirement to build a list: data.frames are lists in R:

> a <- data.frame (person = "John Doe", gender = "female")
> a
    person gender
1 John Doe female
> is.list (a)
[1] TRUE

Upvotes: 2
