Reputation: 1354
I think this is similar to, but not the same as, a previous question that I asked here: Pull specific rows
Here is the code that I am now working with:
City <- c("x","x","y","y","z","z")
Type <- c("a","b","a","b","a","b")
Value <- c(1,3,2,5,6,10)
cbind.data.frame(City,Type,Value)
Which produces:
City Type Value
1 x a 1
2 x b 3
3 y a 2
4 y b 5
5 z a 6
6 z b 10
I want to do something similar to before, but now two different conditions must be met to pull a specific number. Let's say we had a matrix,
testmat <- matrix(c("x","x","y","a","b","b"),ncol=2)
Which looks like this:
[,1] [,2]
[1,] "x" "a"
[2,] "x" "b"
[3,] "y" "b"
The desired outcome is
[,1] [,2] [,3]
[1,] "x" "a" 1
[2,] "x" "b" 3
[3,] "y" "b" 5
Another Question PLEASE ANSWER THIS PART
library(dplyr)  # provides inner_join()
City <- c("x","x","x","x","y","y","x","z")
Type <- c("a","a","a","a","a","b","a","b")
Value <- c(1,3,2,5,6,10,11,15)
mat <- cbind.data.frame(City,Type,Value)
mat
testmat <- matrix(c("y","x","b","a"),ncol=2)
testmat <- data.frame(testmat)
testmat
test <- inner_join(mat,testmat,by = c("City"="X1", "Type"="X2"))
Why does the inner_join function give me a warning message when I run this? Here is the warning that I get:
In inner_join_impl(x, y, by$x, by$y) :
joining factors with different levels, coercing to character vector
The desired output is:
City Type Value
1 y b 10
2 x a 1
3 x a 3
4 x a 2
5 x a 5
6 x a 11
but it is producing...
City Type Value
1 x a 1
2 x a 3
3 x a 2
4 x a 5
5 y b 10
6 x a 11
I want the inner_join function to return the rows in the order in which their keys appear in testmat, as shown above. Since City "y" of Type "b" comes first in testmat, I want it to come first in "test".
Upvotes: 0
Views: 569
Reputation: 3223
Answer to second part: The warning states that you are trying to join on two factors with different levels. Therefore, the variables are coerced to "character" before joining; there's no problem with that. As Mostafa Rezaei mentioned in his answer, R converts character vectors to factors when creating a data frame (this was the default before R 4.0.0). Usually it's best to keep them as characters:
mat <- data.frame(City, Type, Value, stringsAsFactors = FALSE)
testmat <- data.frame(testmat, stringsAsFactors = FALSE)
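If the data frames already exist with factor columns, one possible way to avoid the warning is to convert those columns to character before joining. This is only a sketch: it assumes dplyr 1.0 or later (for across()), and the small data frames and the names mat_chr, testmat_chr and joined are made up for illustration.

```r
library(dplyr)

# Build the columns as factors on purpose, mimicking the pre-4.0.0 default.
mat <- data.frame(City = c("x", "y", "z"),
                  Type = c("a", "b", "b"),
                  Value = c(1, 10, 15),
                  stringsAsFactors = TRUE)
testmat <- data.frame(X1 = c("y", "x"),
                      X2 = c("b", "a"),
                      stringsAsFactors = TRUE)

# Convert every factor column to character in both data frames.
mat_chr <- mat %>% mutate(across(where(is.factor), as.character))
testmat_chr <- testmat %>% mutate(across(where(is.factor), as.character))

# Both key columns are now character, so no coercion is needed in the join.
joined <- inner_join(mat_chr, testmat_chr, by = c("City" = "X1", "Type" = "X2"))
```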
Concerning your real question:
The order of the result of a join is not defined. If order is crucial to you, you can use an additional sorting variable:
mat %>%
mutate(rn = row_number()) %>%
semi_join(testmat, by = c("City"="X1", "Type"="X2")) %>%
arrange(rn)
btw: I think you're looking for a semi_join rather than an inner_join; read the help file for the differences.
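Note that arrange(rn) in the snippet above restores the row order of mat. If the order of testmat is what matters, the same sorting-variable idea can be applied to testmat instead. A rough sketch, assuming the data from the question; the helper columns rn and mat_rn are names I made up, and the second one is only there to make the order within each key explicit:

```r
library(dplyr)

mat <- data.frame(City = c("x", "x", "x", "x", "y", "y", "x", "z"),
                  Type = c("a", "a", "a", "a", "a", "b", "a", "b"),
                  Value = c(1, 3, 2, 5, 6, 10, 11, 15),
                  stringsAsFactors = FALSE)
testmat <- data.frame(X1 = c("y", "x"),
                      X2 = c("b", "a"),
                      stringsAsFactors = FALSE)

test <- testmat %>%
  mutate(rn = row_number()) %>%                 # remember testmat's row order
  inner_join(mutate(mat, mat_rn = row_number()),  # and mat's row order
             by = c("X1" = "City", "X2" = "Type")) %>%
  arrange(rn, mat_rn) %>%                       # testmat order first, then mat order
  select(City = X1, Type = X2, Value)           # drop helpers, restore names

test
```

Sorting explicitly means the result does not depend on whatever row order the join itself happens to return.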
Upvotes: 0
Reputation: 629
The warning arises because data.frame() converts string vectors to factors by default (in R versions before 4.0.0). You can change this behaviour by running the following code at the start of your script:
options(stringsAsFactors = FALSE)
Upvotes: 0
Reputation: 1354
The solution is to just switch the order of testmat and mat, like so:
test <- inner_join(testmat,mat,by = c("X1"="City", "X2"="Type"))
I find it interesting that the by parameter has to follow the same order as the data frames passed to the inner_join function: the names refer to columns of the first data frame and the values to columns of the second.
Upvotes: 2