Filter dataframe rows by closeness to the values in matrix rows in R

Question

I have a dataframe in R with approximately 500 rows and the columns x, y, and z, and a matrix with a single column and three rows; call them a, b, and c. I want to filter the dataframe based on the values in the matrix rows. Specifically, I want to find the row of the dataframe whose value in column x is closest to the value in row a of the matrix, whose value in column y is closest to the value in row b, and whose value in column z is closest to the value in row c. I feel that this should be quite straightforward, but I must be missing something.

Basically, I just need to return the row of the dataframe whose values most closely match the data in the matrix, so I can determine which row is most representative of the matrix.

Here's an example:

x <- c(52, -36, 45, 756, 12, 23, 45)
y <- c(34, 56, 68, 23, -4, 2, 23)
z <- c(-1, 2, 5, 4, 6, -4, 3)

df <- data.frame(x, y, z)
vector <- c(-60,20,7)

I want to filter df based on the values in vector so that I get back the single row whose values across the three columns most closely match the vector.

Ronak Shah · Accepted Answer

One way is to subtract vector from the dataframe column-wise, that is, subtract vector[1] from column 1, vector[2] from column 2, and so on; take the absolute values; sum the differences row-wise; and select the row with the minimum sum.

df[which.min(rowSums(abs(sweep(df, 2, vector, `-`)))), ]
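
To see what the one-liner does, here is a step-by-step sketch on the data from the question. It assumes the target values start out as the 3 x 1 matrix described in the question; the names m, target, diffs, total_diff, and best are illustrative and not part of the original post.

```r
# Data from the question
x <- c(52, -36, 45, 756, 12, 23, 45)
y <- c(34, 56, 68, 23, -4, 2, 23)
z <- c(-1, 2, 5, 4, 6, -4, 3)
df <- data.frame(x, y, z)

# Assumption: the target values arrive as a 3 x 1 matrix with rows a, b, c
m <- matrix(c(-60, 20, 7), ncol = 1,
            dimnames = list(c("a", "b", "c"), NULL))

# Flatten the matrix to a plain numeric vector, one value per column of df
target <- as.vector(m)

# Step 1: subtract target[j] from every value in column j
diffs <- sweep(df, 2, target, `-`)

# Step 2: take absolute differences and add them up within each row
total_diff <- rowSums(abs(diffs))

# Step 3: pick the row with the smallest total difference
best <- which.min(total_diff)
df[best, ]
```

On the example data the row totals are 134, 65, 155, 822, 97, 112, and 112, so row 2 (x = -36, y = 56, z = 2) is returned. Because the differences are summed unscaled, a column with a much wider range, such as x here, has the most influence on which row wins.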
