Sebastian

Reputation: 633

subset rows and columns in a dataframe based on boundary conditions

I am having trouble describing what I want, which is probably why I haven't found anything helpful yet. The example should make it clear. Suppose I have an m x m matrix structure of coordinates, say ranging from A1 to E5, and I want to subset the rows/columns that lie k lines in from the outer coordinates (counting the outermost line as 1).

In my example k is 2. So I want to select all records in the data frame which have the coordinates B2, B3, B4, C2, C4, D2, D3, D4. Manually, I would do the following:

cc <- data.frame(x=(LETTERS[1:5]), y=c(rep(1,5),rep(2,5),rep(3,5), rep(4,5), rep(5,5)) , z=rnorm(25))
slct <- with(cc, which( (x=="B" | x=="C" | x=="D" ) & (y==2 | y==3 | y==4) & !(x=="C" & y==3) ))
cc[slct,] # result data frame

But this approach will not scale as the matrix dimensions increase. Any better ideas?

Upvotes: 0

Views: 482

Answers (2)

Backlin

Reputation: 14872

Rather hard to read but it does the trick.

m <- 5   # Matrix dimensions
k <- 2   # The index of the inner square that you want to extract
cc[(cc$x %in% LETTERS[c(k,m-k+1)] & !cc$y %in% c(1:(k-1), m:(m-k+2))) |
   (cc$y %in% c(k, m-k+1)         & !cc$x %in% LETTERS[c(1:(k-1), m:(m-k+2))]),]

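The first line of comparisons extracts the k-th column from the left and right edges of the matrix, excluding the parts that are closer than k to the upper and lower edges. The second line does the same thing for rows. Note that this assumes k >= 2: for k = 1, 1:(k-1) evaluates to c(1, 0), so the exclusion also removes the outermost lines and the corner cells are dropped.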
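
The same ring can also be described by distance to the nearest edge: a cell belongs to the k-th ring when the smallest of its four edge distances equals k. Below is a minimal sketch of that idea, assuming columns are labelled with the first m entries of LETTERS and rows run from 1 to m. The names cc_demo, xi and ring are illustrative and not part of the original answer.

```r
m <- 5   # matrix dimension
k <- 2   # ring to extract, counted from the outer edge (k = 1 is the border)

# Rebuild the example data so the snippet runs on its own
set.seed(1)
cc_demo <- data.frame(x = rep(LETTERS[1:m], times = m),
                      y = rep(1:m, each = m),
                      z = rnorm(m * m))

# Convert the column letter to a numeric index (A -> 1, B -> 2, ...)
xi <- match(cc_demo$x, LETTERS)

# 1-based distance to the nearest edge, taken over all four edges
ring <- pmin(xi, m + 1 - xi, cc_demo$y, m + 1 - cc_demo$y)

# Keep only the cells that sit exactly on the k-th ring
res <- cc_demo[ring == k, ]
paste0(res$x, res$y)
```

Because the ring number is computed per row, this form should also handle k = 1 without a special case, and changing m or k requires no other edits.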

Upvotes: 2

Andy

Reputation: 4659

cc$xy <- paste0(cc$x,cc$y)

coords <- c("B2","B3","B4", "C2", "C4", "D2", "D3", "D4")
cc[cc$xy %in% coords,]

#   x y          z xy
#7  B 2 -0.9031472 B2
#8  C 2 -0.1405147 C2
#9  D 2  1.6017619 D2
#12 B 3  1.7713041 B3
#14 D 3 -0.2005749 D3
#17 B 4  1.8671238 B4
#18 C 4  0.3428815 C4
#19 D 4  0.1470436 D4
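
For larger grids, the coords vector could be generated instead of typed out. The following is only a sketch, assuming the same layout of lettered columns and numbered rows; ring_coords is a hypothetical helper and not part of the original answer.

```r
# Hypothetical helper: labels of the k-th ring of an m x m grid
ring_coords <- function(m, k) {
  lo <- k           # first index of the inner square
  hi <- m - k + 1   # last index of the inner square
  stopifnot(k >= 1, lo <= hi, m <= length(LETTERS))

  # All cells of the inner square spanning lo..hi in both directions
  g <- expand.grid(i = lo:hi, j = lo:hi)

  # Keep only the cells on the border of that square
  on_edge <- g$i %in% c(lo, hi) | g$j %in% c(lo, hi)

  sort(paste0(LETTERS[g$i[on_edge]], g$j[on_edge]))
}

coords <- ring_coords(5, 2)
coords
```

The result can then be passed to the same cc$xy %in% coords filter used above.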

Upvotes: 2
