OctaveParango

Reputation: 331

Creating a sub-matrix/dataframe in R

Question about dataframe manipulation. This is not homework, although I will phrase it as a homework problem. I know there is a three-line way of doing this, but I can't seem to find it.

Problem
I have a dataframe A with 10 columns and a thousand rows, filled with some data. I have a matrix B with 1 column and a thousand rows, filled with 1s and 0s (TRUE or FALSE).

Goal: create a new dataframe C that contains only the rows of A that have a TRUE value at the corresponding index of matrix B.

(More clarification in case it's not clear:) I performed operations on A, stored the result (TRUE or FALSE) in a one-column dataframe (of the same length, of course), and now wish to extract the matching rows into their own dataframe.
The size of the new dataframe is sum(B) x 10 (just a logical deduction).

Thank you for reading! Your help is appreciated.

Example

Dataframe A

Name   State   metric1 metric2

Joe     MA      23       25
Moe     AZ      123      2971
Bo      CA      938      387
Yo      UT       1        3

matrix B

1
0
1
0

Desired Output (dataframe C)

Name   State   metric1 metric2

Joe     MA      23       25
Bo      CA      938      387

Upvotes: 0

Views: 1124

Answers (2)

rafa.pereira

Reputation: 13817

Alternative using data.table:

library(data.table)

# convert your dataset into a data.table
  setDT(A)

# create a dummy variable (TRUE or FALSE) from operations on other columns;
# use row-wise arithmetic here, since sum() would collapse a column to one value
  A[ , logical := (metric1 + metric2) > 50 ]

# subset the data
  A[ logical == TRUE ]

Upvotes: 1

rbm

Reputation: 3253

Why not simply

A <- data.frame(cbind(A=1:10,B=21:30,C=41:50))
B <- data.frame(D=rep(c(TRUE,FALSE),5))
C <- A[B$D,]

giving

> A
    A  B  C
1   1 21 41
2   2 22 42
3   3 23 43
4   4 24 44
5   5 25 45
6   6 26 46
7   7 27 47
8   8 28 48
9   9 29 49
10 10 30 50
> B
       D
1   TRUE
2  FALSE
3   TRUE
4  FALSE
5   TRUE
6  FALSE
7   TRUE
8  FALSE
9   TRUE
10 FALSE
> C
  A  B  C
1 1 21 41
3 3 23 43
5 5 25 45
7 7 27 47
9 9 29 49

EDIT after OP's edit

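# build the data frame directly; wrapping the columns in cbind() would
# coerce the numeric metrics to character
A <- data.frame(name = c("Joe", "Moe", "Bo", "Yo"),
                State = c("MA", "AZ", "CA", "UT"),
                metric1 = c(23, 123, 938, 1),
                metric2 = c(25, 2971, 387, 3))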
B <- c(1,0,1,0)
C <- A[B==1,]
C

giving

> C
  name State metric1 metric2
1  Joe    MA      23      25
3   Bo    CA     938     387
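
If B really is a one-column matrix of 1s and 0s, as the question describes, rather than a plain vector, the same idea should still apply once the column is pulled out and converted to logical. The following is only a minimal sketch using the example data from the question; the name `keep` is made up for illustration.

```r
# Example data from the question
A <- data.frame(Name = c("Joe", "Moe", "Bo", "Yo"),
                State = c("MA", "AZ", "CA", "UT"),
                metric1 = c(23, 123, 938, 1),
                metric2 = c(25, 2971, 387, 3))

# B as a one-column 0/1 matrix (assumption: same number of rows as A)
B <- matrix(c(1, 0, 1, 0), ncol = 1)

# take the single column and turn 1/0 into TRUE/FALSE
keep <- as.logical(B[, 1])

# keep only the flagged rows; drop = FALSE keeps C a data frame
C <- A[keep, , drop = FALSE]

# sanity check from the question: C should have sum(B) rows
stopifnot(nrow(C) == sum(B))
```

If B can contain NA, logical indexing returns rows filled with NA for those positions; indexing with which(keep) instead drops them.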

Upvotes: 1
