Reputation: 101
I have a data frame with 300 columns, and I want to split it based on the values in a mileage column (MPG):
Model MPG Origin
1 chevrolet chevelle malibu 18.0 US
2 buick skylark 320 15.0 US
3 plymouth satellite 18.0 US
4 amc rebel sst 16.0 US
5 ford torino 17.0 US
6 ford galaxie 500 15.0 US
7 chevrolet impala 14.0 US
8 plymouth fury iii 14.0 US
9 pontiac catalina 14.0 US
10 amc ambassador dpl 15.0 US
11 dodge challenger se 15.0 US
I want to split the data frame into three groups by MPG: less than 14, 14-17, and greater than 17.
y is my parent data set; I want to split it into low, medium and high data sets using the ranges above.
I was trying to use a for loop to append the values less than 13.6 to vectors and then put those into a separate data frame named low:
for(i in 1:nrow(y)){
if(y[i,2] <13.6){
low_arrayMPG.append(y[i,2])
low_arrayModel.append(y[i,1])
low_arrayOrigin.append(y[i,3])
}
}
Could anyone tell me whether this approach is right, or whether there is a function in R built for this purpose that would make it easier to split a data frame into the desired sub data frames?
Upvotes: 0
Views: 994
Reputation: 47350
Maybe you'll also like these:
split(df1,(df1$MPG>=14)+(df1$MPG>17))
# $`1`
# Model MPG Origin
# 2 buick skylark 320 15 US
# 4 amc rebel sst 16 US
# 5 ford torino 17 US
# 6 ford galaxie 500 15 US
# 7 chevrolet impala 14 US
# 8 plymouth fury iii 14 US
# 9 pontiac catalina 14 US
# 10 amc ambassador dpl 15 US
# 11 dodge challenger se 15 US
#
# $`2`
# Model MPG Origin
# 1 chevrolet chevelle malibu 18 US
# 3 plymouth satellite 18 US
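The grouping key works because R coerces logicals to 0/1 when they are added, so the sum counts how many thresholds each value clears. A minimal sketch on made-up values (mpg_demo and the label names are illustrative assumptions, not part of the question's data):

```r
# Hypothetical values chosen to hit every boundary case
mpg_demo <- c(13, 14, 15.5, 17, 17.1, 20)

# TRUE/FALSE become 1/0 when added:
# 0 = below 14, 1 = 14 to 17 inclusive, 2 = above 17
key <- (mpg_demo >= 14) + (mpg_demo > 17)

# Optional: map the numeric key to readable labels
labels <- c("low", "medium", "high")[key + 1]

print(data.frame(mpg_demo, key, labels))
```

Since the sample data has no MPG below 14, the split above only returns groups 1 and 2; a group 0 would appear as soon as such a row exists.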
library(dplyr)
library(tidyr)
df1 %>% group_by(spl = (MPG>=14) + (MPG>17)) %>% nest
# # A tibble: 2 x 2
# spl data
# <int> <list>
# 1 2 <tibble [2 x 3]>
# 2 1 <tibble [9 x 3]>
Data:
df1 <- read.table(text=" Model MPG Origin
1 'chevrolet chevelle malibu' 18.0 US
2 ' buick skylark 320' 15.0 US
3 ' plymouth satellite' 18.0 US
4 ' amc rebel sst' 16.0 US
5 ' ford torino' 17.0 US
6 ' ford galaxie 500' 15.0 US
7 ' chevrolet impala' 14.0 US
8 ' plymouth fury iii' 14.0 US
9 ' pontiac catalina' 14.0 US
10 ' amc ambassador dpl' 15.0 US
11 ' dodge challenger se' 15.0 US", header = TRUE, stringsAsFactors = FALSE)
Upvotes: 0
Reputation: 887851
We could use findInterval to create a grouping variable for splitting the dataset into a list of data.frames:
lst <- split(df1, findInterval(df1$MPG, c(14, 17), rightmost.closed = TRUE))
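To see how the breakpoints are treated, here is a small sketch on made-up values (mpg_demo, demo_df and the low/medium/high names are illustrative assumptions). By default each interval is closed on the left, so 14 lands in group 1; rightmost.closed = TRUE additionally puts a value equal to the last breakpoint, 17, in group 1 rather than group 2.

```r
# Hypothetical values around the two breakpoints
mpg_demo <- c(13, 14, 16, 17, 18)

# 0 = below 14, 1 = 14 to 17 inclusive, 2 = above 17
grp <- findInterval(mpg_demo, c(14, 17), rightmost.closed = TRUE)

# Wrapping the key in a factor gives the list elements readable names
# and keeps a (zero-row) element for any group that happens to be empty
demo_df <- data.frame(MPG = mpg_demo)
lst_demo <- split(demo_df,
                  factor(grp, levels = 0:2, labels = c("low", "medium", "high")))

print(names(lst_demo))
print(lst_demo$medium)
```

The individual pieces can then be pulled out by name, for example lst_demo$low.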
Upvotes: 3
Reputation: 522636
I think you can just subset your data frame (df) as follows:
df_low <- df[df$MPG < 14, ]
df_medium <- df[df$MPG >= 14 & df$MPG <= 17, ]
df_high <- df[df$MPG > 17, ]
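One thing to watch with plain logical subsetting: if the MPG column contains missing values, the comparison yields NA and R returns an all-NA row for each of them. A hedged sketch of one way around that, using which() on a small made-up data frame (the rows here are hypothetical, and the NA stands in for a missing measurement):

```r
# Hypothetical data with one missing MPG value
df <- data.frame(Model = c("a", "b", "c", "d"),
                 MPG = c(13, 15, 18, NA),
                 stringsAsFactors = FALSE)

# which() keeps only the positions where the condition is TRUE,
# so rows with NA in MPG are dropped instead of showing up as NA rows
df_low    <- df[which(df$MPG < 14), ]
df_medium <- df[which(df$MPG >= 14 & df$MPG <= 17), ]
df_high   <- df[which(df$MPG > 17), ]

print(df_low)
print(df_medium)
print(df_high)
```

If the column has no missing values, this behaves the same as the direct subsetting shown above.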
Upvotes: 3