Reputation: 5224
I want to aggregate over a data frame and sum a column by category. I have this:
my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,3))
aggregate(x = my_basket[3,], by = list(my_basket[1,]), FUN = sum)
But it gives me an error telling me that
Error in aggregate.data.frame(x = my_basket[3, ], by = list(my_basket[1, :
  arguments must have same length
Calls: -> -> aggregate -> aggregate.data.frame
Execution halted
How should I reference the column by index?
I am new to R, and I think I don't quite get how to reference a data frame by column. All the examples I see use names. I can't narrow my search well enough, hence this question.
Upvotes: 1
Views: 68
Reputation: 886938
We could also use the formula method of aggregate:
aggregate(Price ~ ITEM_GROUP, my_basket, FUN = sum)
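As a small extension of the same formula method (a sketch, not something the question asked for), cbind() on the left-hand side should let you sum more than one column per group. The data frame is repeated from the question so the snippet runs on its own; the name totals is just an illustrative choice.

```r
# my_basket, copied from the question so this runs standalone
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit",
                 "Vegetable","Vegetable","Vegetable","Vegetable",
                 "Dairy","Dairy","Dairy","Dairy","Dairy"),
  ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya",
                "Carrot","Potato","Brinjal","Raddish",
                "Milk","Curd","Cheese","Milk","Paneer"),
  Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
  Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,3)
)

# Sum both Price and Tax within each ITEM_GROUP
totals <- aggregate(cbind(Price, Tax) ~ ITEM_GROUP, data = my_basket, FUN = sum)
print(totals)
```

Note that the formula method refers to columns by name, so it does not by itself answer the "by index" part of the question.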
Upvotes: 2
Reputation: 39647
With [1,] you are subsetting rows and not columns.
With [,1] you select the first column as a vector.
With [1] you select the first column as a data.frame.
aggregate(x = my_basket[3], by = my_basket[1], FUN = sum)
#   ITEM_GROUP Price
# 1      Dairy   305
# 2      Fruit   415
# 3  Vegetable   225
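To make the difference between the three indexing forms concrete, here is a minimal sketch that inspects what each one returns. The data frame is repeated from the question so the snippet runs on its own; the variable names (first_row, col_vec, col_df, res) are only illustrative.

```r
# my_basket, copied from the question so this runs standalone
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit",
                 "Vegetable","Vegetable","Vegetable","Vegetable",
                 "Dairy","Dairy","Dairy","Dairy","Dairy"),
  ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya",
                "Carrot","Potato","Brinjal","Raddish",
                "Milk","Curd","Cheese","Milk","Paneer"),
  Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
  Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,3)
)

first_row <- my_basket[1, ]  # first ROW: a data.frame with 1 row and 4 columns
col_vec   <- my_basket[, 1]  # first COLUMN as a plain vector of length 14
col_df    <- my_basket[1]    # first COLUMN as a one-column data.frame

dim(first_row)           # 1 4
is.data.frame(col_vec)   # FALSE
dim(col_df)              # 14 1

# `by` must be a list. A data.frame already is one, while a vector
# has to be wrapped in list(); naming the element names the output column.
res <- aggregate(x = my_basket[3], by = list(ITEM_GROUP = my_basket[, 1]), FUN = sum)
print(res)
```

Both my_basket[1] and list(my_basket[, 1]) work as the by argument; the one-column data.frame form is shorter and keeps the column name without extra typing.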
Upvotes: 2