mslot

Reputation: 5224

Aggregate and sum a dataframe

I want to aggregate a data frame and sum a column by category. This is what I have:

my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
                       ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
                       Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,120),
                       Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,3))
aggregate(x = my_basket[3,], by = list(my_basket[1,]), FUN = sum)

But it fails with this error:

Error in aggregate.data.frame(x = my_basket[3, ], by = list(my_basket[1, :
  arguments must have same length
Calls: -> -> aggregate -> aggregate.data.frame
Execution halted

How should I reference the column by index?

I am new to R, and I don't think I quite understand how to reference a data frame by column. All the examples I have seen use names. I couldn't narrow my search well enough, hence this question.

Upvotes: 1

Views: 68

Answers (2)

akrun

Reputation: 886938

We could also use the formula method of aggregate, which refers to the columns by name:

aggregate(Price ~ ITEM_GROUP, my_basket, FUN = sum)
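The formula interface can also sum more than one column per group by wrapping the left-hand side in cbind(). The following is a minimal sketch, not part of the original answer; it rebuilds the data frame from the question so that it runs on its own, and the name totals is only illustrative.

```r
# Same data as in the question, repeated so the example is self-contained
my_basket <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 5), rep("Vegetable", 4), rep("Dairy", 5)),
  ITEM_NAME = c("Apple", "Banana", "Orange", "Mango", "Papaya", "Carrot", "Potato",
                "Brinjal", "Raddish", "Milk", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  Tax = c(2, 4, 5, 6, 2, 3, 5, 1, 3, 4, 5, 6, 4, 3)
)

# Sum both Price and Tax within each ITEM_GROUP
totals <- aggregate(cbind(Price, Tax) ~ ITEM_GROUP, data = my_basket, FUN = sum)
print(totals)
#  ITEM_GROUP Price Tax
#1      Dairy   305  22
#2      Fruit   415  19
#3  Vegetable   225  12
```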

Upvotes: 2

GKi

Reputation: 39647

With [1,] you are subsetting rows and not columns, so my_basket[3,] is the third row of the data frame rather than the Price column. With [,1] you select the first column as a vector. With [1] you select the first column as a data.frame.

aggregate(x = my_basket[3], by = my_basket[1], FUN = sum)
#  ITEM_GROUP Price
#1      Dairy   305
#2      Fruit   415
#3  Vegetable   225
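To make the three indexing forms concrete, here is a small sketch that inspects what each one returns and then aggregates by column index using the vector form as well. It rebuilds the data frame from the question so that it runs on its own; the variable names row_one, col_vec, col_df and by_vec are only illustrative.

```r
# Same data as in the question, repeated so the example is self-contained
my_basket <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 5), rep("Vegetable", 4), rep("Dairy", 5)),
  ITEM_NAME = c("Apple", "Banana", "Orange", "Mango", "Papaya", "Carrot", "Potato",
                "Brinjal", "Raddish", "Milk", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, 120),
  Tax = c(2, 4, 5, 6, 2, 3, 5, 1, 3, 4, 5, 6, 4, 3)
)

row_one <- my_basket[1, ]  # first ROW: a data.frame with 1 row and 4 columns
col_vec <- my_basket[, 1]  # first COLUMN as a plain vector of length 14
col_df  <- my_basket[1]    # first COLUMN as a one-column data.frame

print(dim(row_one))            # 1 4
print(length(col_vec))         # 14
print(is.data.frame(col_df))   # TRUE

# The vector form also works, but 'by' must then be wrapped in list(),
# and the result gets the generic column names Group.1 and x
by_vec <- aggregate(x = my_basket[, 3], by = list(my_basket[, 1]), FUN = sum)
print(by_vec)
```

Using the one-column data.frame form, as in the call above, keeps the original column names in the result, which is usually easier to read.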

Upvotes: 2
