Worville11

Reputation: 93

Calculating Gini by Row in R

stackoverflow.

I'm trying to calculate the gini coefficient within each row of my dataframe, which is 1326 rows long, by 6 columns (1326 x 6).

My current code...

attacks$attack_gini  <- gini(x = c(attacks$attempts_open_play,
attacks$attempts_corners,attacks$attempts_throws,
attacks$attempts_fk,attacks$attempts_set_play,attacks$attempts_penalties))

... fills all the rows with the same figure of 0.7522439 - which is evidently wrong.

Note: I'm using the gini function from the reldist package.

Is there a way that I can calculate the gini for the 6 columns in each row?

Thanks in advance.

Upvotes: 1

Views: 2819

Answers (1)

lrnzcig

Reputation: 3947

The gini function in reldist takes a numeric vector, not a dataframe. You can easily get the coefficient for a single column of your dataframe, for example the first one:

> gini(attacks$attempts_open_play)
[1] 0.1124042 

However, when you do c(attacks$attempts_open_play, attacks$attempts_corners, ...) you are actually building one long vector with all the columns of your dataframe placed one after the other, so your gini call gives back a single number, e.g.:

> gini(c(attacks$attempts_open_play, attacks$attempts_corners))
[1] 0.112174
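
To make the flattening visible, here is a minimal base R sketch. The data frame toy and its values are made up for illustration, and mean() merely stands in for gini() so the snippet runs without reldist; any function that returns one number behaves the same way.

```r
# Made-up stand-ins for two of the attempts_* columns.
toy <- data.frame(
  attempts_open_play = c(4, 7, 2),
  attempts_corners   = c(1, 0, 3)
)

# c() joins the columns end to end; the row structure is gone.
pooled <- c(toy$attempts_open_play, toy$attempts_corners)
length(pooled)  # 6, i.e. 3 rows x 2 columns

# Any summary of that vector is a single value; mean() stands in for gini().
summary_value <- mean(pooled)

# Assigning a length-one value to a column recycles it down every row.
toy$attack_gini <- summary_value
toy$attack_gini
```

Every row ends up with the same value, which is the symptom described in the question.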

And that's why you are assigning the same single number to every row of attacks$attack_gini. If I understood properly, you want to calculate the gini coefficient across the column values within each row. For that you can use apply, something like

attacks$attack_gini <- apply(attacks[,c('attempts_open_play', 'attempts_corners', ...)], 1, gini)

where the second argument (MARGIN) with value 1 tells apply to call gini once per row; a value of 2 would call it once per column instead.

> head(apply(attacks[,c('attempts_open_play', 'attempts_corners')], 1, gini))
[1] 0.026315789 0.044247788 0.008928571 0.053459119 0.019148936 0.007537688
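
If you want to try the row-wise pattern without your real data, the following is a small self-contained sketch. The data frame toy is made up, and gini_simple is a hypothetical helper written here only so the example runs without reldist; it is an unweighted Gini via the usual sorted-rank formula, not the package's own code. With reldist loaded you would pass gini to apply instead.

```r
# Hypothetical helper, not the reldist implementation: unweighted Gini
# coefficient of a numeric vector via the sorted-rank formula.
gini_simple <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Made-up data standing in for three of the attempts_* columns.
toy <- data.frame(
  attempts_open_play = c(1, 0, 1),
  attempts_corners   = c(1, 0, 2),
  attempts_fk        = c(1, 6, 3)
)

cols <- c("attempts_open_play", "attempts_corners", "attempts_fk")

# MARGIN = 1 hands each row to the function as its own numeric vector.
toy$attack_gini <- apply(toy[, cols], 1, gini_simple)
toy$attack_gini  # 0, 2/3 and 2/9 for the three rows
```

One thing to watch for in this sketch: a row whose values sum to zero divides by zero and comes back as NaN, so rows with no attempts at all may need separate handling.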

Hope it helps.

Upvotes: 2
