Reputation: 71
This is my first post and very stuck on trying to build my first function that calculates Herfindahl measures on Firm gross output, using panel data (year=1998:2007) with firms = obs. by year (1998-2007) and region ("West","Central","East","NE") and am having problems with passing arguments through the function. I think I need to use two loops (one for time and one for region). Any help would be useful.. I really dont want to have to subset my data 400+ times to get herfindahl measures one at a time. Thanks in advance!
Below I provide: 1) My starter code (only returns one value); 2) desired output (2-bins that contain the hefindahl measures by 1) year and by 2) year-region); and 3) original data
myherf<- function (x, time, region){
time = year # variable is defined in my data and includes c(1998:2007)
region = region # Variable is defined in my data, c("West", "Central","East","NE")
for (i in 1:length(time)) {
for (j in 1:length(region)) {
herf[i,j] <- x/sum(x)
herf[i,j] <- herf[i,j]^2
herf[i,j] <- sum(herf[i,j])^1/2
}
}
return(herf[i,j])
}
myherf(extractiveoutput$x, i, j)
Error in herf[i, j] <- x/sum(x) : object 'herf' not found
2) My desired outcome is the following two vectors:
A. (1x10 vector)
Year herfindahl(yr)
1998 x
1999 x
...
2007 x
B. (1x40 vector)
Year Region hefindahl(yr-region)
1998 West x
1998 Central x
1998 East x
1998 NE x
...
2007 West x
2007 Central x
2007 East x
2007 northeast x
3) Original Data
Obs. industry year region grossoutput
1 06 1998 Central 0.048804830
2 07 1998 Central 0.011222478
3 08 1998 Central 0.002851575
4 09 1998 Central 0.009515881
5 10 1998 Central 0.0067931
...
12 06 1999 Central 0.050861447
13 07 1999 Central 0.008421093
14 08 1999 Central 0.002034649
15 09 1999 Central 0.010651283
16 10 1999 Central 0.007766118
...
111 06 1998 East 0.036787413
112 07 1998 East 0.054958377
113 08 1998 East 0.007390260
114 09 1998 East 0.010766598
115 10 1998 East 0.015843418
...
436 31 2007 West 0.166044176
437 32 2007 West 0.400031011
438 33 2007 West 0.133472059
439 34 2007 West 0.043669662
440 45 2007 West 0.017904620
Upvotes: 1
Views: 1143
Reputation: 13817
You can use the conc
function from the ineq
library. The solution gets really simple and fast using data.table
.
library(ineq)
library(data.table)
# convert your data.frame into a data.table
setDT(df)
# calculate inequality of grossoutput by region and year
df[, .(inequality = conc(grossoutput, type = "Herfindahl")), by=.(region, year) ]
Upvotes: 1