I have a panel data set with no missing values. All columns are numeric except the date, which is stored in "month/day/year" format at quarterly frequency.
Even so, I do not understand why my data is reported as unbalanced when I run is.pbalanced().
I also cannot balance it with make.pbalanced(), which fails with errors I do not understand.
When I run table(DATA$Firm, DATA$Date) (screenshot attached below), the output contains only 1s and 0s, so no id-time combination appears more than once.
DATA is the data file I used (a screenshot of it is attached below as well). The data and output are too large to attach in full, so I can only provide a snapshot.
I would appreciate knowing how to make this panel data usable, given that it looks balanced to me. Thank you.
> DATA=pdata.frame(data,index=c("Firm","Date"))
> DATA<-make.pbalanced(DATA)
Error in seq.default(from = min_value, to = max_value, by = 1) :
'from' must be a finite number
In addition: Warning messages:
1: In make.pconsecutive.indexes(x, balanced = balanced, ...) :
NAs introduced by coercion
2: In min(df_index[, "times"]) :
no non-missing arguments to min; returning Inf
3: In max(df_index[, "times"]) :
no non-missing arguments to max; returning -Inf
> is.pbalanced(DATA)
[1] FALSE
If I provide dput(DATA) for the first 20 rows in DATA, the output is as follows:
> dput(DATA)
structure(list(Date = structure(c(3L, 18L, 35L, 52L, 54L, 27L,
28L, 44L, 45L, 60L, 61L, 10L, 11L, 12L, 13L, 28L, 30L, 45L, 47L,
63L), .Label = c("12/31/1998", "12/31/1999", "12/31/2000", "12/31/2002",
"12/31/2003", "12/31/2004", "12/31/2005", "12/31/2008", "12/31/2009",
"12/31/2010", "12/31/2011", "12/31/2013", "12/31/2014", "12/31/2015",
"12/31/2016", "12/31/2019", "12/31/2020", "3/31/1998", "3/31/1999",
"3/31/2000", "3/31/2001", "3/31/2004", "3/31/2005", "3/31/2006",
"3/31/2007", "3/31/2009", "3/31/2010", "3/31/2011", "3/31/2012",
"3/31/2015", "3/31/2016", "3/31/2017", "3/31/2018", "3/31/2021",
"6/30/1998", "6/30/1999", "6/30/2000", "6/30/2001", "6/30/2004",
"6/30/2005", "6/30/2006", "6/30/2007", "6/30/2009", "6/30/2010",
"6/30/2011", "6/30/2012", "6/30/2015", "6/30/2016", "6/30/2017",
"6/30/2018", "6/30/2021", "9/30/1998", "9/30/1999", "9/30/2000",
"9/30/2003", "9/30/2004", "9/30/2005", "9/30/2006", "9/30/2009",
"9/30/2010", "9/30/2011", "9/30/2012", "9/30/2014", "9/30/2015",
"9/30/2016", "9/30/2017", "9/30/2020"), class = "factor"), Firm = structure(c(1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("296", "10718", "52239", "100263", "100432",
"101273", "102798", "102931", "103660", "105309", "105334", "106599",
"106801", "107495", "107501", "107559", "107574", "107737", "107755",
"107766", "107791", "108048", "108297", "108299", "108679", "108729",
"108731", "110803", "111464", "111469", "111483", "111484", "111487",
"111489", "111492", "111493", "111503", "111506", "111509", "111514",
"111522", "111536", "111555", "111589", "111590", "111600", "111695",
"111703", "111716", "111727", "111750", "111751", "111752", "111775",
"111796", "111808", "111938", "111940", "111941", "111942", "111955",
"111956", "112001", "112028", "112066", "112137", "112153", "112347",
"112367", "112371", "112427", "112472", "112666", "112738", "112852",
"113501", "113582", "113848", "113959", "114957", "114958", "116126",
"116324", "117005", "135894", "135939", "146262", "154189", "1000001",
"1000132", "1000181", "1000198", "1000234", "1000242", "1000517",
"1000757", "1000858", "1000881", "1000897", "1001061", "1001172",
"1001283", "1001526", "1001577", "1001616", "1001915", "1002018",
"1002061", "1002312", "1002320", "1002374", "1002376", "1002587",
"1002650", "1002815", "1002827", "1002835", "1002839", "1002923",
"1003021", "1003053", "1003057", "1003059", "1003229", "1003260",
"1003405", "1003495", "1003683", "1003698", "1003943", "1004349",
"1004369", "1004594", "1004595", "1004628", "1004823", "1005002",
"1005330", "1005419", "1005420", "1005519", "1005570", "1005575",
"1005625", "1005629", "1006105", "1006110", "1006155", "1006217",
"1006232", "1006379", "1006460", "1006474", "1006511", "1006676",
"1006720", "1006781", "1006799", "1007050", "1007451", "1007518",
"1007544", "1007561", "1007564", "1007606", "1007631", "1007708",
"1007780", "1007831", "1007879", "1007890", "1007923", "1008222",
"1008290", "1008336", "1008494", "1008501", "1008521", "1008541",
"1008974", "1009297", "1009608", "1009702", "1009707", "1010040",
"1010079", "1010118", "1010171", "1010179", "1010218", "1010383",
"1010384", "1010456", "1010469", "1010513", "1010515", "1010523",
"1010559", "1010680", "1010697", "1010871", "1010884", "1010892",
"1011249", "1011315", "1011369", "1011532", "1011549", "1011550",
"1011601", "1011608", "1011628", "1011633", "1011636", "1011666",
"1011793", "1011813", "1011965", "1012183", "1012304", "1012356",
"1012472", "1012850", "1012854", "1021617", "1024280", "1028649",
"1032627", "1032628", "1037429", "1047191", "1078689", "1079592",
"1085370", "1094670", "1095030", "1095890", "1098870", "1103990",
"1116650", "1130830", "1136430", "1150911", "1164070", "1165550",
"1167911", "1169072", "1169451", "1169570", "1169574", "1177670",
"1199506", "1200034", "1200141", "1200336", "1201617", "1203212",
"1203998", "1204112", "1204249", "1205697", "1205991", "1206695",
"1209238", "1209508", "1231250", "1236950", "1239130", "1254611",
"1261831", "1278491", "1299590", "1308650", "1349851", "1364272",
"1371810", "1373451", "1415470", "1461924", "1462905", "1468726",
"1470067", "1471922", "1475575", "1492469", "1493548", "1494156",
"1497186", "1502005", "1503676", "1510039"), class = "factor"),
Country = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L), X1 = c(3.28071e+11,
2.47603e+11, 2.68036e+11, 2.8843e+11, 3.12936e+11, 1.63006e+11,
1.62064e+11, 1.62003e+11, 1.75994e+11, 1.66539e+11, 1.90875e+11,
8.48942e+11, 9.11332e+11, 9.38555e+11, 9.11507e+11, 8.80528e+11,
9.15665e+11, 8.83188e+11, 8.59914e+11, 9.23223e+11), X2 = c(420.9,
109.62, 115.46, 170.57, 256.84, 245.79, 28.1, 320.61, 37.39,
51.84, 24.73, 28.56, 149.12, 176.7, 204.86, 241.27, 245.69,
328.73, 270.17, 225.57), X3 = c(1.397e+09, 6.826e+09, 8.407e+09,
6.218e+09, 1.96e+09, 4.39e+08, 3.011e+09, 4.27e+08, 2.918e+09,
1.738e+09, 3.219e+09, 2e+05, 2e+05, 2e+05, 2e+05, 1.7844e+10,
2e+05, 1.7161e+10, 2e+05, 2e+05), X4 = c(41.6563, 21.4688,
29.8125, 37.0938, 33.6875, 28.25, 30.88, 29.31, 24.69, 28.99,
26.13, 168.84, 168.16, 127.56, 177.26, 170.63, 163.85, 131.27,
167.44, 158.21), X5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X6 = c(0.931431568,
0.931431568, 0.931431568, 0.931431568, 0.931431568, 0.931431568,
0.931431568, 0.931431568, 0.931431568, 0.931431568, 0.931431568,
0.931431568, 0.931431568, 0.931431568, 0.931431568, 0.931431568,
0.931431568, 0.931431568, 0.931431568, 0.931431568), X7 = c(0.270710059,
0.277063061, 0.291823581, 0.431113358, 0.165191665, 0.031089322,
0.00946807, 0.040553104, 0.012598261, 0.006557103, 0.008332575,
0.003612478, 0.050244788, 0.387559494, 0.250535044, 0.081293992,
0.179216725, 0.110762938, 0.197073477, 0.275862491), X8 = c(0.000327816,
0.002603413, 0.002742109, 0.004050941, 0.000200038, 0.000115447,
1.27133e-05, 0.000150589, 1.69164e-05, 2.43491e-05, 1.11887e-05,
1.34145e-05, 6.74667e-05, 0.00044596, 0.000278949, 0.000109158,
0.000133225, 0.000148728, 0.0001465, 0.000307149), X9 = c(0.270893931,
0.318213603, 0.391916461, 0.289869936, 0.38006593, 0.011946002,
0.04790828, 0.01161946, 0.046428549, 0.047294196, 0.051217786,
4.59817e-07, 4.59817e-07, 4.59817e-07, 4.59817e-07, 0.283917419,
4.59817e-07, 0.273050148, 4.59817e-07, 4.59817e-07), X10 = c(0.0002414,
0.003316004, 0.004084039, 0.003020644, 0.000338686, 8.42066e-06,
5.44127e-05, 8.19048e-06, 5.27321e-05, 3.33374e-05, 5.81716e-05,
4.80818e-09, 4.80818e-09, 4.80818e-09, 4.80818e-09, 0.000322465,
4.80818e-09, 0.000310122, 4.80818e-09, 4.80818e-09), X11 = c(0L,
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), X12 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X13 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), X14 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L), X15 = c(7L,
4L, 2L, 2L, 3L, 25L, 12L, 24L, 15L, 22L, 10L, 18L, 10L, 9L,
9L, 12L, 15L, 15L, 20L, 8L), X16 = c(0.579324474, 0.67828913,
0.797352123, 0.65619424, 0.643443626, 1.362389303, 1.076926757,
1.158663556, 1.326649492, 1.070007097, 1.263384865, 0.976581992,
1.299930534, 1.665132357, 1.202016572, 1.076926757, 1.111254141,
1.326649492, 0.885775977, 1.319718044), X17 = c(13.07130455,
8.316492188, 8.766790769, 9.539716667, 12.43554545, 8.205709375,
11.61956875, 9.233861538, 11.37893538, 10.52281061, 11.20804697,
11.50165758, 12.36111364, 13.1170197, 16.01679545, 11.61956875,
16.46697656, 11.37893538, 16.98400462, 15.10651515)), row.names = c("296-12/31/2000",
"296-3/31/1998", "296-6/30/1998", "296-9/30/1998", "296-9/30/2000",
"10718-3/31/2010", "10718-3/31/2011", "10718-6/30/2010", "10718-6/30/2011",
"10718-9/30/2010", "10718-9/30/2011", "52239-12/31/2010", "52239-12/31/2011",
"52239-12/31/2013", "52239-12/31/2014", "52239-3/31/2011", "52239-3/31/2015",
"52239-6/30/2011", "52239-6/30/2015", "52239-9/30/2014"), class = c("pdata.frame",
"data.frame"), index = structure(list(Firm = structure(c(1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("296", "10718", "52239"), class = "factor"),
Date = structure(c(1L, 6L, 10L, 14L, 15L, 7L, 8L, 11L, 12L,
16L, 17L, 2L, 3L, 4L, 5L, 8L, 9L, 12L, 13L, 18L), .Label = c("12/31/2000",
"12/31/2010", "12/31/2011", "12/31/2013", "12/31/2014", "3/31/1998",
"3/31/2010", "3/31/2011", "3/31/2015", "6/30/1998", "6/30/2010",
"6/30/2011", "6/30/2015", "9/30/1998", "9/30/2000", "9/30/2010",
"9/30/2011", "9/30/2014"), class = "factor")), row.names = c(220L,
7L, 11L, 13L, 196L, 1713L, 2161L, 1816L, 2274L, 1930L, 2379L,
2052L, 2504L, 2983L, 3278L, 2162L, 3413L, 2275L, 3554L, 3121L
), class = c("pindex", "data.frame")))
The first 20 rows of DATA are shown below. This is after applying pdata.frame, so the first column (the row names) has been added automatically:
DATA=pdata.frame(data,index=c("Firm","Date"))
DATA[1:20,]
Date Firm Country X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
296-12/31/2000 12/31/2000 296 30 3.28071e+11 420.90 1.3970e+09 41.6563 0 0.9314316 0.270710059 0.0003278160 2.708939e-01 2.414000e-04 0 0 0 0 7 0.5793245 13.071305
296-3/31/1998 3/31/1998 296 30 2.47603e+11 109.62 6.8260e+09 21.4688 0 0.9314316 0.277063061 0.0026034130 3.182136e-01 3.316004e-03 0 0 0 0 4 0.6782891 8.316492
296-6/30/1998 6/30/1998 296 30 2.68036e+11 115.46 8.4070e+09 29.8125 0 0.9314316 0.291823581 0.0027421090 3.919165e-01 4.084039e-03 0 0 0 0 2 0.7973521 8.766791
296-9/30/1998 9/30/1998 296 30 2.88430e+11 170.57 6.2180e+09 37.0938 0 0.9314316 0.431113358 0.0040509410 2.898699e-01 3.020644e-03 1 0 0 0 2 0.6561942 9.539717
296-9/30/2000 9/30/2000 296 30 3.12936e+11 256.84 1.9600e+09 33.6875 0 0.9314316 0.165191665 0.0002000380 3.800659e-01 3.386860e-04 0 0 0 0 3 0.6434436 12.435545
10718-3/31/2010 3/31/2010 10718 30 1.63006e+11 245.79 4.3900e+08 28.2500 0 0.9314316 0.031089322 0.0001154470 1.194600e-02 8.420660e-06 0 0 0 1 25 1.3623893 8.205709
10718-3/31/2011 3/31/2011 10718 30 1.62064e+11 28.10 3.0110e+09 30.8800 0 0.9314316 0.009468070 0.0000127133 4.790828e-02 5.441270e-05 0 0 0 1 12 1.0769268 11.619569
10718-6/30/2010 6/30/2010 10718 30 1.62003e+11 320.61 4.2700e+08 29.3100 0 0.9314316 0.040553104 0.0001505890 1.161946e-02 8.190480e-06 0 0 0 1 24 1.1586636 9.233862
10718-6/30/2011 6/30/2011 10718 30 1.75994e+11 37.39 2.9180e+09 24.6900 0 0.9314316 0.012598261 0.0000169164 4.642855e-02 5.273210e-05 0 0 0 1 15 1.3266495 11.378935
10718-9/30/2010 9/30/2010 10718 30 1.66539e+11 51.84 1.7380e+09 28.9900 0 0.9314316 0.006557103 0.0000243491 4.729420e-02 3.333740e-05 0 0 0 1 22 1.0700071 10.522811
10718-9/30/2011 9/30/2011 10718 30 1.90875e+11 24.73 3.2190e+09 26.1300 0 0.9314316 0.008332575 0.0000111887 5.121779e-02 5.817160e-05 0 0 0 1 10 1.2633849 11.208047
52239-12/31/2010 12/31/2010 52239 30 8.48942e+11 28.56 2.0000e+05 168.8400 0 0.9314316 0.003612478 0.0000134145 4.598170e-07 4.808180e-09 0 0 0 1 18 0.9765820 11.501658
52239-12/31/2011 12/31/2011 52239 30 9.11332e+11 149.12 2.0000e+05 168.1600 0 0.9314316 0.050244788 0.0000674667 4.598170e-07 4.808180e-09 0 0 0 1 10 1.2999305 12.361114
52239-12/31/2013 12/31/2013 52239 30 9.38555e+11 176.70 2.0000e+05 127.5600 0 0.9314316 0.387559494 0.0004459600 4.598170e-07 4.808180e-09 0 0 0 0 9 1.6651324 13.117020
52239-12/31/2014 12/31/2014 52239 30 9.11507e+11 204.86 2.0000e+05 177.2600 0 0.9314316 0.250535044 0.0002789490 4.598170e-07 4.808180e-09 0 0 0 0 9 1.2020166 16.016795
52239-3/31/2011 3/31/2011 52239 30 8.80528e+11 241.27 1.7844e+10 170.6300 0 0.9314316 0.081293992 0.0001091580 2.839174e-01 3.224650e-04 0 0 0 1 12 1.0769268 11.619569
52239-3/31/2015 3/31/2015 52239 30 9.15665e+11 245.69 2.0000e+05 163.8500 0 0.9314316 0.179216725 0.0001332250 4.598170e-07 4.808180e-09 0 0 0 0 15 1.1112541 16.466977
52239-6/30/2011 6/30/2011 52239 30 8.83188e+11 328.73 1.7161e+10 131.2700 0 0.9314316 0.110762938 0.0001487280 2.730501e-01 3.101220e-04 0 0 0 1 15 1.3266495 11.378935
52239-6/30/2015 6/30/2015 52239 30 8.59914e+11 270.17 2.0000e+05 167.4400 0 0.9314316 0.197073477 0.0001465000 4.598170e-07 4.808180e-09 0 0 0 0 20 0.8857760 16.984005
52239-9/30/2014 9/30/2014 52239 30 9.23223e+11 225.57 2.0000e+05 158.2100 0 0.9314316 0.275862491 0.0003071490 4.598170e-07 4.808180e-09 0 0 0 0 8 1.3197180 15.106515
>
Upvotes: 0
Reputation: 41240
From the definition of is.pbalanced:
"Balanced data are data for which each individual has the same time periods."
As an example:
library(plm)
data("Grunfeld", package = "plm")
is.pbalanced(Grunfeld)
#> [1] TRUE
table(Grunfeld$firm,Grunfeld$year)
#>
#> 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949
#> 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 9 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 10 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
non.balanced <- Grunfeld[-sample(200,10),]
is.pbalanced(non.balanced)
#> [1] FALSE
table(non.balanced$firm,non.balanced$year)
#> 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949
#> 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 2 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1
#> 3 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
#> 4 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
#> 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> 8 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
#> 9 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
#> 10 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
As shown above, the table of a balanced dataset has no zeros: the periods are the same for every firm.
You can verify this in the is.pbalanced source code:
is.pbalanced.default <- function(x, y, ...) {
  if (length(x) != length(y)) stop("The length of the two vectors differs\n")
  x <- x[drop = TRUE] # drop unused factor levels so that table
  y <- y[drop = TRUE] # gives only needed combinations
  z <- table(x, y)
  if (any(v <- as.vector(z) == 0L)) {
    balanced <- FALSE # any zero cell means the panel is unbalanced
  } else {
    balanced <- TRUE
  }
  # (excerpt: the remainder of the function is omitted here)
  balanced
}
The table of the dataset you're using has many zeros, which explains why is.pbalanced(DATA) returns FALSE.
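To make this concrete, here is a minimal base R sketch that counts the empty firm/date cells for the 20 rows posted in the question. The data frame name sample_rows is hypothetical, and only the Firm and Date columns are reproduced:

```r
# Firm/Date pairs copied from the 20 rows shown in the question
sample_rows <- data.frame(
  Firm = rep(c("296", "10718", "52239"), times = c(5, 6, 9)),
  Date = c("12/31/2000", "3/31/1998", "6/30/1998", "9/30/1998", "9/30/2000",
           "3/31/2010", "3/31/2011", "6/30/2010", "6/30/2011", "9/30/2010",
           "9/30/2011", "12/31/2010", "12/31/2011", "12/31/2013", "12/31/2014",
           "3/31/2011", "3/31/2015", "6/30/2011", "6/30/2015", "9/30/2014"),
  stringsAsFactors = FALSE
)

tab <- table(sample_rows$Firm, sample_rows$Date)

n_zero <- sum(tab == 0)               # firm/date cells with no observation
n_dup <- sum(tab > 1)                 # firm/date cells observed more than once
periods_per_firm <- rowSums(tab > 0)  # number of distinct periods per firm

n_zero
n_dup
periods_per_firm
```

A balanced panel would give n_zero equal to 0 and the same number of periods for every firm. Having no missing values inside the rows that exist is a different property from every firm having a row for every period.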
It would be useful to provide dput(data) to find out why make.pbalanced doesn't work.
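That said, the warning "NAs introduced by coercion" raised inside make.pconsecutive.indexes suggests one likely cause: the time index is a factor of "month/day/year" strings, which cannot be converted to numbers. The sketch below assumes that diagnosis and builds an integer quarter index before creating the pdata.frame. The helper quarter_index, the column Qtr and the toy data are hypothetical, and the plm calls only run if plm is installed:

```r
# Hypothetical helper: map "month/day/year" strings to a running quarter number
quarter_index <- function(date_chr) {
  d <- as.Date(as.character(date_chr), format = "%m/%d/%Y")
  year <- as.integer(format(d, "%Y"))
  month <- as.integer(format(d, "%m"))
  year * 4L + (month - 1L) %/% 3L  # consecutive quarters differ by exactly 1
}

# Toy data standing in for the real file: firm B lacks the second quarter
toy <- data.frame(
  Firm = c("A", "A", "A", "B", "B"),
  Date = c("3/31/2010", "6/30/2010", "9/30/2010", "3/31/2010", "9/30/2010"),
  X1 = c(1, 2, 3, 4, 5)
)
toy$Qtr <- quarter_index(toy$Date)

if (requireNamespace("plm", quietly = TRUE)) {
  ptoy <- plm::pdata.frame(toy, index = c("Firm", "Qtr"))
  # "fill" adds rows of NA for the firm/period combinations that are absent
  filled <- plm::make.pbalanced(ptoy, balance.type = "fill")
  print(plm::is.pbalanced(filled))
}
```

Note that filling only inserts NA rows, so the added observations carry no information. make.pbalanced also accepts balance.type = "shared.times" and "shared.individuals", which drop observations instead of adding them; which choice is appropriate depends on the analysis.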
Upvotes: 1