Reputation: 107
I am trying to get the stargazer
R package to spit out a huge .csv. descriptive statistics table for 43 different companies into Excel.
I have 43 different data frames pertaining meetings at 43 different corporation. I am currently trying to create an output from each one of these data frames to list Mean Value of variable of Meetings, Standard Deviation of Variable August, Max, Min, and the total Number. The reproducible example is below for company 1 with "J" standing for July, "A" for August, "S" for September, and "N" for November:
J A S N
29 2 30 2
20 3 30 10
30 10 20 30
40 20 10 21
10 30 13 14
Company 2 is
J A S N
14 5 15 19
10 10 17 10
30 14 18 11
10 20 19 9
10 10 20 4
The following is the code I used:
stargazer(Company1[c("J", "A", "S", "N")], type = "text",
title="Descriptive statistics Company 1", digits=1, out="table1.txt",
covariate.labels= c("J", "A", "S", "N"))
stargazer(Company2[c("J", "A", "S", "N")], type = "text",
title="Descriptive statistics AugustMartin", digits=1, out="table1.txt",
covariate.labels= c("J", "A", "S", "N"))
...
stargazer(Company43[c("J", "A", "S", "N")], type = "text",
title="Descriptive statistics AugustMartin", digits=1, out="table1.txt",
covariate.labels= c("J", "A", "S", "N"))
I am trying to get the stargazer package to take All Descriptive Statistics from Company 1 to Company 43, preserve the headings so that I know which descriptive statistics belong to which company and dump them into a convenient CSV sheet in excel.
When I try to use the following code
library(xlsx)
# Create a single summary table and write to an Excel file
tab1 = t(sapply(Company1, function(x) {
data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x),
Max=max(x))
}))
write.xlsx(tab1, "Company Data.xlsx", sheetName="Company1", row.names=FALSE)
I get the following:
Col 1 Col2 Col3 Col4
51.34 #N/A 51.34 51.34
7.58 #N/A 7.58 7.58
26.99 #N/A 26.99 26.99
49.74 #N/A 49.74 49.74
34.84 #N/A 34.84 34.84
54.84 #N/A 54.84 54.84
18.5 #N/A 18.5 18.5
49.34 #N/A 49.34 49.34
49.26 #N/A 49.26 49.26
It is taking the mean of each row and not the mean of the specified column. ideally it would add up everything in Col 1, find the mean, the standard deviation, the min and the max.
Upvotes: 2
Views: 10924
Reputation: 341
It you don't need to automate the process you could do:
stargazer(., type = "html")
Where .
are the other arguments.
Then you can compile the table (like for example here) and just copy-paste into Excel. Just be sure to format all cells as text and paste without formatting to prevent Excel from altering the information on the table.
Upvotes: 2
Reputation: 93851
stargazer
doesn't seem like the right tool for this. Instead, you could just create a data frame with the summary data you want and then write that to an Excel file. Below I provide examples for a single data frame and for a bunch of data frames.
Save a summary of a single data frame
library(xlsx)
# Create a single summary table and write to an Excel file
tab1 = t(sapply(Company1, function(x) {
data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x), Max=max(x))
}))
write.xlsx(tab1, "Company Data.xlsx", sheetName="Company1", row.names=FALSE)
Save summaries of multiple data frames
Here's how to write a bunch of summary tables to a file. Ideally, you would read your data frames into a list and then operate on each element of the list. That way you won't have to repeat the same code for each individual data frame.
To create a list of data frames, you would read in your data something like this:
# Get names of files to read
file.names = list.files(pattern="Company.*csv")
df.list = sapply(file.names, read.csv)
For this example, I'll create a list of data frames using the sample data for Company1
and Company2
that you provided:
df.list = list(Company1=Company1, Company2=Company2)
# Calculate summary statistics for each data frame and write to an Excel worksheet
sapply(names(df.list), function(df) {
tab1 = t(sapply(df.list[[df]], function(x) {
data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x), Max=max(x))
}))
write.xlsx(tab1, "Company Data.xlsx", sheetName=df, row.names=FALSE, append=TRUE)
})
Upvotes: 2