Dleightful
Dleightful

Reputation: 107

Using Stargazer in R to export csv direct into Excel

I am trying to get the stargazer R package to spit out a huge .csv. descriptive statistics table for 43 different companies into Excel.

I have 43 different data frames pertaining meetings at 43 different corporation. I am currently trying to create an output from each one of these data frames to list Mean Value of variable of Meetings, Standard Deviation of Variable August, Max, Min, and the total Number. The reproducible example is below for company 1 with "J" standing for July, "A" for August, "S" for September, and "N" for November:

J   A    S   N
29  2   30   2
20  3   30  10
30  10  20  30
40  20  10  21
10  30  13  14

Company 2 is

J   A   S   N
14  5   15  19
10  10  17  10
30  14  18  11
10  20  19  9
10  10  20  4

The following is the code I used:

stargazer(Company1[c("J", "A", "S", "N")], type = "text", 
      title="Descriptive statistics Company 1", digits=1, out="table1.txt",
      covariate.labels= c("J", "A", "S", "N"))

stargazer(Company2[c("J", "A", "S", "N")], type = "text", 
      title="Descriptive statistics AugustMartin", digits=1, out="table1.txt",
      covariate.labels= c("J", "A", "S", "N"))

...
stargazer(Company43[c("J", "A", "S", "N")], type = "text", 
      title="Descriptive statistics AugustMartin", digits=1, out="table1.txt",
      covariate.labels= c("J", "A", "S", "N"))

I am trying to get the stargazer package to take All Descriptive Statistics from Company 1 to Company 43, preserve the headings so that I know which descriptive statistics belong to which company and dump them into a convenient CSV sheet in excel.

When I try to use the following code

library(xlsx)
# Create a single summary table and write to an Excel file
tab1 = t(sapply(Company1, function(x) {
data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x),    
Max=max(x))
}))
write.xlsx(tab1, "Company Data.xlsx", sheetName="Company1", row.names=FALSE)

I get the following:

Col 1   Col2     Col3    Col4
51.34   #N/A    51.34   51.34
7.58    #N/A    7.58    7.58
26.99   #N/A    26.99   26.99
49.74   #N/A    49.74   49.74
34.84   #N/A    34.84   34.84
54.84   #N/A    54.84   54.84
18.5    #N/A    18.5    18.5  
49.34   #N/A    49.34   49.34
49.26   #N/A    49.26   49.26

It is taking the mean of each row and not the mean of the specified column. ideally it would add up everything in Col 1, find the mean, the standard deviation, the min and the max.

Upvotes: 2

Views: 10924

Answers (2)

cach dies
cach dies

Reputation: 341

It you don't need to automate the process you could do:

stargazer(., type = "html")

Where . are the other arguments. Then you can compile the table (like for example here) and just copy-paste into Excel. Just be sure to format all cells as text and paste without formatting to prevent Excel from altering the information on the table.

Upvotes: 2

eipi10
eipi10

Reputation: 93851

stargazer doesn't seem like the right tool for this. Instead, you could just create a data frame with the summary data you want and then write that to an Excel file. Below I provide examples for a single data frame and for a bunch of data frames.

Save a summary of a single data frame

library(xlsx)

# Create a single summary table and write to an Excel file
tab1 = t(sapply(Company1, function(x) {
  data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x), Max=max(x))
}))

write.xlsx(tab1, "Company Data.xlsx", sheetName="Company1", row.names=FALSE)

Save summaries of multiple data frames

Here's how to write a bunch of summary tables to a file. Ideally, you would read your data frames into a list and then operate on each element of the list. That way you won't have to repeat the same code for each individual data frame.

To create a list of data frames, you would read in your data something like this:

# Get names of files to read
file.names = list.files(pattern="Company.*csv")
df.list = sapply(file.names, read.csv)

For this example, I'll create a list of data frames using the sample data for Company1 and Company2 that you provided:

df.list = list(Company1=Company1, Company2=Company2)

# Calculate summary statistics for each data frame and write to an Excel worksheet
sapply(names(df.list), function(df) {

  tab1 = t(sapply(df.list[[df]], function(x) {
    data.frame(N=length(x), Mean=mean(x), `St. Dev.`=sd(x), Min=min(x), Max=max(x))
  }))

  write.xlsx(tab1, "Company Data.xlsx", sheetName=df, row.names=FALSE, append=TRUE)
})

Upvotes: 2

Related Questions