emilBeBri
emilBeBri

Reputation: 666

Stata putexcel summary statistics by group to MS Excel

I'm trying to get the Stata command putexcel to give me summary statistics for a continuous variable, grouped by a categorical variable, one after another, in the same worksheet. This should be repeated/looped through a number of years, where each year has its own sheet.

This poses two problems:

  1. using bysort doesn't record all the groups (well maybe it does but I don't understand how to retrieve them), so it seems I have to use an if-condition for each level. Therefore, the problems becomes that:

  2. There are a 150 categories (groups) in my categorical variable, all semi-random 4-digit numbers. So ideally, the solution would automatically detect the number of levels in the group-variabel instead of me writing in hand the 150 different categories for each if-statement.

Here is an example dataset:

clear
input ///
id income1996 income1997 employcode1996 employcode1997
1 500 400 3300 5000
2 500 300 5000 5000
3 900 1050 8830 8220
4 1000 1200 8220 3300
5 600 900 5000 8220
6 200 100 8220 5000
7 700 100 3300 3300
8 1000 100 5000 5000
end

And here is my not very good try to resolve the problem. I know the local variables are just as (in)efficient as just writing it in hand, but it was my best shot.

forval x=1996/1997 {
    local y=2
    local z=`y'+1
    local w=`y'+2   
    summarize income`x' if employcode`x'==3300
    putexcel A1=rnames A`z'=rscalars  using "C:\Users\emilbebri\Downloads\tmp\results.xlsx", sheet(year`x') modify colwise
    summarize income`x' if employcode`x'==5000 
    putexcel A`z'=rscalars  using "C:\Users\emilbebri\Downloads\tmp\results.xlsx", sheet(year`x') modify colwise
    summarize income`x' if employcode`x'==8220 
    putexcel A`w'=rscalars  using "C:\Users\emilbebri\Downloads\tmp\results.xlsx", sheet(year`x') modify colwise  
}

Would appreciate an answer greatly, and so would my rmi-strained righthand! This guy seems to be on to something similar, however, the actual content is too far off and I am at loss as how I would transfer that knowledge to my similar-but-kind-of-different problem.

Update: here is Robertos answer, but modified so that the output becomes more compact, like this: (the reason why the last row doesn't have mean and SD is because the example data has only one observation in that category)

enter image description here

and here is the code to produce it.

forvalues x = 1996/1997 {

    local xlsrow = 2

    quietly levelsof employcode`x', local(ecodes)
    foreach ecode of local ecodes {

        // show on screen
        quietly display "Year `x', code `ecode'"
        quietly summarize income`x' if employcode`x' == `ecode'
        quietly display ""

        // save to MS Excel
        putexcel A`xlsrow'=("Code `ecode'") B`xlsrow'=rscalars ///
            A1=("discokode") B1=rnames  ///
            using "C:\Users\emilbebri\Downloads\tmp\results11.xlsx", ///
            sheet(`x') modify colwise

        // update MS Excel row
        local xlsrow = `xlsrow' + 1

    }

}

Upvotes: 1

Views: 4559

Answers (1)

Roberto Ferrer
Roberto Ferrer

Reputation: 11102

In your updated code you are missing {} for the forvalues loop. Also, you don't make use of the local employcode_tmp, and that seems to be what you aim for.

Fixing the syntax errors I mention and deleting your second quietly should give you some output. However, your loop gives repeated results (five for each employment code). I'm not sure that's intentional.

A complete working example, with my interpretation of what you want, is

clear
set more off

*----- example data -----

input ///
id income1996 income1997 employcode1996 employcode1997
1 500 400 3300 5000
2 500 300 5000 5000
3 900 1050 8830 8220
4 1000 1200 8220 3300
5 600 900 5000 8220
6 200 100 8220 5000
7 700 100 3300 3300
8 1000 100 5000 5000
end

*----- what you want -----

forvalues x = 1996/1997 {

    local xlsrow = 1

    quietly levelsof employcode`x', local(ecodes)
    foreach ecode of local ecodes {

        // show on screen
        display "Year `x', code `ecode'"
        summarize income`x' if employcode`x' == `ecode'
        display ""

        // save to MS Excel
        putexcel A`xlsrow'=("Code `ecode'") A`=`xlsrow'+1'=rscalars ///
            using "D:/Datos/rferrer/Desktop/test.xlsx", ///
            sheet(`x') modify colwise

        // update MS Excel row
        local xlsrow = `xlsrow' + 3

    }

}

The result:

enter image description here

Check also help statsby.

Upvotes: 2

Related Questions