Sunday, March 20, 2011

Some -statplot- examples

-statplot- (co-authored by Nick Cox and myself) was released earlier this month.  You can get it at the SSC [1] [2].
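If you have not installed it yet, a minimal sketch of getting the package from within Stata (assuming your Stata session has internet access) might look like:

```stata
* Install -statplot- from the SSC archive (replace updates an older copy)
ssc install statplot, replace
* Open the help file to confirm the install and see the full syntax
help statplot
```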
In this posting, I show some more advanced examples of -statplot- using Stata's nlsw88 dataset (-sysuse nlsw88.dta-).
First, a basic example of -statplot- might look like:
sysuse nlsw88.dta, clear 
statplot grade tenure wage, blabel(bar) subtit({it:-statplot-} example)
graph export "fig1.png", as(png) replace

Fig. 1
The main advantage of -statplot- is creating plots of summary stats with the labels moved from the legend (the usual placement when using -gr bar|hbar|dot-) to the axis.  So, I could create a graph of the same data above with something like:
graph hbar (mean) grade tenure wage
however, it would look like the graph on the left in Fig. 2 below, where we still have a legend and an array of colors indicating each bar.  I often need to produce these types of graphs but with the labels on the axis (instead of the legend).  To get this type of graph using -graph bar|hbar|dot-, I might run something like:
collapse (mean) grade tenure wage
xpose, clear varn
graph hbar v1, over(_varname)
which does produce something like the -statplot- graph on the right in Figure 2, but in a less straightforward way, and in a way that is difficult to extend to other configurations (multiple vars in the varlist, multiple over() or by() categories, etc.).
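Note that -collapse- and -xpose- both replace the data in memory, so the later examples in this post would no longer find the nlsw88 variables. One way around this (a sketch, not part of -statplot- itself) is to wrap the manual approach in -preserve- and -restore-:

```stata
sysuse nlsw88.dta, clear
* Take a snapshot of the full dataset before reshaping it
preserve
* Reduce the data to a single row holding the three means
collapse (mean) grade tenure wage
* Transpose: one row per variable, with the names stored in _varname
xpose, clear varname
graph hbar v1, over(_varname)
* Bring back the original nlsw88 data for the examples that follow
restore
```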
Figure 2 compares the syntax and output of -graph hbar- and -statplot-:
graph hbar (mean) grade tenure wage,  ///
    name(g1, replace) tit({it:{bf:-graph bar-}})
statplot grade tenure wage,  ///
    name(g2, replace) tit({it:{bf:-statplot-}})
*After running the commands above, compare the graphs with graph combine: 
gr combine g1 g2 

Fig. 2
-graph bar|hbar|dot- does move one set of labels (the over() variable labels) to the axis when the over() option is used; however, I usually like to have all of these labels nested on the axis.  Here's a comparison of -graph hbar- and -statplot- when 1 over() option is specified:
graph hbar (mean) grade tenure wage, over(race)  /// 
     name(g1, replace) tit({it:{bf:-graph bar-}})
statplot grade tenure wage, over(race)  /// 
     name(g2, replace) tit({it:{bf:-statplot-}})
gr combine g1 g2
Fig. 3
The user can specify multiple over() options when using -statplot-; however, when more than 1 over() option is used, the first over() variable labels are moved to the legend by default -- in order to move these labels to the axis as well, use -statplot-'s xpose option.  Fig. 4 shows a -statplot- without xpose on the left, and -statplot- with xpose on the right.  
Another option is to specify legend(off), which makes a call to -graph hbar- (for which -statplot- is a wrapper) to turn off the legend; the labels for this var can still be shown as the bar label with blabel().  Fig. 5 below shows the same graph with legend(off) specified.
//FIGURE 4//
**2 over() options
statplot wage, over(race) over(union)  ///
     name(g1, replace)  ///
     tit("2 over opts, w/o {it:xpose} option", size(small) )
**Use the xpose option to move both over() options to the axis:
statplot wage, over(race) over(union) xpose  /// 
     name(g2, replace)   /// 
     tit("2 over opts, w/ {it:xpose} option", size(small) )
gr combine g1 g2
//FIGURE 5//
statplot wage, over(union) over(race)  /// 
     blabel(name, pos(base) color(white))  leg(off)  ///
     tit("2 over opts, w/ {it:legend opt}", size(small) )
Fig. 4


Fig. 5
Using the by() option with -statplot- is another way to get graphs with nested labels without reshaping/collapsing your data.  Fig. 6 shows an example of -statplot- with 1 by() option specified:
statplot wage age ttl_exp, by(union, ///
    tit("Using {it:by()} option"))
Fig. 6

Fig. 7 shows a way to use 1 over() option and 1 by() option with multiple vars in the varlist.
**first create a new categorical var "c_ttl_exp" from ttl_exp**
recode ttl_exp (0/8=0 "zero") (8/15=1 "one") (15/.=3 "three"), g(c_ttl_exp)
statplot wage age, over(c_ttl_exp) by(union,  ///
     tit("1 {it:over} option & 1 {it:by} option", size(medium) ))
Fig. 7

Finally, Fig. 8 shows a more advanced example of working with labels when 2 over() options and 1 by() option are specified.  You can use the xpose option to put all the labels on the y axis (see the set of plots on the left in Fig. 8), or use legend(off), specified inside the by() option, to move the label closest to the axis into the bar area as a blabel() (see the set of plots on the right in Fig. 8).
// by() option with 2 over() options //
**Suppress legend by using xpose option:
statplot wage, over(c_ttl_exp) over(race) by(union,   /// 
     tit("2 {it:over} options & 1 {it:by} option, using {bf:xpose} opt." /// 
     , size(medsmall)))   b1tit(Avg. of wage) xpose   ///
     name(g1, replace)
**Suppress legend by using legend(off) suboption in by():
statplot wage, over(c_ttl_exp) over(race) by(union, leg(off) ///
     tit("2 {it:over} options & 1 {it:by} option, using {bf:legend(off)} opt.", size(medsmall))) ///
     blab(name, pos(base) col(white))   ///
     name(g2, replace)  varopts(label(nolabel)) ///
     b1tit(Avg. of wage)
gr combine g1 g2
Fig. 8

You can download the entire do-file for the graphs above by clicking here.

Also, see Part 2 of this posting, where I discuss how to deal with long variable and value labels with -statplot-.
