Sunday, February 21, 2010

Update: Encoding Variables and Labeling Values Consistently (and Efficiently) in Stata

On 2010-02-09, I posted this example on strategies for efficiently labeling values for many variables in Stata.

Today I discovered a function in NJC's -egenmore- extension of -egen- that I think is easier to use and, in many cases, faster to implement than the combination of -labutil- and -multencode- that I suggested in my Feb 9 posting. To extend my previous example, here is how we could label those variables using -egen- and the function "ston()" (scroll to the bottom to see the UPDATED code):


*-------------------------------------------------BEGIN CODE
clear
**this first block creates a fake dataset; run it all together**
input str12 region regioncode str20 quest1 str20 quest2 str20 quest3
"Southwest" 1 "Strongly Agree" "Strongly Disagree" "Disagree"
"West" 2 "Agree" "Neutral" "Agree"
"North" 3 "Disagree" "Disagree" "Strongly Disagree"
"Northwest" 5 "Disagree" "Agree" "Strongly Agree"
"East" 4 "Strongly Disagree" "Strongly Agree" "Agree"
"South" 9 "Neutral" "Agree" "Agreee"
end
//1. Create labeled REGION variable
/*
If we -encode- region it would not line up with regioncode
because encode operates in alphabetical order, for example:
*/
ssc install fre   //<-- -fre- is user-written; it is used throughout this example
encode region, gen(region2) label(region2)
fre region2   //<-- these values don't match regioncode
drop region2


/* 
INSTEAD, we use -labmask- to quickly assign the strings in 
region as the value labels for regioncode
*/
ssc install labutil
labmask regioncode, values(region)
fre regioncode


//2. Creating comparable survey question scales
/*
We want all the survey questions to be on the same scale 
so that we can compare them in a model or table.
-encode- can help us here with quest1 and quest2 because they
have the same categories, but quest3 has different categories 
(it's missing "Neutral" and "Agree" is misspelled), so we could
either (1) use -replace- to define the numeric categories for these 
survey questions and then relabel them with -label define- and
-label values-, or (2) use -multencode- after fixing the misspelled 
"Agree" value in quest3 
*/
replace quest3 = "Agree" if quest3=="Agreee"
**
ssc install multencode
multencode quest1-quest3, gen(e_quest1-e_quest3)
label li
fre e_*
/* 
The categories are labeled properly, but the scale isn't in
order--we want it to increase in agreement as it moves from
1 to 5
*/
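 //-labvalch- is also from -labutil-
 //it changes only the label definition, so recode the data to match
labvalch quest1, f(1 2 3 4 5) t(4 2 3 5 1)
recode e_quest1-e_quest3 (1=4) (4=5) (5=1)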
label li
fre e_*
***UPDATE***
**using egenmore & the "ston()" function:
ssc install egenmore
forval n = 1/3 {
 egen ee_quest`n' = ston(quest`n'), to(1/5) /*
 */ from("Strongly Disagree" Disagree Neutral Agree "Strongly Agree")
 label val ee_quest`n' quest1
 **note: the value label "quest1" was already defined above; if it 
 **is not, you'll need to define the value labels first
 }
li quest1 ee_quest1 
*-------------------------------------------------END CODE
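
If the value label "quest1" has not already been created by the -multencode- step, the -ston()- approach needs its own label definition. Here is a minimal sketch of that case on a small fake dataset. The label name "agree5" and the variable names n_quest1 and n_quest2 are made up for illustration, and it assumes -egenmore- is installed and that the code is run from a do-file (because of the /// line continuations):

*-------------------------------------------------BEGIN CODE
clear
**fake dataset with two survey questions stored as strings
input str20 quest1 str20 quest2
"Strongly Agree" "Disagree"
"Neutral" "Agree"
"Strongly Disagree" "Strongly Agree"
end
**define the 1-5 scale once ("agree5" is a hypothetical label name)
label define agree5 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" ///
 4 "Agree" 5 "Strongly Agree"
**map each string to its number, then attach the shared label
forval n = 1/2 {
 egen n_quest`n' = ston(quest`n'), to(1/5) ///
  from("Strongly Disagree" Disagree Neutral Agree "Strongly Agree")
 label values n_quest`n' agree5
 }
**compare the strings with the underlying numeric codes
list quest1 n_quest1, nolabel
*-------------------------------------------------END CODE

Because the order of from() and to() sets the numeric codes directly, the scale comes out in the intended order and no later relabeling step should be needed.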
