My current research requires meta-analytic procedures in which a variable containing another variable’s mean comes in very handy. Centering variables is also a very reasonable thing to do when analysing regressions with an interaction term between a continuous variable and a dummy variable.
Centering variables sounds like an easy task. It is if you use Stata, but I found it surprisingly difficult in SPSS (unless you enter the means by hand, which is error-prone and impractical for repeated analyses). Here’s how you can calculate a variable that contains the mean of another variable (which can then easily be centered or used in whatever way one wants).
Let dres be the variable of interest. The new variable containing the mean of dres (for all observations) will be named dresavg. I also show how to create a variable containing the number of observations (ntotal). cdres will be the centered variable.
Stata 10
Use
. egen dresavg = mean(dres)
and you’re done! You could also use summarize and generate commands:
. sum dres
. gen dresavg = r(mean)
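One caveat with the summarize variant: r(mean) is only available until the next r-class command overwrites it, so the gen line has to follow sum directly. If you want to keep the mean around for later steps, a minimal sketch is to store it in a scalar first (dresmean is just a name I made up for illustration, and these lines replace the two above rather than follow them):
. sum dres
. scalar dresmean = r(mean)
. gen dresavg = dresmean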
If you want a variable that contains the total number of observations you can use
. gen ntotal = _N
or with the more flexible egen command (handy, e.g., when dres has missing values, because it counts only the non-missing observations)
. egen ntotal = count(dres)
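The two versions of ntotal only agree when dres has no missing values: _N counts all rows of the data set, while count(dres) counts only the rows where dres is not missing. A quick way to check which situation you are in (just a sketch, not a required step):
. count if missing(dres)
If this reports 0, both commands give the same ntotal.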
There are plenty of ways to generate variables containing sample statistics. As for the centered variable, use
. gen cdres = dres - dresavg
or without even generating the variable containing the mean:
. sum dres
. gen cdres = dres - r(mean)
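For the interaction case mentioned at the top, here is a rough sketch of how the centered variable might then be used. Note that y (the outcome) and d (a 0/1 dummy) are hypothetical variables that are not part of the example above, and cdresXd is a name I chose for the product term. The simplest way is to generate the product term by hand:
. gen cdresXd = cdres * d
. regress y cdres d cdresXd
With dres centered, the coefficient on d is the difference between the two groups at the mean of dres rather than at dres = 0, which is usually the more meaningful comparison.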
PASW 18 (SPSS, you know)
Beware, long syntax ahead. Before you despair, there’s a simpler (but less flexible) solution below. The complicated approach starts with exporting the variable mean into a new data set. This data set is then merged with the master data set; a variable containing the mean for every observation will be attached.