What's the difference between gen and egen in Stata 12?

max · Oct 21, 2012

Is there a reason why there are two different commands to generate a new variable?

Is there a simple way to remember when to use gen and when to use egen?

Answer

griverorz · Oct 21, 2012

Both create a new variable, but they work with different sets of functions. You will typically use gen (short for generate) for simple transformations of other variables in your dataset, such as

gen newvar = oldvar1^2 * oldvar2

In my workflow, egen usually appears when I need a function that summarizes across all observations, as in

egen max_var = max(var)

or for row-wise operations across several variables, such as

egen newvar = rowmax(oldvar1 oldvar2)

which calculates, for each observation, the maximum of oldvar1 and oldvar2. I don't think there is a clear logic for separating the two commands.
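To make the contrast concrete, here is a minimal sketch of the three patterns above side by side. It assumes the auto example dataset that ships with Stata, and every new variable name (price_per_lb, max_price, max_price_grp, biggest, biggest2) is made up for illustration; none of this comes from the question itself.

```stata
* Assumption: the auto example dataset bundled with Stata
sysuse auto, clear

* gen: an expression evaluated one observation at a time
gen price_per_lb = price / weight

* egen: one summary across all observations, repeated on every row
egen max_price = max(price)

* the same summary computed separately within each group
bysort foreign: egen max_price_grp = max(price)

* egen: a summary across variables, within each observation
egen biggest = rowmax(headroom trunk)

* gen has its own max() function, which takes a comma-separated
* list of arguments and should give the same row-wise result here
gen biggest2 = max(headroom, trunk)
assert biggest == biggest2
```

The last two commands overlap in what they do, which fits the point that the split between gen and egen is more historical than logical. A rough rule of thumb: if the result for one observation depends on other observations or on a list of variables, look for an egen function first.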