I am going (1) to loop a regression over a certain criterion many times; and (2) to store a certain coefficient from each regression. Here is an example:
clear
sysuse auto.dta
local x = 2000
while `x' < 5000 {
xi: regress price mpg length gear_ratio i.foreign if weight < `x'
est sto model_`x'
local x = `x' + 100
}
est dir
I just care about one predictor, say mpg
here. I want to extract coefficients of mpg
from each result into one independent file (any file is OK, .dta
would be great) to see if there is a trend as the threshold for weight
increases. What I am doing now is to useestout
to export the results, something like:
esttab * using test.rtf, replace se stats(r2_a N, labels(R-squared)) starl(* 0.10 ** 0.05 *** 0.01) nogap onecell title(regression tables)
estout
will export everything and I need to edit the results. This works well for regressions with few predictors, but my real dataset has more than 30 variables and the regression will loop at least 100 times (I have a variable Distance
with range from 0 to 30,000: it has the role of weight
in the example). Therefore, it is really difficult for me to edit the results without making mistakes.
Is there any other efficient way to solve my problem? Since my case is not looping over a group variable, but over a certain criterion. the statsby
function seems not working well here.
As @Todd has already suggested, you can just choose the particular results you care about and use postfile
to store them as new variables in a new dataset. Note that a forval
loop is more direct than your while
code, while using xi:
is superseded by factor variable notation in recent versions of Stata. (I have not changed that just in case you are using some older version.) Note evaluation of saved results such as _b[_cons]
on the fly and the use of parentheses ()
to stop negative signs being evaluated. Some code examples elsewhere store results temporarily in local macros or scalars, which is quite unnecessary.
sysuse auto.dta, clear
tempname myresults
postfile `myresults' threshold intercept gradient se using myresults.dta
quietly forval x = 2000(200)4800 {
xi: regress price mpg length gear_ratio i.foreign if weight < `x'
post `myresults' (`x') (`=_b[_cons]') (`=_b[mpg]') (`=_se[mpg]')
}
postclose `myresults'
use myresults
list
+---------------------------------------------+
| thresh~d intercept gradient se |
|---------------------------------------------|
1. | 2000 -3699.55 -296.8218 215.0348 |
2. | 2200 -4175.722 -53.19774 54.51251 |
3. | 2400 -3918.388 -58.83933 42.19707 |
4. | 2600 -6143.622 -58.20153 38.28178 |
5. | 2800 -11159.67 -49.21381 44.82019 |
|---------------------------------------------|
6. | 3000 -6636.524 -51.28141 52.96473 |
7. | 3200 -7410.392 -58.14692 60.55182 |
8. | 3400 -2193.125 -57.89508 52.78178 |
9. | 3600 -1824.281 -103.4387 56.49762 |
10. | 3800 -1192.767 -110.9302 51.6335 |
|---------------------------------------------|
11. | 4000 5649.41 -173.9975 74.51212 |
12. | 4200 5784.363 -147.4454 71.89362 |
13. | 4400 6494.47 -93.81158 80.81586 |
14. | 4600 6494.47 -93.81158 80.81586 |
15. | 4800 5373.041 -95.25342 82.60246 |
+---------------------------------------------+
statsby
(a command, not a function) is just not designed for this problem at all, so it is not a question of whether it works well.