. * do-file for lecture 3b of VHM 812/802, Winter 2025
. version 18 /* works also with versions 14-17 */

. set more off

. set scheme stcolor_alt

. cd "r:\"
r:\

. 
. use daisy2, clear

. * illustration of linearity assessment, using daisy2 data
. * models have terrible residuals but we will ignore that...
. gen milk120k=milk120/1000
(1,288 missing values generated)

. lowess cf milk120k

. * extra code to add 95% range for x ~ Figure 15.3
. centile milk120k, centile(2.5 97.5)

                                                          Binom. interp.   
    Variable |       Obs  Percentile    Centile        [95% conf. interval]
-------------+-------------------------------------------------------------
    milk120k |     8,095        2.5     1.67574        1.643845    1.709898
             |                 97.5     4.48878        4.440558    4.542869

. twoway (scatter cf milk120k, xline(1.67574) xline(4.48878)) (lowess cf milk120k)

. * graphics based on residuals
. regress cf milk120k

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(1, 7718)      =      0.95
       Model |   764.55464         1   764.55464   Prob > F        =    0.3293
    Residual |  6199583.97     7,718  803.263018   R-squared       =    0.0001
-------------+----------------------------------   Adj R-squared   =   -0.0000
       Total |  6200348.53     7,719  803.258003   Root MSE        =    28.342

------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    milk120k |  -.4434752   .4545628    -0.98   0.329    -1.334542    .4475914
       _cons |   79.60298   1.379894    57.69   0.000     76.89801    82.30795
------------------------------------------------------------------------------

. predict stdres, rstandard
(1,663 missing values generated)

. lowess stdres milk120k

. * quadratic regression, VER Example 15.3
. regress cf c.milk120k##c.milk120k

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(2, 7717)      =     17.93
       Model |  28681.5187         2  14340.7594   Prob > F        =    0.0000
    Residual |  6171667.01     7,717  799.749515   R-squared       =    0.0046
-------------+----------------------------------   Adj R-squared   =    0.0044
       Total |  6200348.53     7,719  803.258003   Root MSE        =     28.28

---------------------------------------------------------------------------------------
                   cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
             milk120k |  -16.14046    2.69524    -5.99   0.000    -21.42386   -10.85706
                      |
c.milk120k#c.milk120k |    2.53519    .429095     5.91   0.000     1.694047    3.376333
                      |
                _cons |   102.5711   4.124112    24.87   0.000     94.48672    110.6555
---------------------------------------------------------------------------------------

. * categorisation with lintrend 
. * lintrend is an add-on command, install using: findit lintrend (pick sg50 option)
. lintrend cf milk120k, groups(5) plot(mean) // does not support linear relation
  
  
The mean of cf by categories of milk120k 
  
  (Note: 5 milk120k categories of equal sample size;
     Uses mean milk120k value for each category)

  +--------------------------------------------+
  | milk120k      min      max   total      cf |
  |--------------------------------------------|
  |      2.0     .989     2.35    1620   76.70 |
  |      2.6   2.3507   2.7474    1618   74.69 |
  |      2.9   2.7476   3.0963    1620   73.93 |
  |      3.3   3.0964   3.5314    1619   74.06 |
  |      4.0   3.5317   6.7128    1618   73.96 |
  +--------------------------------------------+

. * box-cox analysis for transforming x
. boxcox cf milk120k, model(rhs)

Fitting full model

Iteration 0:  Log likelihood = -36771.519  (not concave)
Iteration 1:  Log likelihood = -36770.154  (not concave)
Iteration 2:  Log likelihood = -36768.274  (not concave)
Iteration 3:  Log likelihood = -36758.817  
Iteration 4:  Log likelihood = -36758.585  
Iteration 5:  Log likelihood = -36757.787  
Iteration 6:  Log likelihood = -36757.787  
(1,288 missing values generated)
(1,288 missing values generated)
(1,288 missing values generated)

                                                  Number of obs   =      7,720
                                                  LR chi2(2)      =      28.42
Log likelihood = -36757.787                       Prob > chi2     =      0.000
 
------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     /lambda |  -4.119009   .6687904    -6.16   0.000    -5.429814   -2.808204
------------------------------------------------------------------------------
 
Estimates of scale-variant parameters
----------------------------
             | Coefficient
-------------+--------------
Notrans      |
       _cons |   116.5945
-------------+--------------
Trans        |
    milk120k |   -161.589
-------------+--------------
      /sigma |   28.28786
----------------------------

---------------------------------------------------------
   Test         Restricted     LR statistic
    H0:       log likelihood       chi2       Prob > chi2
---------------------------------------------------------
lambda = -1     -36766.658        17.74           0.000
lambda =  0     -36769.777        23.98           0.000
lambda =  1     -36771.519        27.46           0.000
---------------------------------------------------------

. generate milk120k_pow=-1/milk120k^4 // minus to keep order for x
(1,288 missing values generated)

. regress cf milk120k_pow

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(1, 7718)      =     28.43
       Model |   22754.594         1   22754.594   Prob > F        =    0.0000
    Residual |  6177593.93     7,718  800.413829   R-squared       =    0.0037
-------------+----------------------------------   Adj R-squared   =    0.0035
       Total |  6200348.53     7,719  803.258003   Root MSE        =    28.292

------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
milk120k_pow |  -38.06757   7.139665    -5.33   0.000    -52.06325   -24.07188
       _cons |   77.30789   .3713342   208.19   0.000     76.57997     78.0358
------------------------------------------------------------------------------

. predict stdres2, rstandard
(1,663 missing values generated)

. lowess stdres2 milk120k_pow

. * improvement over linear term alone, but not over quadratic model
. 
. * fractional polynomials, VER Example 15.4
. fp <milk120k>: regress cf <milk120k>
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
------------------------------------------------------------------------------
             | Test              Residual   Deviance
    milk120k |   df   Deviance   std. dev.      diff.       P   Powers
-------------+----------------------------------------------------------------
     omitted |    4   73543.99     28.342     45.690    0.000               
      linear |    3   73543.04     28.342     44.738    0.000   1           
       m = 1 |    2   73525.42     28.310     27.118    0.000   -2          
       m = 2 |    0   73498.30     28.262      0.000       --   -.5 0       
------------------------------------------------------------------------------
Note: Test df is degrees of freedom, and P = P > F is sig. level for tests
      comparing models vs. model with m = 2 based on deviance difference,
      F(df, 7715).

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(2, 7717)      =     22.90
       Model |  36587.3332         2  18293.6666   Prob > F        =    0.0000
    Residual |  6163761.19     7,717  798.725048   R-squared       =    0.0059
-------------+----------------------------------   Adj R-squared   =    0.0056
       Total |  6200348.53     7,719  803.258003   Root MSE        =    28.262

------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  milk120k_1 |   295.8967   46.01638     6.43   0.000     205.6922    386.1013
  milk120k_2 |   87.36896   14.07522     6.21   0.000     59.77771    114.9602
       _cons |  -189.8411   42.17274    -4.50   0.000    -272.5111   -107.1711
------------------------------------------------------------------------------

. fp plot, r(none) // estimated curve with 95% CI ~ Figure 15.7

. fp plot, r(residuals) // estimated curve with 95% CI and observed values

. fp plot, r(rstandard) // adding the standardised residuals does not make much sense

. br cf milk120k milk120k_1 milk120k_2

. di "-0.5 term:  " 1/sqrt(3.5058) " and 0 term: "ln(3.5058)
-0.5 term:  .53408014 and 0 term: 1.2544187

. * try with larger class of models
. capture drop milk120k_1 milk120k_2 // fp will not override these variables, unless replace option is us
> ed

. fp <milk120k>, powers(-4 -3 -2 -1 -.5 0 .5 1 2 3) dimension(3) : regress cf <milk120k>
(fitting 285 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
------------------------------------------------------------------------------
             | Test              Residual   Deviance
    milk120k |   df   Deviance   std. dev.      diff.       P   Powers
-------------+----------------------------------------------------------------
     omitted |    6   73543.99     28.342     45.741    0.000               
      linear |    5   73543.04     28.342     44.789    0.000   1           
       m = 1 |    4   73515.61     28.292     17.358    0.002   -4          
       m = 2 |    2   73498.30     28.262      0.052    0.974   -.5 0       
       m = 3 |    0   73498.25     28.263      0.000       --   .5 .5 .5    
------------------------------------------------------------------------------
Note: Test df is degrees of freedom, and P = P > F is sig. level for tests
      comparing models vs. model with m = 3 based on deviance difference,
      F(df, 7713).

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(3, 7716)      =     15.28
       Model |  36628.6805         3  12209.5602   Prob > F        =    0.0000
    Residual |  6163719.85     7,716  798.823204   R-squared       =    0.0059
-------------+----------------------------------   Adj R-squared   =    0.0055
       Total |  6200348.53     7,719  803.258003   Root MSE        =    28.263

------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  milk120k_1 |  -730.7391   394.4714    -1.85   0.064     -1504.01    42.53202
  milk120k_2 |   313.1601   182.0551     1.72   0.085    -43.71734    670.0375
  milk120k_3 |   -42.3349   30.02417    -1.41   0.159    -101.1904    16.52062
       _cons |   835.2254   399.5796     2.09   0.037     51.94089     1618.51
------------------------------------------------------------------------------

. * syntax for multivariable models
. capture drop milk120k_1 milk120k_2 milk120k_3

. fp <milk120k>: regress cf <milk120k> i.twin
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
------------------------------------------------------------------------------
             | Test              Residual   Deviance
    milk120k |   df   Deviance   std. dev.      diff.       P   Powers
-------------+----------------------------------------------------------------
     omitted |    4   73541.35     28.339     46.253    0.000               
      linear |    3   73540.28     28.339     45.192    0.000   1           
       m = 1 |    2   73522.22     28.306     27.124    0.000   -2          
       m = 2 |    0   73495.09     28.258      0.000       --   -.5 0       
------------------------------------------------------------------------------
Note: Test df is degrees of freedom, and P = P > F is sig. level for tests
      comparing models vs. model with m = 2 based on deviance difference,
      F(df, 7714).

      Source |       SS           df       MS      Number of obs   =     7,720
-------------+----------------------------------   F(3, 7716)      =     16.34
       Model |   39147.274         3  13049.0913   Prob > F        =    0.0000
    Residual |  6161201.25     7,716  798.496793   R-squared       =    0.0063
-------------+----------------------------------   Adj R-squared   =    0.0059
       Total |  6200348.53     7,719  803.258003   Root MSE        =    28.258

------------------------------------------------------------------------------
          cf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  milk120k_1 |   296.9481   46.01355     6.45   0.000     206.7491    387.1472
  milk120k_2 |    87.6043   14.07382     6.22   0.000     60.01579    115.1928
             |
        twin |
        yes  |   4.259369   2.378849     1.79   0.073     -.403821    8.922559
       _cons |  -190.7942   42.17007    -4.52   0.000     -273.459   -108.1294
------------------------------------------------------------------------------

. 
. * reliability, illustrated with Coleman data
. use coleman, clear

. regress y x1-x5

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(5, 14)        =     27.08
       Model |  582.686401         5   116.53728   Prob > F        =    0.0000
    Residual |  60.2378924        14   4.3027066   R-squared       =    0.9063
-------------+----------------------------------   Adj R-squared   =    0.8728
       Total |  642.924294        19  33.8381207   Root MSE        =    2.0743

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |  -1.793333   1.233397    -1.45   0.168    -4.438706    .8520392
          x2 |   .0436015   .0532589     0.82   0.427    -.0706275    .1578304
          x3 |   .5557601   .0929564     5.98   0.000     .3563884    .7551318
          x4 |   1.110168   .4337681     2.56   0.023     .1798279    2.040508
          x5 |  -1.810919    2.02739    -0.89   0.387    -6.159239      2.5374
       _cons |   19.94857   13.62755     1.46   0.165    -9.279627    49.17676
------------------------------------------------------------------------------

. predict res, residual

. predict lev, hat

. generate temp=(res/(1-lev))^2

. total temp // PRESS statistic

Total estimation                            Number of obs = 20

--------------------------------------------------------------
             |      Total   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
        temp |   117.7791   56.41735     -.3037369     235.862
--------------------------------------------------------------

. display "Predictive R^2 = " 1- 117.7791/642.924294
Predictive R^2 = .8168072

. * drop in R^2 from 90.6% to 81.7% ~ substantial (9%) shrinkage
. 
. * model fit statistics
. regress y x1-x5

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(5, 14)        =     27.08
       Model |  582.686401         5   116.53728   Prob > F        =    0.0000
    Residual |  60.2378924        14   4.3027066   R-squared       =    0.9063
-------------+----------------------------------   Adj R-squared   =    0.8728
       Total |  642.924294        19  33.8381207   Root MSE        =    2.0743

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |  -1.793333   1.233397    -1.45   0.168    -4.438706    .8520392
          x2 |   .0436015   .0532589     0.82   0.427    -.0706275    .1578304
          x3 |   .5557601   .0929564     5.98   0.000     .3563884    .7551318
          x4 |   1.110168   .4337681     2.56   0.023     .1798279    2.040508
          x5 |  -1.810919    2.02739    -0.89   0.387    -6.159239      2.5374
       _cons |   19.94857   13.62755     1.46   0.165    -9.279627    49.17676
------------------------------------------------------------------------------

. estat ic

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
           . |         20  -63.08172  -39.40446       6   90.80893   96.78332
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] IC note.

. regress y x3 x4

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(2, 17)        =     66.95
       Model |  570.497872         2  285.248936   Prob > F        =    0.0000
    Residual |  72.4264222        17  4.26037777   R-squared       =    0.8873
-------------+----------------------------------   Adj R-squared   =    0.8741
       Total |  642.924294        19  33.8381207   Root MSE        =    2.0641

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x3 |   .5415607   .0500441    10.82   0.000     .4359768    .6471445
          x4 |   .7498915   .3666402     2.05   0.057    -.0236517    1.523435
       _cons |    14.5827   9.175409     1.59   0.130    -4.775723    33.94112
------------------------------------------------------------------------------

. estat ic

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
           . |         20  -63.08172  -41.24716       3   88.49432   91.48152
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] IC note.

. 
. * comparison of predictor effects
. centile x1-x5, centile(25 75)

                                                          Binom. interp.   
    Variable |       Obs  Percentile    Centile        [95% conf. interval]
-------------+-------------------------------------------------------------
          x1 |        20         25      2.4425        2.091335    2.608275
             |                   75       3.025        2.692345    3.535893
          x2 |        20         25       17.55        11.60616    29.29372
             |                   75     67.7025        36.25481    77.36343
          x3 |        20         25      -.7625       -12.84819    3.901728
             |                   75      11.895        6.221725     14.2645
          x4 |        20         25       24.51        23.51092       25.01
             |                   75        25.7        25.08819        26.6
          x5 |        20         25        5.66        5.342362     6.17354
             |                   75        6.92         6.24761    7.099384

. regress y x1-x5

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(5, 14)        =     27.08
       Model |  582.686401         5   116.53728   Prob > F        =    0.0000
    Residual |  60.2378924        14   4.3027066   R-squared       =    0.9063
-------------+----------------------------------   Adj R-squared   =    0.8728
       Total |  642.924294        19  33.8381207   Root MSE        =    2.0743

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |  -1.793333   1.233397    -1.45   0.168    -4.438706    .8520392
          x2 |   .0436015   .0532589     0.82   0.427    -.0706275    .1578304
          x3 |   .5557601   .0929564     5.98   0.000     .3563884    .7551318
          x4 |   1.110168   .4337681     2.56   0.023     .1798279    2.040508
          x5 |  -1.810919    2.02739    -0.89   0.387    -6.159239      2.5374
       _cons |   19.94857   13.62755     1.46   0.165    -9.279627    49.17676
------------------------------------------------------------------------------

. lincom x1*(3.025-2.4425)

 ( 1)  .5825*x1 = 0

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -1.044617   .7184535    -1.45   0.168    -2.585546    .4963128
------------------------------------------------------------------------------

. lincom x2*(67.7025-17.55)

 ( 1)  50.1525*x2 = 0

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   2.186723   2.671067     0.82   0.427    -3.542145    7.915591
------------------------------------------------------------------------------

. lincom x3*(11.895+0.7625)

 ( 1)  12.6575*x3 = 0

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   7.034533   1.176596     5.98   0.000     4.510986     9.55808
------------------------------------------------------------------------------

. lincom x4*(25.7-24.51)

 ( 1)  1.19*x4 = 0

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |     1.3211    .516184     2.56   0.023     .2139952    2.428204
------------------------------------------------------------------------------

. lincom x5*(6.92-5.66)

 ( 1)  1.26*x5 = 0

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -2.281758   2.554512    -0.89   0.387    -7.760641    3.197124
------------------------------------------------------------------------------

. * x3 has by far the largest effect on the outcome (also most significant, but a different thing)
. 
. * automated variable selection procedures
. use coleman, clear

. * forward selection with significance level 0.05
. stepwise, pe(0.05): regress y x1-x5

Wald test, begin with empty model:
p = 0.0000 <  0.0500, adding x3

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(1, 18)        =    110.23
       Model |  552.675528         1  552.675528   Prob > F        =    0.0000
    Residual |  90.2487661        18  5.01382034   R-squared       =    0.8596
-------------+----------------------------------   Adj R-squared   =    0.8518
       Total |  642.924294        19  33.8381207   Root MSE        =    2.2392

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x3 |   .5603254   .0533691    10.50   0.000     .4482012    .6724497
       _cons |    33.3228   .5279987    63.11   0.000     32.21351    34.43208
------------------------------------------------------------------------------

. * backward elimination with significance level 0.05
. stepwise, pr(0.05): regress y x1-x5

Wald test, begin with full model:
p = 0.4267 >= 0.0500, removing x2
p = 0.6863 >= 0.0500, removing x5
p = 0.1616 >= 0.0500, removing x1
p = 0.0566 >= 0.0500, removing x4

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(1, 18)        =    110.23
       Model |  552.675528         1  552.675528   Prob > F        =    0.0000
    Residual |  90.2487661        18  5.01382034   R-squared       =    0.8596
-------------+----------------------------------   Adj R-squared   =    0.8518
       Total |  642.924294        19  33.8381207   Root MSE        =    2.2392

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x3 |   .5603254   .0533691    10.50   0.000     .4482012    .6724497
       _cons |    33.3228   .5279987    63.11   0.000     32.21351    34.43208
------------------------------------------------------------------------------

. * stepwise approach with significance level 0.05, starting from maximal model
. stepwise, pe(0.049999) pr(0.05): regress y x1-x5 // note: pe<pr

Wald test, begin with full model:
p = 0.4267 >= 0.0500, removing x2
p = 0.6863 >= 0.0500, removing x5
p = 0.1616 >= 0.0500, removing x1
p = 0.0566 >= 0.0500, removing x4

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(1, 18)        =    110.23
       Model |  552.675528         1  552.675528   Prob > F        =    0.0000
    Residual |  90.2487661        18  5.01382034   R-squared       =    0.8596
-------------+----------------------------------   Adj R-squared   =    0.8518
       Total |  642.924294        19  33.8381207   Root MSE        =    2.2392

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x3 |   .5603254   .0533691    10.50   0.000     .4482012    .6724497
       _cons |    33.3228   .5279987    63.11   0.000     32.21351    34.43208
------------------------------------------------------------------------------

. * all algorithms lead to same model (not always the case!)
. 
. use daisy2red, clear

. gen calv_mth=month(calv_dt)

. gen aut_calv=(calv_mth>=2 & calv_mth<=7) if !missing(calv_mth)

. gen rootwpc=sqrt(wpc)

. stepwise, pe(0.049999) pr(0.05): regress rootwpc parity c.herd_size##c.herd_size c.mwp i.twin##i.dyst i
> .rp##vag_disch // resulting model not meaningful
note: 0b.twin omitted because of estimability.
note: 0b.dyst omitted because of estimability.
note: 0o.twin#0o.dyst omitted because of estimability.
note: 0o.twin#1o.dyst omitted because of estimability.
note: 1o.twin#0o.dyst omitted because of estimability.
note: 0b.rp omitted because of estimability.
note: 0b.vag_disch omitted because of estimability.
note: 0o.rp#0o.vag_disch omitted because of estimability.
note: 0o.rp#1o.vag_disch omitted because of estimability.
note: 1o.rp#0o.vag_disch omitted because of estimability.

Wald test, begin with full model:
p = 0.9745 >= 0.0500, removing 1.vag_disch
p = 0.3571 >= 0.0500, removing parity
p = 0.2028 >= 0.0500, removing mwp
p = 0.1144 >= 0.0500, removing 1.rp
p = 0.1047 >= 0.0500, removing 1.twin#1.dyst
p = 0.0759 >= 0.0500, removing 1.dyst

      Source |       SS           df       MS      Number of obs   =     1,574
-------------+----------------------------------   F(4, 1569)      =     31.17
       Model |   979.60429         4  244.901072   Prob > F        =    0.0000
    Residual |  12326.6626     1,569  7.85638149   R-squared       =    0.0736
-------------+----------------------------------   Adj R-squared   =    0.0713
       Total |  13306.2668     1,573  8.45916519   Root MSE        =    2.8029

-----------------------------------------------------------------------------------------
                rootwpc | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------------+----------------------------------------------------------------
           rp#vag_disch |
               yes#yes  |   1.928071   .5207635     3.70   0.000     .9066051    2.949537
                        |
              herd_size |  -.0207276   .0083839    -2.47   0.014    -.0371725   -.0042827
                        |
c.herd_size#c.herd_size |   .0000657   .0000173     3.80   0.000     .0000318    .0000997
                        |
                   twin |
                   yes  |   1.413782   .5492526     2.57   0.010     .3364358    2.491129
                  _cons |   8.516599   .9733102     8.75   0.000     6.607473    10.42572
-----------------------------------------------------------------------------------------

. * avoid splitting terms from a combined effect by (), always include first term
. stepwise, pe(0.049999) pr(0.05) lockterm1: regress rootwpc parity (c.herd_size##c.herd_size) c.mwp (i.t
> win##i.dyst) (i.rp##vag_disch) // now twin#dyst stays in although non-significant
note: 0b.twin omitted because of estimability.
note: 0b.dyst omitted because of estimability.
note: 0b.twin#0b.dyst omitted because of estimability.
note: 0b.twin#1o.dyst omitted because of estimability.
note: 1o.twin#0b.dyst omitted because of estimability.
note: 0b.rp omitted because of estimability.
note: 0b.vag_disch omitted because of estimability.
note: 0b.rp#0b.vag_disch omitted because of estimability.
note: 0b.rp#1o.vag_disch omitted because of estimability.
note: 1o.rp#0b.vag_disch omitted because of estimability.

Wald test, begin with full model:
p = 0.2194 >= 0.0500, removing mwp

      Source |       SS           df       MS      Number of obs   =     1,574
-------------+----------------------------------   F(9, 1564)      =     14.92
       Model |  1052.07609         9  116.897343   Prob > F        =    0.0000
    Residual |  12254.1908     1,564  7.83516033   R-squared       =    0.0791
-------------+----------------------------------   Adj R-squared   =    0.0738
       Total |  13306.2668     1,573  8.45916519   Root MSE        =    2.7991

-----------------------------------------------------------------------------------------
                rootwpc | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------------+----------------------------------------------------------------
                 parity |   .0471693   .0481253     0.98   0.327    -.0472276    .1415661
              herd_size |  -.0224056   .0084408    -2.65   0.008    -.0389621   -.0058491
                        |
c.herd_size#c.herd_size |   .0000696   .0000175     3.99   0.000     .0000354    .0001039
                        |
                     rp |
                   yes  |   .4330348     .26975     1.61   0.109     -.096075    .9621446
                        |
              vag_disch |
                   yes  |   .0135432   .4019532     0.03   0.973    -.7748808    .8019672
                        |
           rp#vag_disch |
               yes#yes  |    1.46987   .7023592     2.09   0.037     .0922056    2.847535
                        |
                   twin |
                   yes  |   1.603972   .5858987     2.74   0.006     .4547421    2.753202
                        |
                   dyst |
                   yes  |   .6309143   .3111981     2.03   0.043     .0205049    1.241324
                        |
              twin#dyst |
               yes#yes  |  -2.795651   1.745789    -1.60   0.109    -6.219985    .6286827
                        |
                  _cons |   8.478238   .9899599     8.56   0.000     6.536449    10.42003
-----------------------------------------------------------------------------------------

. stepwise, pe(0.049999) pr(0.05) lockterm1: regress rootwpc parity (c.herd_size##c.herd_size) c.mwp i.tw
> in##i.dyst (i.rp##vag_disch) // the model we already had 
note: 0b.twin omitted because of estimability.
note: 0b.dyst omitted because of estimability.
note: 0o.twin#0o.dyst omitted because of estimability.
note: 0o.twin#1o.dyst omitted because of estimability.
note: 1o.twin#0o.dyst omitted because of estimability.
note: 0b.rp omitted because of estimability.
note: 0b.vag_disch omitted because of estimability.
note: 0b.rp#0b.vag_disch omitted because of estimability.
note: 0b.rp#1o.vag_disch omitted because of estimability.
note: 1o.rp#0b.vag_disch omitted because of estimability.

Wald test, begin with full model:
p = 0.2194 >= 0.0500, removing mwp
p = 0.1095 >= 0.0500, removing 1.twin#1.dyst
p = 0.0759 >= 0.0500, removing 1.dyst

      Source |       SS           df       MS      Number of obs   =     1,574
-------------+----------------------------------   F(7, 1566)      =     18.32
       Model |  1007.23309         7  143.890441   Prob > F        =    0.0000
    Residual |  12299.0338     1,566  7.85378911   R-squared       =    0.0757
-------------+----------------------------------   Adj R-squared   =    0.0716
       Total |  13306.2668     1,573  8.45916519   Root MSE        =    2.8025

-----------------------------------------------------------------------------------------
                rootwpc | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------------+----------------------------------------------------------------
                 parity |   .0384512    .047746     0.81   0.421    -.0552016    .1321039
              herd_size |  -.0213327   .0084356    -2.53   0.012     -.037879   -.0047863
                        |
c.herd_size#c.herd_size |    .000067   .0000174     3.84   0.000     .0000328    .0001011
                        |
                     rp |
                   yes  |   .4636612   .2694853     1.72   0.086    -.0649288    .9922512
                        |
              vag_disch |
                   yes  |   .1276652    .399162     0.32   0.749    -.6552831    .9106136
                        |
           rp#vag_disch |
               yes#yes  |   1.377465    .700519     1.97   0.049     .0034113    2.751519
                        |
                   twin |
                   yes  |   1.309327   .5523056     2.37   0.018      .225991    2.392664
                  _cons |   8.442287   .9910022     8.52   0.000     6.498456    10.38612
-----------------------------------------------------------------------------------------

.