Pseudo SAS Code for Cox Model

Introduction

The following pseudo code assumes two arms, treatment vs control. When there are multiple treatment groups vs one control group, for each run of the code, only include one treatment vs the control.

The relevant SAP texts are as follows:

Cox regression with multiple baseline variables

Multivariable Cox regression analysis will be carried out to assess and adjust the treatment effect for relevant baseline factors of potential prognostic impact. A stepwise selection procedure will serve to identify explanatory variables of potential prognostic values additional to the randomization strata which will be included in all models during the selection procedure. The Cox's Proportional Hazard model is defined as:

\[ h(t) = h(0;t) e^{Xb} \]

where \(h(0;t)\) defines the baseline hazard function and \(X\) defines the vector of explanatory variables and b the unknown vector of regression parameters.

In the stepwise selection procedure, variables are entered into and removed from the model in such a way that each forward selection step can be followed by one or more backward elimination steps. The stepwise selection process terminates if no further variable can be added to the model or if the variable just entered into the model is the only variable removed in the subsequent backward elimination. The level of significance for an explanatory variable to enter the model is set to 0.15 (p-value of Score test) and the significance level for removing it is set to 0.40 (p-value of Wald test). This analysis will be performed using the stepwise selection method in SAS (Proc PHREG). Once this procedure stops, the factor treatment arm will be added to the last selected model in order to evaluate the effect of treatment on OS time when adjusted for the selected explanatory variables. The hazard ratios of all selected explanatory variables and of treatment effects will be reported including 2-sided 95% CIs. No interactions will be considered. Post-baseline factors will not be considered for the model.

Subset analyses

To assess the heterogeneity of treatment effects for PFS and OS across the subgroup levels, two Cox regression model will be fitted with PFS or OS, respectively, as the dependent variable and subgroup, treatment, and with and without the treatment-by-subgroup interaction as explanatory variables.

  • Model 1: factors + treatment + subgroup
  • Model 2: factors + treatment + subgroup + treatment \(\times\) subgroup-variable

A p-value for the interaction test (Likelihood Ratio test) will be provided together with the HR and corresponding 95% CI of the interaction model parameter.

Multivariable Cox regression

This is a 2 step process.

Step 1 – select the variables

1
2
3
4
5
6
7
8
9
10
proc phreg data= ADTTE;
class gender stage (ref='Stage 1');
model TTE * CENSOR (value_indicating_censored_observation) = stage age gender/
alpha=0.05 RL ties=discrete
selection=stepwise
slentry=0.15
slstay=0.4
include=1
details;
run;

Option selection specifies the method to conduct the model selection, adjust according to SAP.

Option slentry is p-value entry threshold for a variable, adjust according to SAP.

Option slstay is p value stay threshold (not to be removed from the model once in) for a variable, adjust according to SAP.

Option include is how many variables counting from the left after the equal sign to always keep in the model.

In this example stage is the stratification variable, and include=1 will always keep it in the model. If you have multiple stratification variables, then add them as the first x variables after the equal sign, and modify the value for include=x. Example:

1
model TTE\*CENSOR (0)= strat1 strat2 strat3 OtherVar1 OtherVar2 OtherVar3/ include=3;

Option detail gives the model selection steps in the output.

Note treatment is not included in this step.

Step 2 – final model to report

After you got the final variable selections, run the following model (adding treatment here).

1
2
3
4
proc phreg data= ADTTE;
class TREATMENT (ref=value_for_control_arm) gender stage (ref='Stage 1');
model TTE * CENSOR (value_indicating_censored_observation) = TREATMENT SelectedVariable1 SelectedVariable2 / alpha=0.05 RL ties=discrete ;
run;

The following table shows the outputs.

Interaction test in Cox model

The p-value for the interaction test (Likelihood Ratio test) is requested via the type1 option in the model statement. Note the sequence of parameters must have the interaction term at the right most position (see the green text for sequence of parameters).

This code is run for each factor separately, e.g. run the model for age, run a separate model for gender and so on.

1
2
3
4
proc phreg data=ADTTE;
class TREATMENT (ref=value_for_control_arm);
model TTE * CENSOR (value_indicating_censored_observation) = TREATMENT VAR1 TREATMENT* VAR1 / alpha=0.05 RL ties=discrete type1;
run;

Pick the yellow highlighted p-value.

Likelihood Ratio Statistics for Type 1 Analysis
Source-2 Log LDFLR Chi-SquarePr > ChiSq
(Without Covariates)3430.3836...
arm3403.7614126.6222<.0001
age3403.652010.10940.7409
age*arm3397.579916.07210.0137

HR for the interaction term = exponential (Parameter_Estimate).

Lower= Parameter_Estimate – 1.96* Standard Error

Upper= Parameter_Estimate + 1.96* Standard Error

Lower bound of 95% CI for HR = exponential (Lower)

Upper bound of 95% CI for HR = exponential (Upper)

In the example below:

HR = exp(-0.00848) = 0.992

Lower = -0.00848 – 1.96*0.00959 = -0.02728

Upper= -0.00848 + 1.96*0.00959 = 0.010316

Lower bound of 95% CI for HR = 0.973

Upper bound of 95% CI for HR = 1.010

If the covariate is categorical and it has more than 2 levels (including the reference level), then there will still be only one p value for the interaction term. However, you will have multiple rows for the Parameter_Estimate and Standard_Error. Using the instructions above, get the HR and 95% CI for each. Example below shows a variable, stage, with 3 levels (level 1 is reference).

The p-value output:

Likelihood Ratio Statistics for Type 1 Analysis
Source-2 Log LDFLR Chi-SquarePr > ChiSq
(Without Covariates)3376.5810...
arm3344.8694131.7116<.0001
stage3343.840721.02870.5979
arm*stage3341.429022.41160.2994

The HR and 95% CI: