TransWikia.com

Output the linear predictor from a stratified cox model?

Bioinformatics Asked on December 13, 2021

I have fitted a cox model on a pooled dataset of multiple studies, say studies A, B, and C. As these studies have different baseline hazard functions, I stratified for ‘study’ in the model.

Like this:

df$SurvObj <- with(df, Surv(event_rd, event == 1))

fit <- coxph(SurvObj ~ cov1 + cov2 + cov3 + cov4 + strata(study),data=df)

Now, I want to assess whether I can use the beta coefficients from the above model to predict the event probability in a new study, study D.
The baseline hazard in study D is different from those in studies A, B, and C.

When I have

predict(fit,type="lp",newdata=studyD,reference="strata")

I get this error:

Error in model.frame.default(data = studyD, formula = ~hba1c + sbp +
: factor strata(study) has new levels study=4

Why does R require that I match the strata of study D with those in study A, B, and/or C when I only want to output the linear predictor (type="lp")? As the linear predictor is independent of the baseline hazard function in this case.

I suppose it should be possible to extract the linear predictor for individuals in study D and then manually calculate the event probability using the baseline hazard of study D.

Does anybody know how to do this?
I could not find any question like this so maybe I am thinking the wrong way.

One Answer

Unlike wikipedia, the linear predictor (LP) in coxph is made dependent from the baseline hazard function $lambda$. For example, if a model has two predictors x1, x2 stratified by sex, the linear predictor would be like: $$LP = log ( lambda _{sex}) + beta _{1}x _{1}+beta _{2}x _{2}$$

Where the $log(lambda _{sex})$ can be obtained by predict() with x1=x2=0.

We could calculate the LP without strata information (sex) by hand using the $beta _{1} , beta _{2}$ provided by coef(fit) and a customised $lambda _{unkown , sex}$.

Below is a simulation comparing the results from predict() and by hand coef(fit).

## Toy data
df <- list(time=c(4,3,1,1,2,2,3), 
          status=c(1,1,1,0,1,1,0), 
          x1=c(0,2,1,1,1,0,0), 
          x2=1:7,
          sex=c(0,0,0,0,1,1,1)) 

## Fit a stratified model 
fit <- coxph(Surv(time, status) ~ x1 + x2 + strata(sex), df) 
coef(fit)
#         x1          x2 
# 0.79451881 -0.01917633

## Baseline linear predictor
lambda_sex1 <- predict(fit, newdata=list(x1=0, x2=0, sex=0))
# -0.746578
lambda_sex2 <- predict(fit, newdata=list(x1=0, x2=0, sex=1))
# -0.1497816

## by predict function
predict(fit, newdata=list(x1=1, x2=2, sex=0))
# 0.009588167
predict(fit, newdata=list(x1=1, x2=2, sex=1))
# 0.6063845

## by hand
lambda_sex1 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.009588167
lambda_sex2 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.6063845

Finally, to calculate "a" linear predictor for an unknown sex:

# prevented by predict function
predict(fit, newdata=list(x1=1, x2=2, sex=2))
#Error in model.frame.default(data = list(x1 = 1, x2 = 2, sex = 2), formula = ~x1 +  : 
# factor strata(sex) has new level sex=2

# by hand, assuming lambda for sex 2 is 0
0 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.7561661

Answered by Ryan SY Kwan on December 13, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP