ABSTRACT

This chapter reviews small-sample methods for inference about odds ratios in logistic regression models for case-control data. In logistic regression, the maximum likelihood estimator (MLE) of the regression parameters is biased away from zero, particularly when the number of parameters is large relative to the sample size [23]. In addition, the first-order normal approximation to the distribution of the MLE becomes unreliable in small samples [e.g., Mehta and Patel, 1995]. Together, these two factors lead to biased likelihood-based tests and biased point and interval estimators in small samples. Another problem in small or sparse data sets is nonexistence of the MLE due to separation [3]. By “separation,” we mean that there is a linear combination of the covariates that perfectly distinguishes cases from controls. When the data are separated, the likelihood is not maximized at any finite value of the parameters, and so the MLE does not exist.
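To make the effect of separation concrete, the following minimal sketch evaluates the logistic log-likelihood on a small, entirely hypothetical data set in which the sign of a single covariate perfectly distinguishes cases from controls. The data, the no-intercept model, and the function name log_likelihood are assumptions chosen for illustration and are not taken from the chapter. Under these assumptions the log-likelihood keeps increasing toward zero as the coefficient grows, so no finite maximizer exists.

```python
import math


def log_likelihood(beta, xs, ys):
    """Logistic log-likelihood for one covariate and no intercept (illustrative)."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Signed margin: positive when the model favours the observed label.
        s = beta * x if y == 1 else -beta * x
        # log P(y | x) = -log(1 + exp(-s)), evaluated in a numerically stable way.
        total -= max(-s, 0.0) + math.log1p(math.exp(-abs(s)))
    return total


# Hypothetical toy data: every case (y = 1) has x > 0, every control (y = 0) has x < 0,
# so the covariate separates cases from controls perfectly.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

betas = [0.0, 1.0, 5.0, 25.0, 125.0]
values = [log_likelihood(b, xs, ys) for b in betas]

for b, v in zip(betas, values):
    print(f"beta = {b:6.1f}   log-likelihood = {v:.3e}")
```

Each printed value is larger than the one before it yet remains strictly negative: the supremum of the log-likelihood is zero, and it is approached only as the coefficient tends to infinity.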