| Abstract: | We devise methods for estimating the parameters of a prospective logistic model with dichotomous response D and arbitrary covariates X from case-control data when these covariates are measured with error. We suppose that some fraction of the cases and controls provide only the error-prone covariate measurements, W (the "incomplete" or "reduced" data), whereas others provide measurements on both X and W (the "complete" data). We assume a measurement error density with a finite set of parameters $\alpha$, namely $f_{W \mid X, D}(w \mid x, d, \alpha)$, and nondifferential error is treated as a special case of this model, $f_{W \mid X}(w \mid x, \alpha)$. Our algorithm estimates both the logistic parameters and $\alpha$ from a pseudolikelihood. Because empirical distribution functions are used in place of the needed distributions in the pseudolikelihoods, the required asymptotic theory is more elaborate than for pseudolikelihoods based on substitution for a finite number of nuisance parameters. We also examine computationally simpler methods under the assumptions that the disease is rare and that errors are nondifferential. Estimates of $m(W) = E(X \mid W)$ are substituted for X in the logistic model when X is not available. Such estimates of $m(W)$ can be obtained from the complete data described above or from an independent validation study. If measurements on X are not available, $m(W)$ can still be estimated from replicated W measurements in some circumstances. A final approach uses approximate logistic regression techniques and is appropriate when a more accurate approximation is required than is obtained by simply substituting $m(W)$ for X. Asymptotic theory is presented for each of these procedures, and examples are used to illustrate the calculations. |
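
The simpler substitution method mentioned in the abstract (replace X by an estimate of m(W) = E(X | W) when X is missing) can be illustrated with a short sketch. This is not the authors' procedure: it assumes scalar X and W, assumes m(W) is linear in W and fits it by least squares on the complete data, and uses a plain Newton-Raphson logistic fit. The function names `fit_calibration`, `substitute`, and `fit_logistic` are hypothetical, and the sketch does not compute the adjusted standard errors that the asymptotic theory would supply.

```python
import numpy as np


def fit_calibration(x_complete, w_complete):
    """Least-squares fit of X on W in the complete data.

    Assumption for this sketch: m(W) = E(X | W) is linear, a + b * W.
    Returns the pair (a, b).
    """
    design = np.column_stack([np.ones(len(w_complete)), w_complete])
    coef, *_ = np.linalg.lstsq(design, x_complete, rcond=None)
    return coef


def substitute(x, w, coef):
    """Keep X where it was measured; use the fitted m(W) where X is NaN."""
    m_hat = coef[0] + coef[1] * w
    return np.where(np.isnan(x), m_hat, x)


def fit_logistic(z, d, n_iter=25):
    """Newton-Raphson fit of logit P(D = 1 | z) = b0 + b1 * z."""
    design = np.column_stack([np.ones(len(z)), z])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (d - p)                       # score vector
        hess = design.T @ (design * (p * (1 - p))[:, None])  # information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


if __name__ == "__main__":
    # Simulated data, purely illustrative (all values are assumptions).
    rng = np.random.default_rng(0)
    n = 2000
    x_true = rng.normal(size=n)
    w = x_true + rng.normal(scale=0.5, size=n)          # nondifferential error
    d = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x_true))))

    # Only the first quarter of subjects supply X (the "complete" data).
    x_obs = np.where(np.arange(n) < n // 4, x_true, np.nan)
    complete = ~np.isnan(x_obs)

    coef = fit_calibration(x_obs[complete], w[complete])
    z = substitute(x_obs, w, coef)
    print("naive fit on W:      ", fit_logistic(w, d))
    print("fit with m(W) for X: ", fit_logistic(z, d))
```

In a case-control sample the fitted intercept does not estimate the population intercept, so only the slope would be of interest in practice; the simulation here is drawn prospectively just to keep the example short.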