This paper considers the problem of endogenous sampling in duration models. The problem is important in the duration analysis of bank failures and loan defaults, where researchers commonly use only the default sample, only the non-default sample, or both combined at a certain ratio, rather than a random sample. The properties of endogenous sampling have been studied in various models, notably qualitative response models, but not, as far as I am aware, in duration models. In this paper, I derive the asymptotic distribution of the endogenous sampling maximum likelihood estimator, compare it with that of the random sampling maximum likelihood estimator, and indicate when an efficiency gain may result. I also show that the random sampling maximum likelihood estimator is inconsistent when the data are collected by endogenous sampling.
Keywords: Duration models; Endogenous sampling; Bank failure; Loan default; Insolvency; Maximum likelihood estimator; Asymptotic distribution
Views expressed in the paper are those of the author and do not necessarily reflect those of the Bank of Japan or the Institute for Monetary and Economic Studies.