Study on Parameter Estimation via Multistage Sampling with Applications
Bilson Darku, Francis
Over the past few decades, researchers have been (and are still being) encouraged to report confidence intervals along with their parameter estimates, instead of just the binary outcome of a hypothesis test based on an arbitrary cutoff value for the p-value (most often α = 5%). However, researchers traditionally fix their sample sizes before sampling is done, which can naturally lead to wide confidence intervals. Ceteris paribus, wider confidence intervals indicate higher uncertainty, and this discourages researchers from reporting them. As a remedy, sample size planning methods such as accuracy in parameter estimation and power analysis were developed. These methods seek to determine the sample size needed to obtain sufficiently narrow confidence intervals (accuracy in parameter estimation) or sufficiently high power (power analysis). One drawback of these methods is that they require researchers to supply the values of some population parameters, which are generally unknown in advance. Using supposed population values that differ from the true population values results in wrong sample size calculations, and incorrect sample sizes in turn lead to incorrect inferences or decisions. Another drawback of these traditional methods is that they assume a particular distribution from which the data are sampled. There is no reason to assume that data will follow a particular distribution, say normal, in every situation. To overcome these restrictive assumptions, multi-stage procedures, which have been around for more than half a century, can be used. We therefore develop multi-stage sampling procedures for constructing sufficiently narrow confidence intervals for parameters, with a pre-specified confidence level and a pre-specified upper bound on the width of the confidence interval.
We do this for a general class of effect sizes, different types of correlation measures, and the Gini index. Our methods require no knowledge of population parameters or of the distribution from which the data are sampled; that is, they work in a distribution-free environment. In our procedure, the sample size needed to obtain a sufficiently narrow confidence interval is not specified a priori. Rather, after a pilot sample is obtained, a stopping rule determines whether additional observations are needed. We provide theorems with their proofs to support our procedures and demonstrate their characteristics with Monte Carlo simulations. In the case of the Gini index, we also provide an application to the 64th National Sample Survey in India.
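To illustrate the general idea of a stopping rule, here is a minimal Python sketch of a textbook-style purely sequential procedure for a fixed-width confidence interval. It is not the procedure developed in this work: it targets the population mean rather than an effect size, a correlation, or the Gini index, and it relies on a normal approximation. The function name `sequential_mean_ci`, the pilot size, the `1 / n` correction term, and the `max_n` safety cap are all assumptions made for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def sequential_mean_ci(draw, omega, conf_level=0.95, pilot_size=10, max_n=100_000):
    """Sample one observation at a time until the interval for the mean is narrow enough.

    draw       -- zero-argument callable returning one new observation (hypothetical interface)
    omega      -- pre-specified upper bound on the width of the confidence interval
    conf_level -- pre-specified confidence level
    pilot_size -- size of the initial pilot sample (illustrative choice, must be >= 2)
    max_n      -- safety cap on the sample size (illustrative, not part of the rule)
    """
    if pilot_size < 2:
        raise ValueError("pilot_size must be at least 2 to estimate the variance")

    # Standard normal quantile for a two-sided interval.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

    # Stage 1: pilot sample.
    sample = [draw() for _ in range(pilot_size)]

    # Later stages: keep sampling while the stopping rule is not yet satisfied.
    while len(sample) < max_n:
        n = len(sample)
        s2 = statistics.variance(sample)  # sample variance from the data seen so far
        # Stop once n >= (2z / omega)^2 * (s^2 + 1/n); the 1/n term guards
        # against stopping too early when the variance estimate is tiny.
        if n >= (2 * z / omega) ** 2 * (s2 + 1 / n):
            break
        sample.append(draw())

    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = z * math.sqrt(statistics.variance(sample) / n)
    return n, (mean - half_width, mean + half_width)


# Example: skewed (exponential) data, with no population variance supplied in advance.
rng = random.Random(2024)
final_n, (lower, upper) = sequential_mean_ci(lambda: rng.expovariate(1.0), omega=0.5)
print(final_n, lower, upper)
```

The final sample size is itself random: it depends on the variability observed in the data rather than on a supposed population value, which is the property the stopping rule is meant to deliver. Whenever the loop ends through the stopping rule rather than the `max_n` cap, the resulting interval is no wider than `omega`.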