Study on Parameter Estimation via Multistage Sampling with Applications







Over the past few decades, researchers have been encouraged, and still are, to report confidence intervals along with their parameter estimates instead of just the binary outcome of a hypothesis test based on an arbitrary cut-off value for the p-value (mostly α = 5%). However, researchers traditionally fix their sample sizes before sampling is done, which may naturally lead to wide confidence intervals. Ceteris paribus, wider confidence intervals indicate higher uncertainty, and this discourages researchers from reporting them. As a remedy, sample size planning methods, such as accuracy in parameter estimation and power analysis, were developed. These methods seek to determine the sample sizes needed to obtain sufficiently narrow confidence intervals, in the case of accuracy in parameter estimation, or high power, in the case of power analysis. One drawback of these methods is that they require researchers to provide the values of some population parameters, which are generally unknown in advance. Using supposed population values that differ from the true values will therefore result in wrong sample size calculations, and incorrect sample sizes in turn lead to incorrect inferences or decisions. Another drawback of these traditional methods is that they assume a distribution from which the data are sampled. There is no reason to assume that data will always follow a particular distribution, say the normal, in every situation. To overcome these restrictive assumptions, multi-stage procedures, which have been around for more than half a century, can be used. We therefore develop multi-stage sampling procedures for constructing sufficiently narrow confidence intervals for parameters with a pre-specified confidence level and a pre-specified upper bound on the width of the confidence interval.
We do this for a general class of effect sizes, different types of correlation measures, and the Gini index. Our methods require no knowledge of population parameters or of the distribution from which the data are sampled; in other words, they work in a distribution-free environment. In our procedure, the sample size needed to obtain a sufficiently narrow confidence interval is not specified a priori. Rather, a stopping rule, which will be defined, determines whether additional samples are needed after a pilot sample is obtained. We provide theorems with their proofs to support our procedures and demonstrate their characteristics with Monte Carlo simulations. In the case of the Gini index, we also provide an application to the 64th National Sample Survey in India.
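
To make the idea of a stopping rule concrete, the following is a minimal sketch of a purely sequential, fixed-width confidence interval for a population mean. It is not the procedure developed in this work, which targets effect sizes, correlation measures, and the Gini index. The function name, the `draw` callable, the 1/n correction term, and the default pilot size are all illustrative assumptions.

```python
import random
from statistics import NormalDist, variance


def sequential_fixed_width_ci(draw, omega, alpha=0.05, pilot_size=10, max_n=100_000):
    """Sample one observation at a time until the CI for the mean is narrow enough.

    draw       -- zero-argument callable returning one observation (hypothetical interface)
    omega      -- pre-specified upper bound on the confidence interval width
    alpha      -- 1 - alpha is the nominal confidence level
    pilot_size -- size of the pilot sample (assumed default)
    max_n      -- safety cap so the loop always terminates (assumption)
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)      # standard normal quantile
    data = [draw() for _ in range(pilot_size)]   # pilot sample

    while len(data) < max_n:
        n = len(data)
        s2 = variance(data)                      # sample variance, no population value needed
        # Stopping rule: stop at the first n with n >= (2z/omega)^2 * (s^2 + 1/n).
        # The 1/n term guards against stopping too early when s^2 is near zero.
        if n >= (2 * z / omega) ** 2 * (s2 + 1 / n):
            break
        data.append(draw())                      # otherwise take one more observation

    n = len(data)
    mean = sum(data) / n
    half_width = z * (variance(data) / n) ** 0.5
    return n, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    rng = random.Random(0)
    # Skewed (exponential) data: no normality assumption is used by the rule itself.
    n, (lo, hi) = sequential_fixed_width_ci(lambda: rng.expovariate(1.0), omega=0.2)
    print(n, lo, hi)
```

The final sample size here is a random quantity decided by the data rather than fixed in advance. Whenever the rule (not the safety cap) ends the sampling, the resulting interval is no wider than `omega` by construction.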



Sequential analysis, Parameter estimation, Sampling (Statistics), Effect sizes (Statistics), Correlation (Statistics), Gini coefficient, Confidence intervals


©2018 The Author. Digital access to this material is made possible by the Eugene McDermott Library. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.