Probability Sampling - A Guideline for Quantitative Health Care Research

Adwok J

Nairobi Hospital

Correspondence to: Prof. John Adwok, P.O Box 21274-00505, Nairobi, Kenya. Email:



This essay discusses the factors researchers consider when developing a sampling plan, including the sample frame, sampling unit, sample size, target population, precision, and stratification. Two probability sampling methods, simple random sampling and systematic sampling, are defined and compared on their utility for sampling populations. The usefulness of sampling as applied in a quantitative survey study is illustrated by evaluating an article using the characteristics of comprehensiveness, probability of selection, and efficiency.

Keywords: Probability Sampling, Quantitative Research, Sample Size.

Ann Afr Surg. 2015; 12(2): 95-99.


Major authors on social research methods have defined sampling in various ways. It has been defined as “the process of selecting a smaller group of participants to tell us essentially what a larger population might tell us if we asked every member of the larger population the same questions” (1). A more direct definition is the method used for selecting a given number of people (or things) from a population (2). The researcher's main concern is to draw inferences about a large population from a subset of that population. Therefore, the researcher must ascertain that the sample truly represents the population by using sample selection strategies that address bias and possible distortion of data (3).

A sample's success in representing a population depends on how well the sample frame corresponds to the description of the chosen population, whether the sampling procedure gives each person a known chance of selection, and how the procedure influences the precision of sample estimates. When these conditions are met, the research results can be used to make generalizations about the entire population (3). The use of a probability sampling procedure offers each member of a population or the sample frame an equal chance of being selected and improves external validity.

Probability Sampling

Probability sampling specifies to the researcher that each segment of a known population will be represented in the sample. Probability samples lend themselves to rigorous analysis to determine the likelihood and possibility of bias and error (2). Random selection is the process of choosing the components of a sample that ensures each member of a population stands the same chance of selection (3,4). The characteristics of the sample are assumed to be similar to the characteristics of the total population it is drawn from. The initial step in choosing a sample, therefore, is to define the sample frame.

Sample Frame

The sample frame represents those individuals who have a chance to be included among those selected in a sample selection procedure (4). Examples of sampling frames include (a) learners enrolled in a graduate school, (b) a city phone directory, and (c) members of a golfing club. It is desirable to have a complete and updated sample frame list that conforms to the target population of the study. Population validity is said to be established when the accessible population represents the target population.

The sample frame should enable the calculation of an individual’s probability of selection and include a high proportion of the members of the target population (2,4). Once the sample frame and sample size have been determined, the researcher proceeds to select the sample randomly from the frame. There are various methods of random selection, including the use of a table of random numbers, a lottery procedure that draws from well-mixed numbers, and computer programs that determine a random selection of sampling units.


Simple Random Sampling

Simple random sampling has been defined as “a type of probability sampling in which the units composing a population are assigned numbers. A set of random numbers is then generated, and the units having those numbers are included in the sample” (5). For example, if a simple random sample of 100 individuals is required from a sample frame of 8,500 individuals (listed from 1 to 8,500), a straightforward selection could be made using a computer, a table of random numbers, or some other generator of random numbers to produce 100 different numbers within that range (4). A simpler but more tedious way of selecting a sample randomly is to put all the names or numbers in a hat and draw the sample that way. Although the process is simple, simple random sampling is not commonly used by researchers, and there are concerns about its accuracy. A major risk of random sampling is that some individuals with characteristics important to the study are left out. Such a situation could arise as a result of undersampling or because certain individuals are not available during sample selection and are therefore excluded (1). To mitigate this, systematic sampling may be used.
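The selection of 100 numbers from a frame of 8,500 described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name simple_random_sample and the fixed seed are assumptions made for the example, not part of any cited procedure.

```python
import random


def simple_random_sample(frame_size, sample_size, seed=None):
    """Draw `sample_size` distinct unit numbers from 1..frame_size.

    Hypothetical helper for illustration; the seed only makes the
    draw repeatable and would normally be omitted.
    """
    rng = random.Random(seed)
    # random.sample draws without replacement, so no unit is picked twice
    drawn = rng.sample(range(1, frame_size + 1), sample_size)
    return sorted(drawn)


# 100 individuals from a frame numbered 1 to 8,500
sample = simple_random_sample(8500, 100, seed=42)
print(len(sample), "units selected; first five:", sample[:5])
```

Because the draw is made without replacement, each of the 8,500 listed individuals has the same chance of appearing in the sample, and no individual can appear twice.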


Systematic Sampling

Systematic sampling is a type of probability sampling in which units or individuals are selected from a list according to a predetermined sequence. The researcher first determines the number of entries on the list and the desired sample size, then computes the sampling interval (k) by dividing the size of the population by the desired sample size (5). If the researcher wishes to select a sample of 100 individuals from a list of 8,500 individuals, he or she will divide 8,500 by 100 to obtain a sampling interval of 85 (3). The first unit is typically selected at random between 1 and 85 to ensure a chance selection process. Commencing from that randomly selected number, every 85th individual on the list is then selected until the sample of 100 is complete. The attraction of systematic sampling is that the researcher does not need to have a complete list of all the sampling units. Yet, caution is needed when using systematic sampling.
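The interval calculation and random start described above can be sketched as follows. The function name systematic_sample is hypothetical, and the sketch assumes the list size is an exact multiple of the sample size, as it is in the 8,500 and 100 example.

```python
import random


def systematic_sample(frame_size, sample_size, seed=None):
    """Return the 1-based list positions chosen by systematic sampling.

    Hypothetical helper; assumes frame_size is an exact multiple of
    sample_size so that the interval k is a whole number.
    """
    rng = random.Random(seed)
    k = frame_size // sample_size      # sampling interval, 8500 / 100 = 85
    start = rng.randint(1, k)          # random start between 1 and k inclusive
    # take the start position and every k-th position after it
    return [start + i * k for i in range(sample_size)]


units = systematic_sample(8500, 100, seed=7)
print("start:", units[0], "next:", units[1], "last:", units[-1])
```

Only the starting position is random. Once it is fixed, the remaining 99 positions follow from the interval, which is why the ordering of the list matters.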


Although systematic sampling is considered a functional equivalent of simple random sampling and is usually easier to use, researchers need to pay special attention to any ordering of the sample frame by a characteristic or recurring pattern that will affect the sample (1). For example, an organization that lists its employees by ethnic origin could create errors of random selection in a study using systematic sampling, as random starts at different points may not provide the same representation of the employees. Issues raised by listings ordered by some characteristic or with a recurring pattern could be resolved by reordering the list or adjusting the intervals used for the selection of units (4).
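The effect of a recurring pattern can be seen in a small, deliberately artificial sketch. The frame below is hypothetical: it assumes 8,500 employees listed so that every 85th entry is a supervisor, which happens to coincide with the sampling interval. The names frame and take_every_kth are illustrative only.

```python
import random

# Hypothetical frame: positions 85, 170, ..., 8500 hold supervisors
frame = ["supervisor" if pos % 85 == 0 else "staff" for pos in range(1, 8501)]


def take_every_kth(units, k, start):
    """Select every k-th unit, beginning at the 1-based position `start`."""
    return units[start - 1::k]


k = 85
# A start of 85 lands on the recurring pattern: every selected unit is a supervisor
all_supervisors = take_every_kth(frame, k, start=85)
# Any other start, such as 1, never selects a supervisor
no_supervisors = take_every_kth(frame, k, start=1)

# One way of reordering the list: shuffle it before sampling (fixed seed for repeatability)
shuffled = frame[:]
random.Random(0).shuffle(shuffled)
after_shuffle = take_every_kth(shuffled, k, start=85)

print(all_supervisors.count("supervisor"), "supervisors with start 85")
print(no_supervisors.count("supervisor"), "supervisors with start 1")
print(after_shuffle.count("supervisor"), "supervisors after shuffling")
```

In this contrived list the composition of the sample depends entirely on the random start. Shuffling is only one possible form of the reordering mentioned above, and it removes the alignment between the pattern and the interval.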

Sample Size

Controversy still exists over what constitutes the correct sample size for a study. Some researchers disagree with the common practices of deciding sample sizes using specific fractions of the population, tailoring predetermined sample sizes to specific populations, and calculating confidence intervals (4). The size of the target population from which a sample is drawn may not affect how well the sample describes the population. For example, a sample of 150 people will describe a population of 15,000 and a population of 15 million with the same degree of accuracy, assuming the sampling procedures and design match (4). While admitting there are many ways to increase the reliability of survey estimates, it is recommended that researchers analyze a study’s goals as a first step in deciding the sample size (4). These observations have obvious implications for inexperienced researchers planning to conduct a survey-type study. In an effort to help researchers with sample size estimations, statisticians have developed internet-based programs for determining desired sample sizes for populations. The simplicity and ease of access of the online sample calculators have made them popular with researchers. Internet based calculators p