Sampling is the statistical procedure of selecting a small number of elements from a population and using them to draw conclusions about that population.
A population (also called a universe) is the total collection of all the population elements, each of which is a potential case.
All students in a college, for example, constitute a population of interest, and each student in the college questioned about his/her age, height, weight, or opinion on any issue is a population element.
A census is an investigation or a count of all the population elements. Any part of the population is a sample. If a sample is selected according to the rules of probability, it is a probability sample or random sample.
If a sample is random, then it is possible to calculate how representative the sample is of the wider population from which the sample was drawn. The counterpart of the probability sample is the so-called non-probability sample.
Non-probability sampling is a non-random and subjective method of sampling where the selection of the units depends on the personal judgment of the sampler.
A survey is a general term that refers to the collection of data by means of interviews, questionnaires, or observations.
A sample survey is a study involving a subset (or sample) of individuals selected from a larger population by accepted statistical methods.
A distinction is sometimes made between the target population and the sampled population. In research, the target population is the entire set of units for which the survey data is to be used to draw conclusions and make inferences.
It can also be defined as the eligible population that is included in research work. It is also called the survey population.
Ideally, the target population and the sampled population (the population from which the sample is actually drawn) will be the same, but for practical reasons, there will usually be differences between them.
One reason for the difference is that some members of the target population are not covered by the survey, and so no information is obtained for them.
A population may be finite or infinite, depending on whether the number of sampling units it contains is finite or infinite.
By definition, a finite population contains a finite number of sampling units, for example, all registered voters in a particular city in a given year or all customers who visited the city store in May 2006.
An infinite population consists of an endless number of sampling units, such as the possible numbers of coin tosses required until a head appears. Sampling designed to produce information about particular characteristics of a finite population is usually called survey sampling.
A sampling unit or simply a unit is a well-defined, distinct, and identifiable element or group of elements on which observation is made.
In some studies, an individual in a household may be a sampling unit, while in another study, the household may be a sampling unit. A sampling frame is a list of units or groups of units in the population to be sampled.
Sample size refers to the number of units contained in a sample, while population size is the number of units that constitute the population.
The population characteristics about which the inferences are made are called parameters.
For a given sample design, an estimator is a method or formula for estimating the value of the parameter.
An estimate is the numerical value of the estimator obtained from the sample. Bias is a term that refers to how far the average value of the estimator lies from the parameter.
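In symbols, the definition of bias can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the text above: θ stands for the parameter, θ̂ for its estimator, and E(θ̂) for the average value of the estimator over all possible samples under the given sample design.

```latex
\operatorname{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta
```

An estimator is called unbiased when this difference is zero.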
Sample design refers to the plans and methods to be followed in selecting a sample from the target population, together with the estimation technique, that is, the formulas for computing the sample statistics.
These statistics are the estimates used to infer the population parameters.
The sample design also implicitly covers such issues as the choice of the sampling frame, determination of the sample size, estimation of the reliability of the estimates, the stratification procedure, the sample allocation method, and clustering of the sample.
Survey design is the process of preparing a complete plan of operations to be followed in conducting a survey and disseminating its intended results.
We will elaborate on the concept of designing a survey in a separate section.
We, however, emphasize that the survey objectives covered under survey design determine the sample design, and in practice, the sample design must be developed as an integral part of the overall survey design.
Survey design and sample design are thus two interrelated concepts, each complementing the other.
A sample design should almost always be evaluated against explicit criteria; a good design is expected to meet, among others, the criteria of accuracy, reliability, validity, and efficiency.
Importance of Sampling
Samples are taken to provide statistical data on an extensive range of subjects for both research and administrative purposes.
The following examples are designed to illustrate the importance of sampling in real life:
- In an opinion poll, a relatively small number of persons are interviewed, and their opinions on current issues are solicited in order to discover the attitude of the community as a whole.
- Marketing and advertising agencies conduct countless inquiries to determine customers’ expectations, attitudes, buying habits, or shopping patterns. This information is useful to the manufacturers of goods for sales promotion. Since it is impossible to procure this information from every customer, it is obtained by interviewing a part of the customers.
- Large lots of manufactured products are accepted or rejected by purchasing departments in business or government following inspection of a relatively small number of items drawn from these lots.
- At border stations, customs officers enforce the laws by checking the effects of only a small number of travelers crossing the border.
- A department store wishing to examine whether it is losing or gaining customers may draw a sample from its list of credit card holders by selecting every tenth name.
- Auditors often judge the extent to which the proper accounting procedures have been followed by examining a small number of transactions, selected from a large number of such transactions taking place within a specified period of time.
- The Ministry of Health and Family Welfare might want to assess how much the adult population of Dhaka city knows about the dangers of environmental pollution by interviewing a few selected adults of the city.
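The department store example in the list above, selecting every tenth name from a list, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of card holders is made up, the function name `systematic_sample` is hypothetical, and the random starting point within the first ten names is an added assumption that the text does not mention.

```python
import random


def systematic_sample(names, step=10, seed=None):
    """Select every `step`-th name, starting at a random position
    within the first `step` names (the random start is an assumption)."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # an index between 0 and step - 1
    return names[start::step]


# Hypothetical list of 100 credit card holders.
holders = [f"holder_{i:03d}" for i in range(1, 101)]

sample = systematic_sample(holders, step=10, seed=1)
print(len(sample), sample[:3])
```

With 100 names and a step of 10, the sample always contains 10 names, whatever the starting point.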
Countless measurements of the economy, health, labor force, contraceptive use, immunization, unemployment, income, export, import, industrial products, and the like rely on samples, rather than on complete enumeration.
Numerous surveys are being conducted to develop, test, and refine hypotheses in such disciplines as sociology, psychology, demography, political science, anthropology, geography, economics, education, and public health.
Both local and central governments make considerable use of survey data to be aware of the various population characteristics for planning and development purposes.
In every case, a sample is selected because it is impossible, inconvenient, slow, or uneconomical to monitor the entire population.
It is now widely agreed that a sample survey is a popular and scientific method of data collection.
Sampling With and Without Replacement
A sample may be drawn with replacement (SWR) or without replacement (SWOR).
In sampling with replacement from a population, finite or infinite, each unit drawn is returned to the population before the next draw. The number of units available for later draws is therefore unaffected, and the probability of drawing any given unit remains the same at every selection.
In sampling without replacement, a unit once drawn is not returned to the population for subsequent draws. As a result, the probability of drawing any one of the remaining units increases with each successive selection; with N units in the population, for example, it rises from 1/N at the first draw to 1/(N-1) at the second.
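The contrast between the two schemes can be sketched with Python's standard library, where `random.choices` draws with replacement and `random.sample` draws without replacement. The population of five units and the helper `draw_probabilities` are hypothetical, introduced only for illustration.

```python
import random
from fractions import Fraction

population = [1, 2, 3, 4, 5]  # hypothetical population of N = 5 units
rng = random.Random(0)

# With replacement: the same unit may be selected more than once.
swr = rng.choices(population, k=3)

# Without replacement: all selected units are distinct.
swor = rng.sample(population, k=3)


def draw_probabilities(N, n, replace):
    """Probability of selecting one specific still-available unit
    at each of n successive draws from N units."""
    return [Fraction(1, N if replace else N - i) for i in range(n)]


print(swr, swor)
print(draw_probabilities(5, 3, replace=True))   # the same 1/N at every draw
print(draw_probabilities(5, 3, replace=False))  # 1/N, then 1/(N-1), then 1/(N-2)
```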
Sampling with replacement is sometimes referred to as unrestricted sampling. In general, for the same sample size, sampling with replacement yields less precise estimates than sampling without replacement.
Intuitively, sampling with replacement seems rather wasteful.
In practice, almost all sampling is done without replacement, since there is little justification for studying again the characteristics of units that have already been selected.
Sampling with replacement is primarily of theoretical interest, since the formulas for the variance and the estimated variance of the estimators are often simpler when sampling is done with replacement than when it is done without replacement.
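As an illustration of this point, consider the mean of a simple random sample. The notation is assumed here and does not appear in the text above: the population has N units with values y_1, ..., y_N, population mean Ȳ, and variance σ² = (1/N) Σ (y_i − Ȳ)², and ȳ is the mean of a sample of n units. The standard results for the two schemes are:

```latex
\operatorname{Var}_{\text{SWR}}(\bar{y}) = \frac{\sigma^{2}}{n},
\qquad
\operatorname{Var}_{\text{SWOR}}(\bar{y}) = \frac{\sigma^{2}}{n}\cdot\frac{N-n}{N-1}
```

The extra factor (N − n)/(N − 1) is less than 1 whenever n > 1, so in this setting the without-replacement mean has the smaller variance, while the with-replacement formula is the simpler of the two.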
Suppose we have 4 members in a family to whom we assign serial numbers 1, 2, 3, 4.
We need to select two of them for an interview. In this instance, we have a population of size 4 (i.e., N = 4) from which a sample of size 2 (i.e., n = 2) is to be selected.
If these two members are selected without replacement, there are altogether 6 possible samples, each of 2 members. The accompanying table displays all possible samples of size 2:
|Table: Samples of Size 2 without Replacement|
|Sample #||Samples consisting of serial numbers|
|1||(1, 2)|
|2||(1, 3)|
|3||(1, 4)|
|4||(2, 3)|
|5||(2, 4)|
|6||(3, 4)|
In practice, we deal with only one sample, which might be any one of the above cases.
If sample number 3 happens to be selected, we will seek information from member 1 and member 4 to meet our survey objectives.
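The six samples of the family example can also be enumerated programmatically. The sketch below uses `itertools.combinations` and assumes that samples are numbered in lexicographic order; under that assumption sample number 3 consists of members 1 and 4, which agrees with the text.

```python
from itertools import combinations

members = [1, 2, 3, 4]  # serial numbers of the family members (N = 4)
n = 2                   # sample size

# Unordered samples without replacement, in lexicographic order.
samples = list(combinations(members, n))

for number, sample in enumerate(samples, start=1):
    print(number, sample)
```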
Consider again the example above. If sampling is done with replacement, there will be 16 possible samples, each of size 2. Table 5.2 shows these samples.
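|Table: Samples of Size 2 with Replacement|
|Sample #||Serial numbers||Sample #||Serial numbers|
|1||(1, 1)||9||(3, 1)|
|2||(1, 2)||10||(3, 2)|
|3||(1, 3)||11||(3, 3)|
|4||(1, 4)||12||(3, 4)|
|5||(2, 1)||13||(4, 1)|
|6||(2, 2)||14||(4, 2)|
|7||(2, 3)||15||(4, 3)|
|8||(2, 4)||16||(4, 4)|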
Note that sample 3 and sample 9, for example, contain the same two members and differ only in the order of selection. If we select sample 3, we obtain no extra information by also selecting sample 9.
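The with-replacement case can be sketched in the same way with `itertools.product`. The numbering is again assumed to be lexicographic, which is consistent with the rows of the table above. Ignoring the order of selection shows how many of the 16 ordered samples are genuinely different.

```python
from itertools import product

members = [1, 2, 3, 4]  # serial numbers of the family members

# Ordered samples of size 2 drawn with replacement.
ordered = list(product(members, repeat=2))

# Treat samples with the same members as one, whatever the order of selection.
distinct = {tuple(sorted(s)) for s in ordered}

print(len(ordered), len(distinct))
print(ordered[2], ordered[8])  # samples 3 and 9 under this numbering
```

Under this numbering the 16 ordered samples reduce to 10 distinct sets of members, and samples 3 and 9 are one such repeated pair.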