
Module A3 (2018) Survey ANALYST Creator project

Project Overview

Project Description

IMPORTANT: THIS PROJECT IS ONLY FOR SURVEY ANALYSTS.

Your Creator assignment is to draft an analysis plan that contains the following sections and tasks.

  1. Describe data cleaning checks
  2. Describe your plan for weighting
  3. Prepare table shells for five indicators
  4. Calculate the results for one of your tables
  5. Generate a graphical summary of one vaccination coverage indicator for one dose across all 13 strata
  6. Summarize methods
  7. Summarize results
  8. Identify caveats or concerns
  9. Identify strengths and limitations

DO NOT START THIS PROJECT IF YOU ARE A SURVEY MANAGER.

 


Survey ANALYST Creator project

Multi-Indicator Cluster Survey and National Immunization Coverage Survey in Nigeria

The dataset used for this project is from the combined Multi-Indicator Cluster Survey and National Immunization Coverage Survey that was conducted in 2016 and 2017 in two zones of Nigeria.

1. Data Cleaning Checks

Data cleaning is time consuming when it is performed over all variables and all records. Given the importance of this step, adequate time and resources should be allocated (at least one week for this survey). Appropriate statistical software, for example SPSS or Stata, is recommended for data cleaning.

  1. Check all data for duplicate, missing or conflicting values in the various fields using statistical software
  • In particular, ID variables (Stratum ID, Cluster ID, Household ID, Respondent ID) should be checked for uniqueness, completeness and missing data
  • eg: Sort the dataset by the ID variables to identify missing values and duplicates
  2. Check all values for correctness, completeness and consistency using validity checks
  • In particular, check date variables (date of birth, vaccination dates, age at vaccination, etc.). eg: A child's birthdate should not be a nonsensical combination such as February 30, a partial date, or a date recorded differently in different health records (a date recorded by history should be checked against the photograph of the home-based record, if available). Date variables should also be checked against acceptable ranges (eg: the child's age should be within 12-23 months at the time of interview, and the date of interview should fall within the dates on which the interview team visited that cluster)
  • Flag disallowed and questionable responses (implausible or illogical responses) for review. eg: If the record says the respondent showed the vaccination card, there should be at least one tick mark for a vaccine or vaccine dose
  • Check flagged responses against the photograph of the home-based record; if no photograph is available, or in case of a discrepancy, call the respondent to clarify
  3. Correct incorrect responses by re-coding them (amend the dataset to include the correction)
  • Set values that cannot be corrected, and improbable values, to missing (the data manager and survey manager should agree on a consistent policy for errant values that cannot be checked)
  4. Evaluate whether skip patterns were correctly observed
  5. Data cleaning steps should be justified and documented clearly in an annexure, including:
  • number of errant values
  • method used to check
  • how many values were corrected
  • justification for correction
  • how many could not be corrected
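
The ID and date checks above could be sketched as follows. This is a minimal illustration in Python rather than the SPSS or Stata code used for the project; the field names (`stratum_id`, `cluster_id`, `household_id`, `respondent_id`) and the helper functions are hypothetical and would need to be mapped to the actual dataset variables.

```python
from collections import Counter
from datetime import date

# Hypothetical field names; replace with the dataset's real ID variables.
ID_FIELDS = ("stratum_id", "cluster_id", "household_id", "respondent_id")


def find_id_problems(records):
    """Return (records with a missing ID field, ID combinations seen more than once)."""
    missing = [r for r in records if any(r.get(f) in (None, "") for f in ID_FIELDS)]
    counts = Counter(tuple(r.get(f) for f in ID_FIELDS) for r in records)
    duplicates = [key for key, n in counts.items() if n > 1]
    return missing, duplicates


def parse_date(year, month, day):
    """Return a date, or None for impossible or partial dates such as 30 February."""
    try:
        return date(year, month, day)
    except (TypeError, ValueError):
        return None


def age_in_months(birth, interview):
    """Completed months between the birth date and the interview date."""
    months = (interview.year - birth.year) * 12 + (interview.month - birth.month)
    if interview.day < birth.day:
        months -= 1
    return months


def flag_age(birth, interview, low=12, high=23):
    """True when the record should be flagged for review (missing date or age out of range)."""
    if birth is None or interview is None:
        return True
    return not (low <= age_in_months(birth, interview) <= high)
```

Flagged records would still be reviewed against the photographed home-based record before any value is changed; the sketch only identifies candidates for review.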

2. Plan for weighting

  • In a multistage sampling approach, sampling probabilities will likely differ between respondents. The analysis will therefore most likely need to be weighted.
  • A sample weight is a statistical measure of how much each respondent contributes to the estimate for the population from which they were sampled. When a survey calculation is weighted, each person selected for the sample represents a certain number of similar eligible persons in the population.
  • The calculation of sampling weights can be done in 3 steps:
  1. Calculate the design weight
  2. Adjust for nonresponse
  3. Post-stratify to match population totals
  1. Calculate the design/sampling weight
  • Should be done in all vaccination coverage surveys to address the sampling design of the survey.
  • The design weight is the inverse of the probability that the respondent was selected into the survey sample.

Design weight of respondent A = 1 / Probability of selection of respondent A in survey sample (PA)

PA = Stage I Probability * Stage II Probability * Stage III Probability * Stage IV Probability

  • As this is a multistage survey, the probability of selection of respondent A into the survey sample (PA) equals the product of the probabilities of selection at each stage:
  • Stage I probability = probability of selecting respondent A's stratum for the survey from all possible strata of the sampling frame
  • Stage II probability = probability of selecting respondent A's cluster for the survey from the list of all clusters in respondent A's stratum
  • Stage III probability = probability of selecting respondent A's household for the survey from the list of households in respondent A's cluster
  • Stage IV probability = probability of selecting respondent A for the survey from all the eligible respondents of the household
  • In this survey, every eligible respondent should have been interviewed. Therefore stage IV probability = 1
  2. Adjust for nonresponse – Response weight
  • Data may be missing for individual respondents because the respondent was not available, refused, or was not able to participate in the interview. To ensure that the survey results represent the target population, the design weight is adjusted to transfer the sampling weight of the non-respondents to the respondents of the survey.
  • The response weight is the inverse of the response rate
  • To calculate the response weight, the response rate is computed at the stratum level for households and for individuals
  • Response weight of respondent = 1 / Response Rate
  • Response Rate = Stage I (household) response rate * Stage II (individual) response rate
  • Response rate at each stage = number of eligible units with a complete interview / number of eligible units in the stratum
  • Important data for calculating the response rate:
  • How many households were not interviewed despite repeated visits
  • How many eligible respondents did not participate
  • Number of eligible respondents in each household in the survey sample, as identified by an occupant of the household (preferred) or by a neighbour
  3. Post-stratify to match population totals – Post-stratified weights
  • Post-stratified weights are adjusted so that the sum of weights in each stratum is proportional to the known eligible population, provided those population totals are known to be accurate. Estimates can then be pooled across strata to calculate a national coverage estimate. Post-stratification is also applicable when the population of a stratum of interest has been oversampled, relative to its share of the overall population, in order to obtain precise coverage estimates for that stratum.
  • In this survey, however, the data were collected by a Multiple Indicator Cluster Survey (MICS) team and the outcome of every selected household was recorded using tablets, so the quality of the household listing should be high. The survey data therefore provide a good estimate of the relative proportion of respondents across strata, and post-stratification of the weights is not needed.
  • The final weights should therefore be calculated as:

 Final weights = Design weights * Response Weights
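
As a rough sketch of the weighting steps above, the design weight, response weight and final weight for one respondent could be computed as below. The function names and the example probabilities and response counts are illustrative assumptions, not values from the survey.

```python
def design_weight(stage_probabilities):
    """Inverse of the product of the selection probabilities at each sampling stage."""
    p = 1.0
    for q in stage_probabilities:
        if not 0 < q <= 1:
            raise ValueError("each selection probability must be in (0, 1]")
        p *= q
    return 1.0 / p


def response_rate(completed, eligible):
    """Eligible units with a complete interview divided by all eligible units in the stratum."""
    if eligible <= 0 or not 0 < completed <= eligible:
        raise ValueError("need 0 < completed <= eligible")
    return completed / eligible


def final_weight(stage_probabilities, household_rr, individual_rr):
    """Final weight = design weight * response weight (no post-stratification)."""
    response_weight = 1.0 / (household_rr * individual_rr)
    return design_weight(stage_probabilities) * response_weight


# Made-up example: stratum always included, 1 in 10 clusters, 1 in 4 households,
# every eligible child interviewed (stage IV probability = 1).
example_probs = (1.0, 0.1, 0.25, 1.0)
example_weight = final_weight(
    example_probs,
    household_rr=response_rate(90, 100),
    individual_rr=response_rate(95, 100),
)
```

In practice the response rates are computed once per stratum and applied to every respondent in that stratum, so the same adjustment factor is shared within a stratum.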

 

3. Table shells for five indicators

Table Shells

a) Crude Coverage

  • Table 1: Crude vaccination coverage among children aged 12–23 months, disaggregated by state of the North East Zone of Nigeria and sex of the child - attached as a PDF file
Crude Penta3 Coverage - North East
  • Table 2: Crude vaccination coverage among children aged 12–23 months, disaggregated by state of the South South Zone of Nigeria and sex of the child - attached as a PDF file
  • Interpretation: “<coverage>% of the population who were eligible for the survey are estimated to have received <vaccine/dose>, as documented by home-based record or caregiver's recall.”
  • Denominator: Sum of weights for all respondents
  • Numerator: Sum of weights for respondents who received the vaccine dose per home-based record or recall
  • For each vaccine, the point estimate and 95% confidence interval will be computed for all children, male children and female children in each state of the two zones
  • Weighted indicator
  • Reference: VCQI results Interpretation Quick reference Guide
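
The weighted numerator and denominator described above can be illustrated with a short sketch. The record layout and the field names `weight` and `got_penta3` are hypothetical; confidence intervals are not covered here because they require survey-design-aware variance estimation.

```python
def weighted_coverage(records, flag="got_penta3", weight="weight"):
    """Weighted coverage (%) = sum of weights of vaccinated / sum of all weights."""
    total = sum(r[weight] for r in records)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    vaccinated = sum(r[weight] for r in records if r[flag])
    return 100.0 * vaccinated / total


# Made-up records: the unweighted proportion is 50%, the weighted one is 40%.
sample = [
    {"weight": 10, "got_penta3": True},
    {"weight": 20, "got_penta3": False},
    {"weight": 30, "got_penta3": True},
    {"weight": 40, "got_penta3": False},
]
coverage = weighted_coverage(sample)  # 40.0
```

The same function can be applied to subsets of records (one state, or male and female children separately) to fill each row of the table shell.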

b) Dropout

  • Table 3: Dropout rates between different vaccine-dose combinations, by the states of North East Zone of Nigeria - attached as a PDF file
Dropout rate - North East
  • Table 4: Dropout rates between different vaccine-dose combinations, by the states of South South Zone of Nigeria - attached as a PDF file
Dropout rates - South South
  • Denominator: Number of respondents who received the first dose and were age-eligible to receive the second dose before the survey
  • Numerator: Number of respondents who received the first dose and who were eligible but did not receive the second dose
  • Interpretation: “Among the <N> children who showed evidence of having received <earlier dose> (per card or recall) and who were age-eligible to have received <later dose>, <dropout>% did not show evidence of receiving <later dose>.”
  • Dropout is the unweighted average of the indicator variable
  • Reference: VCQI results Interpretation Quick reference Guide
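
A minimal sketch of the unweighted dropout calculation described above follows. The field names (`got_penta1`, `got_penta3`, `eligible_penta3`) are hypothetical placeholders for the corresponding derived variables.

```python
def dropout_rate(records, earlier, later, eligible):
    """Return (N, dropout %) among children who received the earlier dose and
    were age-eligible for the later dose; the rate is None when N is 0."""
    denominator = [r for r in records if r[earlier] and r[eligible]]
    n = len(denominator)
    if n == 0:
        return 0, None
    dropped = sum(1 for r in denominator if not r[later])
    return n, 100.0 * dropped / n


# Made-up records: 4 children are in the denominator, 1 of them dropped out.
children = [
    {"got_penta1": True, "got_penta3": True, "eligible_penta3": True},
    {"got_penta1": True, "got_penta3": True, "eligible_penta3": True},
    {"got_penta1": True, "got_penta3": False, "eligible_penta3": True},
    {"got_penta1": True, "got_penta3": True, "eligible_penta3": True},
    {"got_penta1": False, "got_penta3": False, "eligible_penta3": True},
]
n_children, dropout = dropout_rate(children, "got_penta1", "got_penta3", "eligible_penta3")
```

Returning N alongside the rate matches the interpretation sentence, which reports both the number of children in the denominator and the dropout percentage.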

c) Valid Coverage

  • Table 5: Valid vaccination coverage among children aged 12–23 months, disaggregated by state of the North East Zone of Nigeria and sex - attached as a PDF file
Valid Vaccination Coverage - North East
  • Table 6: Valid vaccination coverage among children aged 12–23 months, disaggregated by state of the South South Zone of Nigeria and sex - attached as a PDF file
Valid Vaccination Coverage_South South
  • Denominator: Sum of weights for all respondents
  • Numerator: Sum of weights for respondents who received a valid dose per home-based record
  • Interpretation: “<coverage>% of the population who were eligible for the survey are estimated to have a documented record of vaccinations (home-based record) and to have received a valid dose of Pentavalent 3.”
  • Weighted indicator
  • Reference: VCQI results Interpretation Quick reference Guide

Definition of a “valid dose”

  • The child had reached the minimum age of eligibility for this dose.
  • If the schedule specifies a maximum age of eligibility, then the child was within the allowable age range when they received the dose.
  • If the dose is number 2 or 3 (or higher) in a sequence, then the minimum interval had passed since receiving the earlier dose, so the child was eligible to receive the next dose.
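
The three conditions for a valid dose could be checked along these lines. The function and its parameters are hypothetical, and the minimum ages and intervals used in the example (42 days, 70 days, 28 days) are illustrative assumptions rather than the schedule applied in the survey.

```python
from datetime import date


def is_valid_dose(birth, dose_date, min_age_days, max_age_days=None,
                  previous_dose=None, min_interval_days=None):
    """Check minimum age, optional maximum age, and optional minimum interval
    since the earlier dose in the sequence."""
    age_days = (dose_date - birth).days
    if age_days < min_age_days:
        return False
    if max_age_days is not None and age_days > max_age_days:
        return False
    if previous_dose is not None and min_interval_days is not None:
        if (dose_date - previous_dose).days < min_interval_days:
            return False
    return True


# Illustrative example with assumed schedule values.
birth = date(2016, 1, 1)
dose1 = date(2016, 2, 12)  # child is 42 days old
dose2 = date(2016, 3, 5)   # only 22 days after dose 1
dose1_ok = is_valid_dose(birth, dose1, min_age_days=42)
dose2_ok = is_valid_dose(birth, dose2, min_age_days=70,
                         previous_dose=dose1, min_interval_days=28)
```

Because the check relies on dates, it can only be applied to doses documented on a home-based record, which is consistent with the numerator definition for valid coverage.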

d) Card availability

  • Table 7: Card availability among children aged 12–23 months, disaggregated by state of the North East Zone of Nigeria and sex - attached as a PDF file
Card Availability- North East
  • Table 8: Card availability among children aged 12–23 months, disaggregated by state of the South South Zone of Nigeria and sex - attached as a PDF file
Card Availability_South South
  • Denominator: Sum of weights for all respondents
  • Numerator: Sum of weights for respondents who showed a home-based record with 1+ vaccination dates on it
  • Interpretation: “<coverage>% of the population who were eligible for the survey are estimated to have a home-based record (card) with one or more vaccination dates on it.”
  • Weighted indicator
  • Reference: VCQI results Interpretation Quick reference Guide

e) Fully vaccinated

  • Table 9: Fully vaccinated children among those aged 12–23 months, disaggregated by state of the North East Zone of Nigeria and sex
  • Table 10: Fully vaccinated children among those aged 12–23 months, disaggregated by state of the South South Zone of Nigeria and sex
  • Denominator: Sum of weights for all respondents
  • Numerator: Sum of weights for respondents who received all the doses in the list that makes up “fully vaccinated”
  • Interpretation: “<coverage>% of the population who were eligible for the survey are estimated to be fully vaccinated, with <either crude or valid doses>, having received <list of doses to be fully vaccinated>.”
  • Weighted indicator
  • Reference: VCQI results Interpretation Quick reference Guide
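
The “fully vaccinated” indicator combines several dose flags before weighting. A small sketch is shown below; the dose names in `REQUIRED_DOSES` and the field names are placeholders, since the actual list of doses is defined by the survey specification.

```python
# Placeholder list; the real list of doses comes from the survey specification.
REQUIRED_DOSES = ("bcg", "penta3", "measles1")


def fully_vaccinated_coverage(records, required_doses=REQUIRED_DOSES, weight="weight"):
    """Weighted % of respondents who received every dose in required_doses."""
    total = sum(r[weight] for r in records)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    full = sum(
        r[weight] for r in records
        if all(r.get(dose, False) for dose in required_doses)
    )
    return 100.0 * full / total


# Made-up records: only the first child has all three doses.
kids = [
    {"weight": 10, "bcg": True, "penta3": True, "measles1": True},
    {"weight": 30, "bcg": True, "penta3": True, "measles1": False},
]
fully = fully_vaccinated_coverage(kids)
```

A missing dose flag is treated as not received in this sketch; whether that is appropriate depends on the cleaning policy for missing values.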

Design Effect (DEFF)

  • The design effect is the ratio of the variance achieved with the complex sample to the variance that would have been observed with a simple random sample of the same size.
  • The effective sample size is calculated by dividing the actual sample size by the design effect.
  • This is the number of respondents that would have to be enrolled in a simple random sample to achieve the same variance, or the same confidence interval width, as was achieved with the complex sample.
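
The two definitions above amount to simple ratios, sketched here with made-up numbers. The function names and example values are illustrative only; in practice the variances come from survey-aware estimation software.

```python
def design_effect(variance_complex, variance_srs):
    """DEFF = variance under the complex design / variance under simple random sampling."""
    if variance_srs <= 0:
        raise ValueError("simple random sample variance must be positive")
    return variance_complex / variance_srs


def effective_sample_size(n, deff):
    """Actual sample size divided by the design effect."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return n / deff


# Made-up example: the complex design doubles the variance, so 600 respondents
# carry the information of about 300 from a simple random sample.
deff = design_effect(0.0008, 0.0004)
n_eff = effective_sample_size(600, deff)
```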

Intracluster Correlation Coefficient (ICC)
The ICC measures the correlation of the outcome within clusters in the sample.

4. Results for the crude vaccination coverage table

  • Results were calculated for crude Pentavalent 3 coverage among male and female children aged 12-23 months in the states of the North East and South South zones using SPSS statistical software.
  • The table with results is attached as a PDF document, and the syntax of the program that calculates the results is attached as a text file.
Syntax
Penta3 Coverage_North East
Penta3 Coverage_South South

 

5. Graphical summary of crude pentavalent 3 coverage

The graph was created using the descriptive statistics (FREQUENCIES) procedure in SPSS:

FREQUENCIES VARIABLES=Crude_penta3_coverage_states level2name
/NTILES=10
/PERCENTILES=100.0
/BARCHART FREQ
/ORDER=ANALYSIS.

 

Crude Pentavalent 3 coverage among children aged 12-23 months in the states of the North East and South South Zones

 

6. Methods summary

The dataset used for the Creator project is from the combined Multi-Indicator Cluster Survey and National Immunization Coverage Survey, which was conducted in 2016 and 2017 in Nigeria. The dataset includes all responses from the six states in the North East zone and the six states in the South South zone.

North East Zone    South South Zone
Adamawa            Akwa Ibom
Bauchi             Bayelsa
Borno              Cross River
Gombe              Delta
Taraba             Edo
Yobe               Rivers

Data were collected using tablets, and interviewers took photographs of the children's home-based records. The interviewers did not, however, collect data from health facilities.

The dataset includes the survey data and the derived variables for a large number of coverage indicators, as described in the “VCQI Indicator List with Specifications - v1.9”.

Data cleaning and weighting were done as described above.

Crude coverage for Pentavalent 3 was calculated using SPSS statistical software for all states and for the North East and South South Zones of Nigeria. Crude coverage for Pentavalent 3 was also calculated separately for female and male children in each of these states and zones.

Penta 3 crude coverage = (Sum of weights for respondents who received the third dose of Pentavalent per home-based record or recall / Sum of weights for all respondents) * 100%

 

7. Results summary

  1. Penta 3 coverage is higher in the states of the South South zone than in the states of the North East zone.
  2. Edo is the state with the highest Penta 3 coverage among the 12 states, while Yobe has the lowest.
  3. Penta 3 coverage is higher among female children aged 12-23 months than among male children of the same age in both zones.
  4. The Penta 3 coverage estimates are based on more than 49 respondents in every state.

8. Caveats or concerns

  1. The Steering Committee should investigate the reasons for low Penta 3 coverage among children in the North East zone, eg: geographical limitations, civil war.
  2. The Steering Committee should take the necessary actions to improve vaccination coverage in the North East zone according to the reasons identified, eg: conducting a special surveillance programme in areas with low coverage such as Yobe state.
    Special attention should be paid to those who face barriers to vaccination, and the causes of non-vaccination and dropout should be identified.
  3. The Steering Committee should take the necessary actions to increase awareness of vaccination among the general public and to improve the quality of health care services.
  4. There is a discrepancy in Penta 3 coverage between male and female children. The Steering Committee should find out the reason for it.
  5. Attention should be paid to collecting data from the records and registers at health care institutions.
  6. Data quality and survey methods should continue to be improved.

9. Strengths

  1. The Penta 3 results clearly show the difference in vaccination coverage between the two zones, which suggests that there is an underlying cause for the low coverage in the North East zone.

Strengths of the survey method

  1. Tablets were used for data collection instead of a manual paper-based system. Validation checks can therefore be applied at the time of data collection, which enhances the quality of the data and the efficiency of the data entry process.
  2. Better data quality at the point of data entry reduces the time and resources needed for data cleaning.
  3. Interviewers took photographs of the home-based records. These photographs can be used to re-check the correctness of the data already entered.
  4. The data were collected by a Multiple Indicator Cluster Survey (MICS) team, so the quality of the household listing should be high.
  5. The availability of the VCQI forms and indicator list further improves the quality of the data and the efficiency of the entire process by standardizing the approach.
  6. Because the outcome for every selected household is known, along with the selection probabilities of the clusters and households, and because every eligible respondent should have been interviewed, there are good data for calculating design weights and for adjusting for non-response at the household level.

10. Limitations

  1. The data were collected only from home-based records and caregivers' recall. The accuracy of the coverage estimates would be improved by adding data from the records and registers of health facilities.
  2. Even though photographs of the home-based records are available, illegible handwriting, poor image quality and differing record formats may make re-checking difficult.