IMPORTANT: THIS PROJECT IS ONLY FOR SURVEY ANALYSTS.
Your Creator assignment is to draft an analysis plan that contains the following sections and tasks.
DO NOT START THIS PROJECT IF YOU ARE A SURVEY MANAGER.
The dataset used for this project is from the combined Multiple Indicator Cluster Survey and National Immunization Coverage Survey that was conducted in 2016 and 2017 in two zones of Nigeria.
Data cleaning is time-consuming when it covers all variables and all records. Given the importance of this step, adequate time and resources should be allocated (at least one week for this survey). Appropriate statistical software, e.g. SPSS or STATA, is recommended for data cleaning.
- ID variables in particular (Stratum ID, Cluster ID, Household ID, Respondent ID) should be checked for uniqueness, completeness, and missing data.
- e.g. Sort the dataset by the ID variables to identify missing values and duplicates.
- Date variables in particular (date of birth, vaccination dates, age at vaccination, etc.) should be checked. e.g. The child’s birthdate should not be a nonsensical combination of numbers such as February 30, a partial date, or recorded differently in different health records (a date recorded by history should be checked against the photograph of the home-based record, if available). Date variables should also be checked for an acceptable range (e.g. the child's age should be within 12-23 months at the time of interview, and the date of interview should fall within the dates on which the interview team visited the particular cluster).
- Flag disallowed and questionable responses (implausible or illogical responses) for review. e.g. If the record says the respondent showed the vaccination card, there should be at least one tick mark on a vaccine or vaccine dose.
- Check the flagged responses against the photograph of the home-based record; if it is not available, or in case of discrepancy, call the respondent to clarify.
- Change uncorrected or improbable values to missing (the data manager and survey manager decide on a consistent policy regarding errant values that cannot be checked).
- number of errant values
- method used to check
- how many values corrected
- justification for correction
- how many couldn’t be corrected
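The ID and date checks above can be sketched in a few lines of code. This is only an illustration in Python, not the SPSS or STATA procedure the plan recommends; the field names (`stratum_id`, `cluster_id`, `household_id`, `respondent_id`) and the function names are hypothetical and do not come from the survey dataset.

```python
from collections import Counter
from datetime import date

# Hypothetical column names for the four ID variables.
ID_FIELDS = ("stratum_id", "cluster_id", "household_id", "respondent_id")

def check_ids(records):
    """Return (records with a missing ID part, ID combinations that occur more than once)."""
    missing = [r for r in records if any(r.get(f) in (None, "") for f in ID_FIELDS)]
    counts = Counter(tuple(r.get(f) for f in ID_FIELDS) for r in records)
    duplicates = [key for key, n in counts.items() if n > 1]
    return missing, duplicates

def parse_date(year, month, day):
    """Return a date, or None for an impossible or partial date such as February 30."""
    try:
        return date(year, month, day)
    except (ValueError, TypeError):
        return None

def age_in_months(birth, interview):
    """Completed months between the birthdate and the interview date."""
    months = (interview.year - birth.year) * 12 + (interview.month - birth.month)
    if interview.day < birth.day:
        months -= 1
    return months

def flag_age(birth, interview, low=12, high=23):
    """True when the child's age is outside the eligible 12-23 month range."""
    return not (low <= age_in_months(birth, interview) <= high)
```

Records returned by `check_ids` and rows flagged by `flag_age` would then go to review against the photographed home-based record rather than being changed automatically.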
- Calculate the design weight
- Adjust for nonresponse
- Post-stratify to match population totals
Design weight of respondent A = 1 / Probability of selection of respondent A in survey sample (PA)
PA = Stage I Probability * Stage II Probability * Stage III Probability * Stage IV Probability
- Stage I probability = probability of selecting respondent A’s stratum for the survey from all possible strata of the sampling frame
- Stage II probability = probability of selecting respondent A’s cluster for the survey from the list of all clusters in respondent A’s stratum
- Stage III probability = probability of selecting respondent A’s household for the survey from the list of households in respondent A’s cluster
- Stage IV probability = probability of selecting respondent A for the survey from all the eligible respondents of the household
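As a worked illustration of the design weight formula, the sketch below multiplies the four stage probabilities and inverts the product. The function name and the example probabilities are assumptions made for illustration only; they are not values from this survey.

```python
from math import prod

def design_weight(stage_probabilities):
    """Design weight = 1 / (product of the selection probabilities at each stage)."""
    probs = list(stage_probabilities)
    if not probs or any(not (0 < p <= 1) for p in probs):
        raise ValueError("each stage probability must be in the interval (0, 1]")
    return 1 / prod(probs)

# Made-up example: stratum selected with certainty, 10 of 200 clusters,
# 10 of 100 households, 1 of 2 eligible children in the household.
weight_a = design_weight([1.0, 10 / 200, 10 / 100, 1 / 2])
```

With these made-up probabilities PA is 0.0025, so respondent A represents roughly 400 children in the target population.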
- Response weight of respondent = 1 / Response Rate
- Response Rate = Stage I response rate * Stage II response rate * ()
- Response Rate = Number of eligible with a complete interview / Number of eligible per stratum
- How many households were not interviewed despite repeated visits
- How many eligible respondents did not participate
- Number of eligible respondents in each household in the survey sample, as identified by an occupant of the household (preferred) or by a neighbour
Final weights = Design weights * Response Weights
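The nonresponse adjustment and the final weight can be sketched the same way. The sketch assumes the simple per-stratum response rate defined above (completed interviews divided by eligible respondents); the function names and the example counts are hypothetical.

```python
def response_weight(completed, eligible):
    """Response weight = 1 / response rate = eligible / completed interviews."""
    if eligible <= 0 or not (0 < completed <= eligible):
        raise ValueError("need 0 < completed <= eligible")
    return eligible / completed

def final_weight(design_w, response_w):
    """Final weight = design weight * response weight."""
    return design_w * response_w

# Made-up example: 45 complete interviews among 50 eligible respondents in a stratum,
# combined with a design weight of 400.
rw = response_weight(completed=45, eligible=50)
fw = final_weight(400, rw)
```

Here the response rate is 0.9, so each respondent's design weight is inflated by about 11% to stand in for the eligible children who were not interviewed.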
a) Crude Coverage
b) Dropout
Reference: VCQI Results Interpretation Quick Reference Guide
c) Valid Coverage
“valid dose”
- The child had reached the minimum age of eligibility for this dose.
- If the schedule specifies a maximum age of eligibility, then the child was within the allowable age range when they received the dose.
- If the dose is number 2 or 3 (or higher) in a sequence, then the minimum interval had passed since receiving the earlier dose, so the child was eligible to receive the next dose.
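The three conditions for a valid dose translate directly into a small check. The sketch below is an assumption-laden illustration, not the VCQI implementation: the function and parameter names are hypothetical, the schedule limits are passed in by the caller, and a later dose in a sequence is treated as not valid when no dated earlier dose exists.

```python
from datetime import date

def is_valid_dose(birth, dose_date, min_age_days,
                  max_age_days=None, previous_dose_date=None, min_interval_days=None):
    """Apply the minimum age, maximum age, and minimum interval rules to one dose."""
    age_days = (dose_date - birth).days
    if age_days < min_age_days:
        return False  # given before the minimum age of eligibility
    if max_age_days is not None and age_days > max_age_days:
        return False  # given after the maximum age of eligibility
    if min_interval_days is not None:
        if previous_dose_date is None:
            return False  # assumption: no dated earlier dose to measure the interval from
        if (dose_date - previous_dose_date).days < min_interval_days:
            return False  # given too soon after the earlier dose
    return True

# Illustrative limits only: minimum age 42 days for dose 1, 28 days between doses.
birth = date(2015, 1, 1)
dose1 = date(2015, 2, 12)
dose2 = date(2015, 3, 12)
dose1_ok = is_valid_dose(birth, dose1, min_age_days=42)
dose2_ok = is_valid_dose(birth, dose2, min_age_days=70,
                         previous_dose_date=dose1, min_interval_days=28)
```

The actual minimum ages and intervals should be taken from the national schedule used for the survey, not from the illustrative numbers here.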
d) Card availability
Design Effect (DEFF)
Intracluster Correlation Coefficient (ICC)
The ICC measures the correlation of the outcome within clusters in the sample.
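The two quantities are commonly linked through the approximation DEFF = 1 + (m - 1) * ICC, where m is the average number of respondents per cluster. The sketch below applies that approximation; it assumes roughly equal cluster sizes, and the function names and example numbers are hypothetical rather than taken from this survey.

```python
def design_effect(avg_cluster_size, icc):
    """Approximate DEFF = 1 + (m - 1) * ICC for clusters of about equal size m."""
    if avg_cluster_size < 1:
        raise ValueError("average cluster size must be at least 1")
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n, deff):
    """Size of a simple random sample that would give the same precision."""
    return n / deff

# Made-up example: 10 respondents per cluster, ICC of 0.2, 700 respondents in total.
deff = design_effect(avg_cluster_size=10, icc=0.2)
n_eff = effective_sample_size(700, deff)
```

With these made-up inputs the DEFF is about 2.8, so the 700 clustered respondents carry about as much information as 250 respondents from a simple random sample.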
Graph created using the SPSS software descriptive statistics
* Frequency tables, deciles, maximum, and frequency bar charts for both variables.
FREQUENCIES VARIABLES=Crude_penta3_coverage_states level2name
  /NTILES=10
  /PERCENTILES=100.0
  /BARCHART FREQ
  /ORDER=ANALYSIS.
The dataset used for the Creator project is from the combined Multiple Indicator Cluster Survey and National Immunization Coverage Survey, which was conducted in 2016 and 2017 in Nigeria. The dataset includes all responses from the six states in the North East zone and the six states in the South South zone.
North East Zone | South South Zone |
---|---|
Adamawa | Akwa Ibom |
Bauchi | Bayelsa |
Borno | Cross River |
Gombe | Delta |
Taraba | Edo |
Yobe | Rivers |
Data were collected using tablets, and interviewers took photographs of the children's home-based records. The interviewers did not collect data from health facilities.
The dataset includes the survey data and the derived variables for a large number of coverage indicators, as described in the “VCQI Indicator List with Specifications - v1.9”.
Data cleaning and weighting were done as described above.
Crude coverage for Pentavalent 3 was calculated using SPSS statistical software for all states and for the North East and South South zones of Nigeria. Crude coverage for Pentavalent 3 was also calculated separately for female and male children in each of these states and zones.
Penta 3 crude coverage = (Sum of weights for respondents who received the third Pentavalent dose per home-based record or recall / Sum of weights for all respondents) * 100%
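The weighted coverage formula can be illustrated with a short sketch. It is written in Python for illustration only (the analysis itself was run in SPSS), and the field names `weight`, `got_penta3`, and `zone`, as well as the toy records, are hypothetical.

```python
def weighted_coverage(records, weight_key="weight", outcome_key="got_penta3"):
    """Crude coverage (%) = 100 * sum of weights of vaccinated / sum of all weights."""
    total = sum(r[weight_key] for r in records)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    vaccinated = sum(r[weight_key] for r in records if r[outcome_key])
    return 100 * vaccinated / total

def coverage_by_group(records, group_key):
    """Weighted coverage computed separately for each value of group_key."""
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r)
    return {name: weighted_coverage(rows) for name, rows in groups.items()}

# Made-up records: got_penta3 is True if the dose is documented by card or recall.
sample = [
    {"zone": "A", "weight": 400, "got_penta3": True},
    {"zone": "A", "weight": 400, "got_penta3": False},
    {"zone": "B", "weight": 200, "got_penta3": True},
]
overall = weighted_coverage(sample)
by_zone = coverage_by_group(sample, "zone")
```

The same grouping function can be reused with a sex or state field to produce the separate estimates described above.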
The Penta 3 coverage results indicate a clear difference in vaccination coverage between the two zones. This suggests that there is an underlying cause for the low coverage in the North East zone.
Strengths of the survey method