Assessment for Learning MOOC’s Updates
Intelligence Tests: The First Modern Assessments (Admin Update 1)
Intelligence versus knowledge testing - what are the differences in assessment paradigm? A good place to begin to explore this distinction is the history of intelligence testing - the first modern form of testing:
And if you would like to read deeper into a contemporary version of this debate, contrast Gottfredson and Phelps with Shenk in the attached extracts.
Comment: What are the differences between testing intelligence and testing for knowledge? When might each approach be appropriate or inappropriate?
Make an Update: Find an example of an intelligence test, and explain how it works. Analyze its strengths and weaknesses as a form of assessment.


The WISC-V is an intelligence test for children that measures multiple cognitive domains, including verbal comprehension, visual-spatial skills, fluid reasoning, working memory, and processing speed. It provides reliable, standardized scores that help identify learning strengths and weaknesses and guide educational planning. However, it may be culturally biased, does not assess creativity or motivation, can be time-consuming, and may cause test anxiety.
The distinction between intelligence testing and knowledge testing is rooted in the history of modern assessment, beginning with Alfred Binet’s early-20th-century development of the first practical intelligence test. Binet designed the Binet-Simon Scale not to measure accumulated knowledge or innate worth, but to identify children who needed additional educational support by assessing reasoning, memory, and everyday problem-solving, abilities he viewed as universal cognitive processes rather than products of schooling. This established an assessment paradigm focused on measuring *how* people think rather than *what* they have learned. Intelligence tests therefore aim to capture general mental ability (fluid reasoning, pattern detection, and problem-solving) through standardized tasks that minimize reliance on prior instruction. In contrast, knowledge testing arose from long traditions of educational and civil service examinations designed to evaluate mastery of specific content, skills, or curricula. Knowledge assessments measure past learning, while intelligence tests aim to predict learning potential. The paradigms diverge accordingly: intelligence testing emphasizes cognitive processes and standardization to infer underlying mental capacity, whereas knowledge testing emphasizes content coverage, instruction, and performance on taught material. These historical roots continue to shape modern assessment practices, influencing when each type of test is appropriate. Intelligence testing is most useful for understanding cognitive development, diagnosing learning challenges, or assessing reasoning with less dependence on background, while knowledge testing is more appropriate for evaluating academic achievement or professional competence. Misusing either, such as treating knowledge tests as measures of innate intelligence or using intelligence tests to make high-stakes decisions about culturally diverse groups, highlights why understanding the origins of these paradigms remains essential.
Testing intelligence looks at how students think and solve problems, while testing knowledge checks what they remember or understand from lessons. Intelligence tests are useful for assessing reasoning skills but not for measuring specific content. Knowledge tests help check lesson learning but shouldn’t be the only basis for evaluating a student’s abilities. As a high school teacher, I can get a clearer picture of student learning by using both.
Raven’s Progressive Matrices (RPM) is an example of an intelligence test used in science and psychology to measure a person’s ability to see patterns and solve problems. It works by showing visual puzzles where one piece is missing, and the student chooses the piece that correctly completes the pattern. It does not require language or prior knowledge. This kind of test is considered fair for different learners because it is nonverbal; it measures reasoning and problem-solving skills and is very useful for identifying thinking ability or potential.
Intelligence tests are typically used for diagnostic purposes, such as identifying learning needs or giftedness. However, they should not be used for high-stakes decisions or with culturally mismatched tools. Knowledge tests, on the other hand, are curriculum-aligned and useful for grading, tracking progress, and identifying content gaps. However, they can be misleading if used as sole measures of ability or if a single test score is used to make major decisions.
An early example of an intelligence test is the Binet–Simon Intelligence Test, developed by Alfred Binet and Théodore Simon in France and later adapted and promoted in the United States by Henry Goddard and others.
Binet and Simon created age‑graded tasks (for example, following simple commands, naming objects, repeating digits, and solving basic reasoning problems) to see what most children of a given age could do.
A child’s “mental age” was estimated by the highest age level of tasks they could successfully perform, then compared to their chronological age to judge whether they were ahead, on level, or behind typical development.
The original purpose was practical and educational: to identify children in French schools who needed special help so teachers could provide extra support, not to permanently label or rank all children.
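The “mental age” idea described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post (the highest age level whose tasks the child passed, compared with chronological age). The function names and the sample results are hypothetical, and the scoring rules of the actual Binet–Simon scale were more detailed than this.

```python
def estimate_mental_age(results_by_age_level):
    """Return the highest age level whose tasks the child passed, or None."""
    passed_levels = [age for age, passed in results_by_age_level.items() if passed]
    return max(passed_levels) if passed_levels else None


def compare_to_chronological_age(mental_age, chronological_age):
    """Describe where the child stands relative to typical development."""
    if mental_age > chronological_age:
        return "ahead"
    if mental_age < chronological_age:
        return "behind"
    return "on level"


# Hypothetical results: age level -> whether the child passed that level's tasks.
results = {5: True, 6: True, 7: True, 8: False, 9: False}
mental_age = estimate_mental_age(results)
print(mental_age, compare_to_chronological_age(mental_age, chronological_age=8))
```

In this made-up case the child passes the tasks up to age level 7, so an 8-year-old would be judged as behind typical development and flagged for extra support, which was the test's original purpose.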
Strengths of Binet’s test (and its early use)
It was one of the first systematic, standardized tools to identify children with significant learning difficulties, allowing schools to provide targeted help instead of relying only on teacher impressions.
The age‑based “mental age” idea helped educators see development as gradual and measurable, and it laid the foundation for later norm‑referenced tests such as the Stanford–Binet and modern Wechsler scales.
Weaknesses and problems, especially in Goddard’s use
The test reflected the language, culture, and schooling of French (and later American) children, so when Henry Goddard used it with immigrants and diverse groups, low scores often reflected language barriers and limited schooling rather than true intellectual limits.
Goddard and others sometimes treated test results as fixed measures of innate ability and used them to justify segregation, institutionalization, and restrictive immigration policies, far beyond Binet’s original educational intention.
As an assessment, the Binet test focused on a narrow band of cognitive tasks and did not capture broader aspects of intelligence such as creativity, practical skills, or social understanding, so overreliance on the score could give a distorted picture of a person’s capacities.
One widely used family of intelligence tests is the Wechsler scales: the Wechsler Intelligence Scale for Children (WISC) and the Wechsler Adult Intelligence Scale (WAIS). These tests are designed to assess a range of cognitive abilities (verbal skills, reasoning, memory, processing speed, and more) rather than a single dimension of “intelligence.”
For example, when administered to a child, the WISC typically includes multiple subtests. These may involve: answering questions about general knowledge or vocabulary; solving puzzles or block‑design problems; arranging pictures or completing mazes; memory tasks like repeating sequences of digits; and tasks that require quick processing or symbol matching. Each of these subtests taps into different cognitive domains.
After the child completes the subtests, a raw score is obtained for each. Those raw scores are then converted, via normative data, into standard scores. On the WISC/WAIS, composite scores are scaled so that the mean is 100 and the standard deviation is 15 points. From the subtest and index scores, a “full-scale IQ” or composite intelligence score can be derived.
In this way, the test works by comparing an individual’s performance to a large, standardized sample (norm group) of same-age peers. This comparison allows the test to place the individual’s cognitive ability somewhere on a continuous scale, giving a sense of where the student stands relative to the norm.
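To make the norm-referenced idea concrete, here is a minimal Python sketch that places a raw score on a scale with mean 100 and standard deviation 15. It assumes a simple linear (z-score) conversion and a tiny made-up norm sample. The real WISC/WAIS use published normative tables for each age band, so the function name, the sample data, and the linear formula are illustrative assumptions only.

```python
from statistics import mean, stdev


def standard_score(raw_score, norm_raw_scores, scale_mean=100, scale_sd=15):
    """Convert a raw score to a standard score relative to a norm group.

    Assumption: a linear z-score transformation, not a published norm table.
    """
    norm_mean = mean(norm_raw_scores)
    norm_sd = stdev(norm_raw_scores)
    z = (raw_score - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return round(scale_mean + scale_sd * z)


# Hypothetical raw scores from same-age peers (the "norm group").
norm_sample = [38, 42, 45, 47, 50, 50, 53, 55, 58, 62]

print(standard_score(50, norm_sample))  # a raw score equal to the norm mean maps to 100
print(standard_score(58, norm_sample))  # above the norm mean, so above 100
print(standard_score(42, norm_sample))  # below the norm mean, so below 100
```

The point of the sketch is that the resulting number says nothing on its own. It only describes how far a student's performance sits from the average of the comparison group.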
What Such a Test Provides — From a Teacher’s Perspective
As a teacher (even one with little experience), using a test like WISC/WAIS offers several valuable pieces of information:
Cognitive profile and diversity of abilities. Because the test covers different domains — verbal reasoning, visual-spatial, memory, processing speed — it provides a detailed view of a student’s cognitive strengths and weaknesses. I can see which areas a learner excels in (e.g. good vocabulary or reasoning) and where they may struggle (e.g. slower processing speed or weaker working memory).
Benchmarking and comparability. The standardized scoring makes it possible to compare students to normative data. This helps me understand whether a student’s performance is below, at, or above the norm, which can guide decisions about support, enrichment, or placement.
Diagnostic insights. Especially for students facing learning difficulties or needing special support, the test’s detailed subtest scores may help identify whether difficulties arise from cognitive issues (e.g. memory, processing speed) rather than purely from lack of motivation or poor instruction.
Informed instructional planning. With knowledge of a student’s cognitive profile, I can tailor instruction. For instance, a learner with slower processing speed but strong verbal reasoning might benefit from more time and verbal explanations; one with strong spatial reasoning but weaker verbal memory might benefit from visual or hands‑on learning strategies.
Fairness and standardization. Because administration and scoring are standardized, the risk of teacher bias or subjective evaluation is reduced.
Strengths as a Form of Assessment
The WISC/WAIS (and similar well‑designed intelligence tests) offer several strengths:
Reliability and validity. The test has been extensively standardized, with psychometric evidence supporting its reliability and validity. Subtests are structured, scoring is consistent, and the normative sample provides a stable baseline for comparison.
Comprehensiveness. Rather than producing a single narrow score, the test captures multiple dimensions of cognitive functioning. This multidimensional profile is much more informative than a simple “overall judgment.”
Diagnostic utility. For identifying learning difficulties, cognitive delays, giftedness, or mismatches between ability and academic performance, the WISC/WAIS offers a tool that can inform diagnosis, interventions, and individualized support.
Comparability across individuals and groups. Because of standardization, it's possible to fairly compare students of the same age, or to track cognitive development over time.
Weaknesses and Limitations — What Teachers Should Be Wary Of
Nonetheless, intelligence tests like WISC/WAIS have limitations and potential drawbacks:
Cultural and socio‑educational bias. Some subtests — particularly verbal ones — draw heavily on language, vocabulary, general knowledge, and education background. Students from different cultural, linguistic, or socioeconomic contexts may be disadvantaged.
Incomplete measurement of “intelligence.” Intelligence is a complex, multifaceted construct. Tests like WISC/WAIS emphasize certain types of cognitive skills (verbal reasoning, working memory, processing speed) but do not directly assess creativity, emotional intelligence, motivation, practical problem‑solving, social skills, or other forms of “intelligence” that matter in real life.
Dependence on test conditions and non-cognitive factors. Test performance may be influenced by anxiety, fatigue, motivation, familiarity with test format, language proficiency, or test-taking skills rather than pure cognitive ability.
Resource and administration demands. Administering WISC/WAIS properly requires trained professionals, time (often 1–2 hours individually), and costly test materials — which may be less feasible in many school settings, especially in under-resourced contexts.
Overemphasis on a single score or profile. There is a risk that educators, parents, or institutions treat the IQ score as definitive — ignoring its limitations, or using it for high-stakes decisions (tracking, streaming, gifted placement) without considering broader social, emotional, and contextual factors. Critics warn that such overreliance undermines a holistic view of learners.
As a Teacher: How I Would Use (or Critically Use) an Intelligence Test
If I were a teacher and had access to an intelligence test like WISC or WAIS, I would approach its use with both openness and caution.
I would use it as one piece of a broader assessment strategy: not as a definitive measure of a student’s worth or potential, but as a diagnostic tool to help understand cognitive strengths and challenges.
I would interpret results in context — taking into account the student’s background, culture, language, socioeconomic status, motivation, and other factors that may influence performance.
I would avoid labeling or limiting students purely based on an IQ score; instead I would use the results to tailor instruction, support, or enrichment.
I would combine the intelligence test with more extensive and diverse assessments — performance tasks, projects, observations, formative assessments — to capture a holistic picture of the student’s abilities and potential.
Conclusion
Intelligence tests like WISC and WAIS can be powerful tools for assessing cognitive abilities. For a teacher, they offer detailed, standardized, and reliable data that can inform instruction, diagnose learning needs, and guide individualized support. However, they are not perfect — their limitations in scope, cultural fairness, and sensitivity to external factors make it essential to interpret results thoughtfully and to use them as part of a broader, balanced assessment system.
In the end, these tests are best seen not as verdicts, but as diagnostic instruments — helpful maps of certain aspects of cognitive functioning that, when used carefully, can support student learning and growth.
Differences between testing intelligence and testing knowledge
Testing intelligence measures a person’s ability to reason, solve problems, and think abstractly. Intelligence tests aim to evaluate potential cognitive ability rather than what someone has memorized. Examples include pattern recognition, logical reasoning, and problem-solving tasks.
Testing knowledge, on the other hand, measures what someone has learned or memorized. This could include facts, procedures, or concepts in a particular domain, like a history quiz or math test. Knowledge tests are more content-specific and reflect learning outcomes rather than raw cognitive potential.
Appropriateness
Intelligence testing is appropriate for identifying learning needs, cognitive strengths and weaknesses, or potential for problem-solving in new situations. It can be inappropriate if used to label people, make hiring decisions without context, or ignore cultural/language differences.
Knowledge testing is appropriate when assessing mastery of course material, training outcomes, or specific skills. It can be inappropriate if used to infer overall intelligence or potential, since someone may know content but lack broader reasoning skills.
Example of an intelligence test
The Wechsler Adult Intelligence Scale (WAIS) is a widely used intelligence test for adults. It measures cognitive abilities across multiple domains, including:
Verbal comprehension (e.g., defining words, explaining similarities)
Perceptual reasoning (e.g., solving puzzles, analyzing visual information)
Working memory (e.g., remembering sequences of numbers)
Processing speed (e.g., completing timed symbol tasks)
Strengths:
Provides a comprehensive measure of cognitive ability.
Can identify strengths and weaknesses across different cognitive domains.
Widely validated and used in research and clinical practice.
Weaknesses:
Can be influenced by language, cultural background, and test-taking experience.
May not fully capture creativity, practical problem-solving, or social intelligence.
Can be stressful and intimidating, potentially affecting performance.
very informative
Testing intelligence focuses on measuring a person's ability to think, reason, and solve problems, while testing knowledge checks what a person has already learned, such as facts, skills, or subject content. Knowledge is acquired information and skills, while intelligence is the innate ability to apply that knowledge to solve problems and adapt to new situations.
You can increase your knowledge through education and experience, but intelligence is the underlying capacity to learn and reason. Think of it this way: knowledge is having the information, but intelligence is what allows you to use it effectively.
Some examples of intelligence test tasks include reasoning tasks, memory tasks, and letter, number, and symbol sequencing.
Intelligence tests offer strengths such as providing standardized, numerical measures for educational placement, identifying cognitive strengths and weaknesses, and predicting academic success. They can be used for selection, classification, promotion, and research, but they have limitations: scores may vary from test to test, and performance can be affected by temporary external factors.
These are all in relation to the discussion based on @Ara Mustacisa, @Rosalyn Odullada, @Ida Gua.
Thank you
In the history of educational assessment, a fundamental distinction exists between measuring what a student can do (potential) and what a student has done (achievement). As the community discussion on Intelligence Tests: The First Modern Assessments highlights, confusing these two paradigms leads to flawed educational decisions. Testing for intelligence aims to measure the engine of the mind—cognitive processes like fluid reasoning, working memory, and pattern recognition—independent of formal schooling. In contrast, testing for knowledge measures the cargo—the specific facts, vocabulary, and skills a learner has acquired through instruction and effort. As @SandralynJumadil succinctly summarizes, the former focuses on the ability to solve novel problems, while the latter measures what has already been learned from experience.
The consensus among community members is that intelligence testing is most appropriate when used as a diagnostic tool rather than a ranking device. @JohnOpre argues that these assessments are best suited for identifying specific learning needs or diagnosing cognitive strengths and weaknesses (such as in Special Education evaluations) because they reveal how a person thinks, not what they have been taught. However, @SanchiaMayBatica and @JusthineKayloLao rightly warn that these tests become inappropriate when used to label students or make high-stakes decisions without considering cultural context. If a test relies on cultural metaphors unfamiliar to a student, it ceases to measure intelligence and instead measures privilege.
Conversely, knowledge testing is the appropriate tool for measuring curriculum mastery, certification, and instructional effectiveness. It answers the question, Did the student learn what we taught? However, its inappropriate use occurs when we try to use it as a proxy for intelligence. @JenniferSaul cautions that knowledge tests become dangerous when they encourage rote memorization or when they are used as the sole basis for judging ability. A low score on a history exam might not mean a student lacks cognitive power; it might simply mean they lacked access to resources or effective teaching.
Ultimately, a balanced assessment system requires us to respect the difference between the tool and the work. As @JusthineKayloLao notes, intelligence tests reveal potential, while knowledge tests reveal achievement. The challenge for educators is to ensure we do not use a knowledge test to judge a student's future, nor an intelligence test to grade their past effort.