Abstract
Automated speech recognition (ASR) has gained popularity in language learning, enhancing learners’ pronunciation skills and vocabulary knowledge and reducing speaking anxiety. However, relatively little research has investigated the accuracy of ASR-generated feedback, and few studies have examined Taiwanese students’ perceptions of ASR learning tools. This study examines the alignment between ratings produced by Whisper, an AI-based ASR system used in Taiwan’s Cool English platform, and those given by human raters. Additionally, it explores Taiwanese students’ attitudes towards this technology. Two trained human raters compared thirty sets of ratings produced by Whisper with the corresponding human ratings. The results indicated that Whisper’s automated assessments generally align with human raters’ evaluations, especially in recognizing grammatical features such as articles and plural forms, although the tool showed limitations in detecting complex stress patterns and specific pronunciation errors. Additionally, the study found instances of overcorrection and undercorrection in Whisper’s evaluations, which could affect the validity of its feedback. Although the interview data revealed that participants mostly held positive attitudes toward the AI-based tool for its quick feedback and potential for self-directed practice, some participants found certain feedback unclear or overly corrected, suggesting a need for system improvements. Insights from this study suggest technical adjustments for Whisper and offer pedagogical implications for AI-based ASR tools in English-speaking training and practice.
KEYWORDS
COOL ENGLISH PLATFORM, RATING ALIGNMENT, STUDENTS' PERCEPTION, ENGLISH SPEAKING