2025 : 9 : 29

Leila Shoja

Academic rank: Assistant Professor
ORCID:
Education: PhD.
ScopusId:
HIndex:
Faculty: Literature and Humanities
Address:
Phone:

Research

Title
Investigating the Correlation of AI Ratings of Essays with TOEFL Scores
Type
Presentation
Keywords
AI-assisted language testing, chatbot essay ratings, computer-assisted language testing (CALT), natural language processing (NLP) in education, TOEFL score correlation
Year
2024
Researchers
Leila Shoja, Mohammad Mahdi Maadikhah

Abstract

Recent advances in artificial intelligence (AI) and natural language processing (NLP) have reshaped methodology and practice in almost all subfields of applied linguistics. Following four decades of development in computer-assisted language testing (CALT), recent progress in AI has transformed many areas of language education, including language testing and assessment. AI-assisted language testing is a relatively new and fast-evolving field that calls for investigation into different aspects of using AI to support and enhance the assessment of language ability. This study investigated whether AI chatbots’ ratings of prospective TOEFL applicants’ writing correlate significantly with the applicants’ TOEFL scores. To this end, each of 35 learners in a TOEFL preparation course at a private language school in Kermanshah, Iran, was asked to submit an essay, which was given, along with a common rubric, to two AI chatbots, ChatGPT and Google Gemini, to rate. The mean rating score for each learner was then calculated. After the learners had taken the TOEFL and their test scores were available, Pearson’s correlation coefficient and a t-test were used to determine whether there was a statistically significant relationship between the mean AI-generated rating of each learner’s essay and that learner’s TOEFL score. The data were analyzed using SPSS. The results showed a strong positive correlation (r = 0.87) between each learner’s mean essay rating and TOEFL score, suggesting that AI chatbot ratings of learners’ essays correlate highly with their TOEFL scores.
Further research with different AI chatbots, larger numbers of learners and applicants, and AI-generated scores for the assessment of other skills, subskills and components is recommended.
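
The analysis described in the abstract (averaging the two chatbot ratings per learner, then correlating the means with TOEFL scores) can be illustrated with a minimal Python sketch. The study itself used SPSS. The rating and score values below are hypothetical placeholders, not the study's data, and the function names are illustrative only. The abstract does not specify which t-test was used, so the sketch assumes the standard significance test for a correlation coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for H0: no correlation (assumed test), df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical values for five learners; NOT the study's data.
chatgpt_ratings = [3.0, 4.0, 2.5, 4.5, 3.5]
gemini_ratings = [3.5, 4.0, 3.0, 5.0, 3.0]
toefl_scores = [78, 95, 70, 104, 85]

# Mean AI rating per learner, as described in the abstract.
mean_ratings = [(a + b) / 2 for a, b in zip(chatgpt_ratings, gemini_ratings)]

r = pearson_r(mean_ratings, toefl_scores)
print(f"illustrative r = {r:.2f}, t = {t_statistic(r, len(toefl_scores)):.2f}")

# t statistic implied by the figures reported in the abstract (r = 0.87, n = 35).
t_reported = t_statistic(0.87, 35)
print(f"t for r = 0.87, n = 35: {t_reported:.2f} (df = 33)")
```

Under that assumption, the reported r = 0.87 with 35 learners corresponds to a t statistic of roughly 10.1 on 33 degrees of freedom, far above conventional critical values.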