Influence of linguistic features on teacher judgments of English argumentative essays

Doctoral candidate: Cristina Vögelin
Teacher judgments, text characteristics, writing assessment, second language writing
Prof. Dr. Stefan Keller, Prof. Dr. Heike Behrens, Prof. Dr. Albert Düggeli
Autumn semester 2016 – Spring semester 2019

The planned dissertation is part of the study "ASSET - Assessing Students' English Texts", which investigates teacher judgments of authentic English learner essays (SNF-DFG project led by Prof. Dr. Stefan Keller (PH FHNW) and Prof. Dr. Jens Möller (Kiel University)). Extending the heuristic model by Südkamp et al. (2012), ASSET examines teacher, student, text and judgment characteristics in 10 related empirical studies focusing on argumentative essays as the key genre of upper-secondary English education.

The ability to judge learner texts is central to teaching English as a foreign language (EFL), especially in higher education (Keller, 2013). As a key component of EFL teachers’ diagnostic competence, assessing essays plays an important role in selecting classroom activities and in giving information on students' educational progress, and thus forms the basis for decisions on grades, transfers to the next level and diplomas (Baumert & Kunter, 2006; Kronig, 2007; Schrader, 2013). There is empirical evidence of systematic errors affecting teacher judgments in general, for example tendencies towards extreme or middle judgments, and distortion, bias or halo effects (Ingenkamp & Lissmann, 2008; Rezaei & Lovorn, 2010). Studies have shown that teachers attend more extensively to formal language issues in their assessment of student essays than to other aspects such as organisation or structure (Cumming et al., 2002; Porsch, 2010).

By systematically varying individual characteristics in authentic learner texts, we can examine the influence of one determinant, such as the quality of spelling, on other aspects of teacher judgments. The planned dissertation will concentrate on the influence of three text characteristics on teacher judgments: spelling, vocabulary (measured by lexical diversity and lexical sophistication) and grammar.

Research questions and hypotheses
The research question for all three empirical studies is the following:

  • How does the differing quality of text characteristics such as spelling, vocabulary and grammar affect teacher judgments of English learner essays?

In the case of the spelling study, the resulting hypotheses read: (1) Texts with a low quality of spelling will be assessed more negatively on all assessment criteria than texts with a higher quality of spelling. (2) The manipulation check will show different judgments of spelling for texts that differ in spelling quality. (3) More negative holistic assessments are to be expected for texts with a low quality of spelling, since spelling is considered part of the assessment. (4) Lower quality of spelling will lead to more negative analytic evaluations on other, distinct criteria, indicating distortions.

From a dataset of authentic learner texts at upper-secondary level, experts evaluated 15 texts both holistically and analytically and chose two texts of overall low quality and two texts of overall high quality. In the next step, the quality of one text characteristic, such as spelling, was systematically varied in these texts: high (1 mistake per 100 words) versus low (7 mistakes per 100 words).
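The two spelling conditions are defined as error rates per 100 words, so the number of mistakes a given text must contain depends on its length. The following minimal sketch illustrates that arithmetic only. The function names and the 400-word text length are hypothetical and are not taken from the ASSET materials, and rounding to the nearest whole mistake is an assumption.

```python
def errors_per_100_words(n_errors: int, n_words: int) -> float:
    """Return the spelling error rate of a text, expressed per 100 words."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return 100.0 * n_errors / n_words


def target_error_count(n_words: int, rate_per_100: float) -> int:
    """Return how many mistakes a text of n_words needs to reach a target rate.

    Rounding to the nearest whole mistake is an assumption of this sketch.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return round(n_words * rate_per_100 / 100.0)


# Hypothetical 400-word essay under the two conditions named in the text.
HIGH_QUALITY_RATE = 1  # mistakes per 100 words
LOW_QUALITY_RATE = 7   # mistakes per 100 words

high_condition_errors = target_error_count(400, HIGH_QUALITY_RATE)
low_condition_errors = target_error_count(400, LOW_QUALITY_RATE)
```

For texts of different lengths, the same rate translates into different absolute numbers of mistakes, which is why the conditions are stated as rates rather than counts.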

Using an online survey instrument similar to the "student inventory" by Kaiser et al. (2015), participants first received background information on the school context and were introduced to the holistic and analytic assessment rubrics. The participants were then asked to assess the texts using both rubrics. The last part of the survey was a background questionnaire. Participants were students of English in teacher education programmes in Basel (CH), Zürich (CH), Kiel (D) and other locations. In this study, we use an experimental 2 × 2 design with the factors text quality (high/low) and spelling (high/low), analysed with a repeated-measures MANOVA. Overall, the dissertation aims for a sample size of at least 105 participants for all three studies.
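To make the 2 × 2 layout concrete, the sketch below enumerates the four design cells and computes descriptive cell means, the two main-effect contrasts and the interaction contrast for a single rating scale. It is not the repeated-measures MANOVA itself and involves no significance testing. All names and the example ratings are hypothetical and serve only to illustrate the structure of the design.

```python
from itertools import product
from statistics import mean

# The two factors of the design, each with two levels.
LEVELS = ("high", "low")


def design_cells():
    """Return the four (text_quality, spelling) cells of the 2 x 2 design."""
    return list(product(LEVELS, LEVELS))


def cell_means(ratings):
    """Average the scores per cell.

    ratings: iterable of (text_quality, spelling, score) tuples.
    Every cell must contain at least one score, otherwise mean() raises.
    """
    cells = {cell: [] for cell in design_cells()}
    for text_quality, spelling, score in ratings:
        cells[(text_quality, spelling)].append(score)
    return {cell: mean(scores) for cell, scores in cells.items()}


def contrasts(means):
    """Return descriptive main-effect and interaction contrasts from cell means."""
    hh = means[("high", "high")]
    hl = means[("high", "low")]
    lh = means[("low", "high")]
    ll = means[("low", "low")]
    return {
        # Mean difference between high- and low-quality texts.
        "text_quality": (hh + hl) / 2 - (lh + ll) / 2,
        # Mean difference between high and low spelling quality.
        "spelling": (hh + lh) / 2 - (hl + ll) / 2,
        # Does the spelling effect differ between text-quality levels?
        "interaction": (hh - hl) - (lh - ll),
    }


# Invented example ratings on one scale, for illustration only.
example_ratings = [
    ("high", "high", 4), ("high", "high", 4),
    ("high", "low", 3),
    ("low", "high", 2),
    ("low", "low", 1),
]
example_contrasts = contrasts(cell_means(example_ratings))
```

In the invented example, a nonzero spelling contrast on a criterion other than spelling would correspond to the kind of distortion described in hypothesis (4).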

Baumert, Jürgen, and Mareike Kunter. 2006. "Stichwort: Professionelle Kompetenz von Lehrkräften." Zeitschrift für Erziehungswissenschaft 9. 4: 469-520.

Cumming, Alister, Robert Kantor, and Donald E. Powers. 2002. "Decision Making while Rating ESL/EFL Writing Tasks: A Descriptive Framework." The Modern Language Journal 86. i: 67-96.

Ingenkamp, Karlheinz, and Urban Lissmann. 2008. Lehrbuch der Pädagogischen Diagnostik. Weinheim: Beltz Verlag.

Kaiser, Johanna, Jens Möller, Friederike Helm, and Mareike Kunter. 2015. "Das Schülerinventar: Welche Schülermerkmale die Leistungsurteile von Lehrkräften beeinflussen." Zeitschrift für Erziehungswissenschaft 18. 279-302. doi: 10.1007/s11618-015-0619-5.

Keller, Stefan. 2013. Integrative Schreibdidaktik Englisch für die Sekundarstufe - Theorie, Prozessgestaltung, Empirie. Tübingen: Narr Verlag.

Kronig, Winfried. 2007. Die systematische Zufälligkeit des Bildungserfolgs. Bern: Haupt Verlag.

Porsch, Raphaela. 2010. Schreibkompetenzvermittlung im Englischunterricht in der Sekundarstufe I - Empirische Analysen zu Leistungen, Einstellungen, Unterrichtsmethoden und Zusammenhängen von Leistungen in der Mutter- und Fremdsprache. Münster: Waxmann.

Rezaei, Ali Reza, and Michael Lovorn. 2010. "Reliability and validity of rubrics for assessment through writing." Assessing Writing 15. 18-39.

Schrader, Friedrich-Wilhelm. 2013. "Diagnostische Kompetenz von Lehrpersonen." Beiträge zur Lehrerbildung 31. 2: 154-165.