Abstracts for ECOLT, MwALT, ELTS conferences in October 2021

2021/08/25

ECOLT 2021: Developing a binary-branching decision-tree rubric for assessing speaking: the case of the ENGin program

Daniil M. Ozernyi (Department of Linguistics, Northwestern University; Assessment and Evaluation, ENGin)
Soe Young Lee (Assessment and Evaluation, ENGin)

Approaches to designing rating rubrics have differed significantly over the years, with a broad divide between intuitively driven and empirically driven rating scales (Council of Europe 2001; though notably not 2020). The latter can be subdivided into empirical-quantitative and empirical-qualitative approaches (Galaczi et al. 2011). Orthogonally, various approaches to structuring rating scales have emerged: linear scales (as the CEFR’s or Cambridge’s are), performance decision trees involving a number of subtrees (Fulcher et al. 2011), and the single binary-branching decision tree (Turner and Upshur 2002; EBB, for “Empirically-derived, Binary-choice, Boundary definition”).

In the present study, we outline the development process of an EBB-like rubric with five subscales (“Grammar”, “Vocabulary”, “Fluency, Coherence, and Development”, “Interactive Communication”, and “Pronunciation”). We situate the design of the scale in the context of its primary purpose: the assessment of Ukrainian adolescents with proficiency levels ranging from A0 to C1 (ENGin program participants). We then motivate the choice of the EBB format by its simplicity for raters, and outline the main stages of the rubric’s development (following, in part, Galaczi et al. 2011).
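
For illustration, here is a minimal sketch of how an EBB-style subscale can be encoded as a binary-branching decision tree. The questions and band boundaries below are invented placeholders, not the actual ENGin rubric descriptors:

```python
# A hypothetical EBB-style subscale as a binary-branching decision tree.
# Each internal node: (yes/no question, yes-branch, no-branch);
# each leaf: a band score. Questions are placeholders, not the real rubric.
TREE = (
    "controls_basic_tenses",                  # root: splits bands 1-2 from 3-4
    ("attempts_complex_structures", 4, 3),    # yes-branch: bands 3-4
    ("produces_intelligible_phrases", 2, 1),  # no-branch: bands 1-2
)

def score(node, answers):
    """Walk the tree with a rater's yes/no judgments; return a band."""
    if isinstance(node, int):  # leaf reached: a band score
        return node
    question, yes_branch, no_branch = node
    return score(yes_branch if answers[question] else no_branch, answers)

# A rater answers at most two binary questions per speech sample:
print(score(TREE, {"controls_basic_tenses": True,
                   "attempts_complex_structures": False}))  # -> band 3
```

Because every node poses a single yes/no boundary question, a rater reaches a band after at most a handful of binary decisions, which is the simplicity argument made above.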

REFERENCES

  1. Council of Europe. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Council of Europe Publishing, 2001, www.coe.int/lang-cefr.
  2. —. Common European Framework of Reference for Languages: Learning, Teaching, Assessment – Companion Volume. Council of Europe Publishing, 2020, www.coe.int/lang-cefr.
  3. Fulcher, Glenn, et al. “Effective Rating Scale Development for Speaking Tests: Performance Decision Trees.” Language Testing, vol. 28, no. 1, Jan. 2011, pp. 5–29, doi:10.1177/0265532209359514.
  4. Galaczi, Evelina D., et al. “Developing Assessment Scales for Large-Scale Speaking Tests: A Multiple-Method Approach.” Assessment in Education: Principles, Policy & Practice, vol. 18, no. 3, Aug. 2011, pp. 217–37, doi:10.1080/0969594X.2011.574605.
  5. Turner, Carolyn E., and John A. Upshur. “Rating Scales Derived From Student Samples: Effects of the Scale Maker and the Student Sample on Scale Content and Student Scores.” TESOL Quarterly, vol. 36, no. 1, 2002, pp. 49–70, doi:10.2307/3588360.

MwALT 2021: Vocabulary profiling of the Polish “Matura”: a corpus-based inquiry into content validity for a high-stakes Eastern European state ESL examination

Daniil M. Ozernyi (Department of Linguistics, Northwestern University)

Keywords: EVP, CEFR, reading, assessment, Matura

The English Vocabulary Profile (EVP), part of the English Profile Programme by Cambridge Assessment English (Harrison and Barker, 2015; Capel, 2012; Kurtes and Saville, 2008), is the result of work, following the T-series, to draw corpus-based boundaries between the lexicon of one CEFR level (for ESL) and those of the others. This study’s main objective is to use the EVP to assess the content validity of the Polish state Matura examination with respect to the level-appropriateness of its reading texts.

The experimental part involved a corpus of Matura reading texts from the last five years. The corpus comprised 34 texts with a total token count of 12,700. The texts were analyzed with the Text Inspector online tool, associated with Cambridge Assessment English. The following metrics were collected: average sentence length, type/token ratio, average syllables per sentence, types per level (A1–C2, %), tokens per level (A1–C2, %), and unlisted types and tokens. The thresholds for level-appropriate type/token values were taken from the similarly purposed and validated B1 Preliminary exam by Cambridge English.
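
As an illustration of the metrics above, here is a minimal sketch of per-level vocabulary profiling. `EVP_LEVELS` is a tiny invented placeholder for the licensed EVP wordlist that Text Inspector queries, so the numbers it yields are not comparable to the study’s:

```python
# Sketch of the per-level profiling metrics; the word-to-level lookup
# below is a toy placeholder, not the real EVP resource.
import re
from collections import Counter

EVP_LEVELS = {"the": "A1", "exam": "A2", "lexicon": "C1"}  # placeholder

def profile(text: str) -> dict:
    tokens = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    types = set(tokens)
    per_level = Counter(EVP_LEVELS.get(t, "unlisted") for t in types)
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
        "types_per_level_pct": {lvl: 100 * n / len(types)
                                for lvl, n in per_level.items()},
    }

print(profile("The exam tests the lexicon. The lexicon matters."))
```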

The results revealed that the Matura’s reading texts are unfit to discriminate between the B1 and B2 levels and are only suitable for B2/C1 discrimination; another finding was that, for a B2 exam, none of the A1–B1 or C2 type/token counts can reliably predict content validity. The findings can be used to validate reading texts and to gain insight into the current state of Slavic high-stakes state ESL examinations. Given the universal need for means of determining and measuring content validity, in large-scale but even more so in local low-stakes examinations, further research is needed to better calibrate the thresholds for every CEFR level.

REFERENCES

  1. Capel, A. (2012). Completing the English vocabulary profile: C1 and C2 vocabulary. English Profile Journal, 3.
  2. Harrison, J., & Barker, F. (2015). English profile in practice (Vol. 5). Cambridge University Press.
  3. Kurtes, S., & Saville, N. (2008). The English Profile Programme – an overview. Research Notes, 33, 2–4.
  4. Leńko-Szymańska, A. (2015). The English Vocabulary Profile as a benchmark for assigning levels to learner corpus data. Learner Corpora in Language Testing and Assessment, 115–140.
  5. Text Inspector: Analyse the Difficulty Level of English Texts. (n.d.). Text Inspector. Retrieved March 14, 2021, from https://textinspector.com/

ELTS 2021: EVP as a tool for evaluating the CEFR level-appropriateness of ESL reading examinations: a corpus analysis of the Ukrainian state assessment

Keywords: EVP, CEFR, reading, assessment, ZNO

The English Vocabulary Profile (EVP), part of the English Profile Programme by Cambridge Assessment English (Harrison and Barker, 2015; Capel, 2012; Kurtes and Saville, 2008), is the result of work, following the T-series, to draw corpus-based boundaries between the lexicon of one CEFR level and those of the others. Beyond its informativeness per se, the EVP has been shown to correlate with the results of, for example, subjective writing assessment (Leńko-Szymańska, 2015). This study’s main objective is to assess the Ukrainian state examination (ZNO) for the level-appropriateness of its reading texts, and thus ascertain its content validity.

The experimental part involved a corpus of ZNO reading texts from the last four years, comprising 7,205 tokens and 2,285 types. A total of sixteen texts, 440 words long on average, were analyzed with the Text Inspector online tool, associated with Cambridge English. The following metrics were collected: sentence count, token count, type count, average sentence length, type/token ratio, syllable count, average syllables per sentence, types per level (A1–C2, %), tokens per level (A1–C2, %), and unlisted types and tokens. In addition to analyzing every text separately, the corpus was analyzed as a whole. Since the ZNO is a level-specific exam and aims at differentiating between the B1 and B2 CEFR levels, the thresholds for level-appropriate type/token values were taken from the similarly purposed and validated B1 Preliminary exam by Cambridge English.

The results revealed that the ZNO is unfit to discriminate between the B1 and B2 levels and is only suitable for B2/C1 discrimination. Significant Pearson, Kendall, and Spearman correlations were found between A1 type/token counts and unlisted type/token counts. The analysis also revealed that, for a B2 exam, none of the A1, A2, B1, or C2 type/token counts can reliably predict the level-appropriateness of the exam.
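
For illustration, a minimal sketch of the three correlation checks using scipy; the vectors are invented placeholders standing in for per-text A1 and unlisted type percentages, not the study’s data:

```python
# Pearson, Kendall, and Spearman correlations over per-text percentages.
# The two vectors below are illustrative placeholders only.
from scipy import stats

a1_types_pct = [62.1, 58.4, 60.0, 65.2, 57.9, 61.3]  # % of types at A1, per text
unlisted_pct = [4.2, 6.1, 5.0, 3.3, 6.8, 4.9]        # % of unlisted types, per text

for name, test in [("Pearson", stats.pearsonr),
                   ("Kendall", stats.kendalltau),
                   ("Spearman", stats.spearmanr)]:
    r, p = test(a1_types_pct, unlisted_pct)
    print(f"{name}: r = {r:.3f}, p = {p:.3f}")
```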

REFERENCES

  1. Capel, A. (2012). Completing the English vocabulary profile: C1 and C2 vocabulary. English Profile Journal, 3.
  2. Harrison, J., & Barker, F. (2015). English profile in practice (Vol. 5). Cambridge University Press.
  3. Kurtes, S., & Saville, N. (2008). The English Profile Programme – an overview. Research Notes, 33, 2–4.
  4. Leńko-Szymańska, A. (2015). The English Vocabulary Profile as a benchmark for assigning levels to learner corpus data. Learner Corpora in Language Testing and Assessment, 115–140.
  5. Text Inspector: Analyse the Difficulty Level of English Texts. (n.d.). Text Inspector. Retrieved March 14, 2021, from https://textinspector.com/