Jasmine Bennöhr, Tübingen, December 5th, 2011 - PowerPoint PPT Presentation



slide-1
SLIDE 1

COMPOST

Identification of indicators for competence assessment of students’ essays:

How do Textual Indicators Evolve?

Jasmine Bennöhr Tübingen, December 5th, 2011

http://ww.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung/kompost/compost

slide-2
SLIDE 2

Overview

I Introduction

Purpose, aims and examples

II Data

III Methodology

IV Results and implications

V Work in Progress

VI Future Work

VII Conclusion


slide-3
SLIDE 3

I Introduction - Aim

Aim:

Find (new/interesting) indicators for language quality in essays

Measure how the indicators evolve over time


slide-4
SLIDE 4

I Introduction: Purpose

Purpose:

Identify pupils with special needs in language training

Side effects

Data for developing or improving competence models

slide-5
SLIDE 5

II Data – Essay Corpus - Origin

Essay corpus – collected during the longitudinal study KESS (Kompetenzen und Einstellungen von Schülerinnen und Schülern – competences and attitudes of pupils)

Programme for student assessment

KESS: complete survey of one year group of pupils in Hamburg

Grades 4, 7, 8, 10 (and 12) in years 2003, 2006, 2007, 2009 (and 2011).

slide-6
SLIDE 6

II Data – Compost Essay Corpus - Overview

Essays | N available digitalised [1] | N rated [2] | Test results for validation
KESS4 – 2003 (1 topic) | 839 | ca. 8000 | KFT
KESS7 – 2006 (2 topics) | 126 (of appr. 1500) | 63 and 63 | Reading comprehension
KESS8 - 2007 (13 topics) | 1705 | 1705 | C-test, grammar, vocabulary, spelling, reading comprehension
KESS10 - 2009 (6 topics) | 1189 | Not yet rated, 1189 | C-test, spelling, reading comprehension

slide-7
SLIDE 7

II Data: Extract from test booklet

Example: task from grade 4

Texts are digitalised (typed manually)

Interpretation begins when texts are digitalised

That is, decisions at this point affect the results

slide-8
SLIDE 8

III Methodology: Annotation and frequencies

Annotation

Operationalise the features that are to be determined

Automatic annotation

  • Operationalisation which can be applied automatically. How can features be identified?

Check quality of annotations

Determine frequencies

slide-9
SLIDE 9

III Methodology: Annotation is interpretation

Only what is annotated can be counted

Interpretation is continued

Errors can be introduced during annotation

slide-10
SLIDE 10

IV Results: Word length

[Figure: word length. x: grade (KESS4, KESS7, KESS8, KESS10); y: letters per word (axis from 4.1 to 5.1)]
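
The indicator on this slide, letters per word, could be computed roughly as follows. This is a minimal sketch: the tokenisation rule (`WORD_RE`) and the function name `mean_word_length` are assumptions made for illustration, not the tokenisation actually used in the project.

```python
import re

# Assumed tokenisation: a word is a run of letters, including German
# umlauts and ß. Digits and punctuation are not counted as words.
WORD_RE = re.compile(r"[A-Za-zÄÖÜäöüß]+")


def mean_word_length(text: str) -> float:
    """Return the average number of letters per word token in `text`."""
    words = WORD_RE.findall(text)
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


if __name__ == "__main__":
    # Der (3) + Hund (4) + läuft (5) + schnell (7) = 19 letters in 4 words
    print(mean_word_length("Der Hund läuft schnell."))
```

Different tokenisation choices (for example, how hyphenated words or misspellings are treated) would shift the values, which is one way decisions made during digitalisation and annotation can affect the results.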

slide-11
SLIDE 11

IV Results: Commas

[Figure: commas. x: grade (KESS4, KESS7, KESS8, KESS10); y: commas per 100 words (axis from 0 to 6)]

slide-12
SLIDE 12

IV Results: -heit, -keit, -ung

[Figure, three panels: -ung (axis from 0 to 1.8), -keit (axis from 0 to 0.25), -heit (axis from 0 to 0.08). x: grade (KESS4, KESS8, KESS10); y: -heit, -keit, -ung per 100 words]
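
Both the comma indicator and the suffix indicators are rates per 100 words. A minimal sketch of that normalisation is given below; the helper names and the simple "word ends with the suffix" test are assumptions for illustration and are cruder than a real morphological annotation.

```python
import re

# Assumed tokenisation: runs of letters, including German umlauts and ß.
WORD_RE = re.compile(r"[A-Za-zÄÖÜäöüß]+")


def per_100_words(count: int, n_words: int) -> float:
    """Normalise a raw count to a rate per 100 word tokens."""
    return 100.0 * count / n_words if n_words else 0.0


def comma_rate(text: str) -> float:
    """Commas per 100 words."""
    n_words = len(WORD_RE.findall(text))
    return per_100_words(text.count(","), n_words)


def suffix_rate(text: str, suffix: str) -> float:
    """Words ending in `suffix` per 100 words.

    A plain string test overgenerates: it also counts words such as
    "Sprung", where "ung" is not a suffix.
    """
    words = WORD_RE.findall(text)
    hits = sum(
        1 for w in words
        if len(w) > len(suffix) and w.lower().endswith(suffix)
    )
    return per_100_words(hits, len(words))


if __name__ == "__main__":
    sample = "Die Zeitung kam, aber die Freiheit und die Heiterkeit fehlten."
    for suffix in ("ung", "keit", "heit"):
        print(suffix, suffix_rate(sample, suffix))
    print("commas", comma_rate(sample))
```

Normalising by text length makes essays of different lengths comparable across grades.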

slide-13
SLIDE 13

IV Results

Word length is one of the most reliable features

Certain suffixes show a development over time, but not all

slide-14
SLIDE 14

IV Results: Implications

Good starting point

But from there we want to go further

Word length is a number; it cannot be interpreted in terms of content/structure

We want an approach that is motivated more by a linguistic point of view

Analysis of suffixes; problem: choice of suffixes and data sparseness

Combine both: look at the structure of words and how it develops over time

slide-15
SLIDE 15

V Work in Progress

Skim through tokens with high word length

Look at morphological structure, complexity

For simplicity we assume

Prefixes, suffixes, lexemes, inflectional endings (Flexive)

We want to look at combinations

  • E.g. prefix + lexeme + suffix

Case study with prefix + lexeme + -ung

High number of occurrences

Example: <Auf><frisch><ung>
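
The case study pattern (prefix + lexeme + -ung) can be approximated with a regular expression. The sketch below is a simplification under stated assumptions: the prefix list `PREFIXES` is a small hypothetical sample, the stem is matched as any run of letters rather than as a verified lexeme, and the function name `segment` is invented for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical, incomplete prefix list; a real analysis would need a
# fuller inventory or a morphological analyser.
PREFIXES = ("ab", "an", "auf", "be", "ent", "er", "um", "ver", "vor")

# prefix + at least two stem letters + "ung" at the end of the word.
PATTERN = re.compile(
    r"^(?P<prefix>" + "|".join(PREFIXES) + r")"
    r"(?P<stem>[a-zäöüß]{2,})"
    r"(?P<suffix>ung)$"
)


def segment(word: str) -> Optional[Tuple[str, str, str]]:
    """Split `word` into (prefix, stem, "ung"), or return None if no match."""
    match = PATTERN.match(word.lower())
    if match is None:
        return None
    return (match.group("prefix"), match.group("stem"), match.group("suffix"))


if __name__ == "__main__":
    for word in ("Auffrischung", "Anleitung", "Zeitung", "Erderwärmung"):
        print(word, segment(word))
```

Because the stem is not checked against a lexicon, a purely string-based matcher of this kind also accepts compounds such as "Erderwärmung" (Erd+erwärmung), splitting them as er+derwärm+ung. Matches therefore need a quality check.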

slide-16
SLIDE 16

V Work in Progress: Preliminary Results

[Figure: <prefix><lexeme+|prefix*|suffix*><ung>. x: grade (KESS4, KESS7, KESS8, KESS10); y: forms per 100 words (axis from 0.2 to 1.2)]

slide-17
SLIDE 17

V Work in Progress: Preliminary Results – KESS4

216 –ung

109 <prefix><lexeme+|prefix*|suffix*><ung>

<Ver><mut><ung>, <ent><vern><ung>, <An><leit><ung>

slide-18
SLIDE 18

V Work in Progress: Preliminary Results - KESS8

1511 –ung

569 <prefix><lexeme+|prefix*|suffix*><ung>

<An><leit><ung>, <ver><spät><ung>, <Ver><pflicht><ung>, <Vor><wahrn><ung>, <er><källt><ung>, <Um><satztsteiger><ung>

False positives: <Er><derwärm><ung>

slide-19
SLIDE 19

V Work in Progress: Preliminary Results - KESS10

902 –ung

457 <prefix><lexeme+|prefix*|suffix*><ung>

<Ab><mahn><ung>, <Ver><zweifl><ung>

slide-20
SLIDE 20

VI Future Work

Type/token ratio of –ung forms and of <prefix><lexeme+|prefix*|suffix*><ung> forms, respectively
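
The type/token ratio is the number of distinct forms (types) divided by the number of occurrences (tokens). A minimal sketch follows; the function name and the choice to lower-case forms before counting are assumptions for illustration.

```python
from typing import Iterable


def type_token_ratio(tokens: Iterable[str]) -> float:
    """Return distinct forms / total occurrences; 0.0 for empty input.

    Forms are lower-cased first (an assumption), so "Anleitung" and
    "anleitung" count as one type.
    """
    token_list = [token.lower() for token in tokens]
    if not token_list:
        return 0.0
    return len(set(token_list)) / len(token_list)


if __name__ == "__main__":
    ung_forms = ["Anleitung", "Anleitung", "Verspätung", "Zeitung"]
    print(type_token_ratio(ung_forms))  # 3 types / 4 tokens
```

A ratio close to 1 would mean that nearly every –ung form is different. A low ratio would mean that a few forms, possibly taken from the task prompt, account for most occurrences.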

slide-21
SLIDE 21

VI Future Work

Focus: How do word structures of students develop?

Prefix chains

<un><ent><schied><en>

Suffix chains

<Tät><ig><keit>,<Pünkt><lich><keit>

Combination of several prefixes and suffixes

<prefix><prefix><lexeme><suffix>

<un><be><greif><lich>,<un><ver><kenn><bar>

<prefix><lexeme><suffix><suffix>

<Über><pünkt><lich><keit>

slide-22
SLIDE 22

VII Conclusion: Summary

Development of indicators over time

From the surface indicator word length and individual affixes to a more linguistically motivated analysis

Word length is not easy to interpret but is tightly linked to morphological structure

Individual affixes (suffixes)

Structure of words

Qualitative analysis meaningful

How do students construct words and how does that develop over time?