13th June 2010

Readability Tester

Most Rix Centre projects are concerned with web content creation by and with people with intellectual disabilities (ID). Generally Rix works with people who have mild and moderate ID, using camera-created content supplemented by short runs of easy read text.

Studies typically find that adults with mild and moderate ID have a median ‘reading age’ in the 6-8 years range (that is chronological age, not grade level). Peak literacy levels for this group are often reached during formal schooling and fall back in adulthood.

Consequently it is rare for adults with ID to have functional literacy, if this is taken as ‘lower secondary school age’ as per WAI or the minimum UK expectation for school leavers (most often quoted as 11 years). Purely text-based information, even for the mild ID audience, is often inaccessible, and text-to-speech does little to address the underlying comprehension issues which limit literacy for this group.

One option we considered was building readability tests into the various CMS so that content was automatically annotated with a reading level as a QC reference for authors. The following model tests a block of plain text or will attempt to sample remote HTML. In practice, the automated measures of syllable count and sentence length (in research contexts these measures are always taken manually) were too easily scuppered by the randomness of English orthography, so this was never actually implemented, but it works reasonably well in the technical sense.
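To give a feel for what such a test involves, here is a minimal sketch in Python of one common measure, the Flesch-Kincaid grade level. It is not the code behind this checker: the function names are made up for illustration, and the syllable counter is a deliberately crude vowel-group heuristic chosen to show how spelling trips up automated counts.

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel groups, discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent, except in '-le' endings such as 'table'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """Return the Flesch-Kincaid grade level, or None for empty input."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

print(flesch_kincaid_grade("The cat sat on the mat."))
print(count_syllables("make"), count_syllables("recipe"))
```

The heuristic gets ‘make’ right (one syllable) but counts ‘recipe’ as two syllables rather than three, because it assumes the final ‘e’ is silent. Errors of that kind feed straight into the score, which is the orthography problem described above.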

Easy read authors from special ed or speech and language backgrounds generally apply far more subjective judgement in their writing, although some rules are common: no line breaks, radically limit the number of information-carrying words, annotate all key concepts with a visual, and so forth.

This checker employs Dave Child’s php-text-statistics class, available from Google Code. Output is generally within the margin of error for the individual checks. Application-specific editing of the supplemental word lists used for syllable calculation improves things a little.
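The idea of a supplemental word list can be sketched as a lookup consulted before any spelling-based guess. The Python below is an illustration only: the SYLLABLE_OVERRIDES name and its entries are hypothetical and not taken from php-text-statistics, and the fallback is the same crude vowel-group heuristic one might use in a first attempt.

```python
import re

# Hypothetical exception list: words the heuristic below is known to miscount.
# The entries are illustrative, not taken from php-text-statistics.
SYLLABLE_OVERRIDES = {"recipe": 3, "cafe": 2, "business": 2}

def count_syllables(word, overrides=SYLLABLE_OVERRIDES):
    """Look the word up in the exception list, else fall back to a heuristic."""
    key = word.lower()
    if key in overrides:
        return overrides[key]
    # Fallback: count vowel groups and discount a silent trailing 'e'.
    count = len(re.findall(r"[aeiouy]+", key))
    if key.endswith("e") and not key.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

print(count_syllables("recipe"))                # uses the exception list
print(count_syllables("recipe", overrides={}))  # heuristic alone
```

Adding the vocabulary that actually turns up in a given application (place names, service names and so on) to such a list is the kind of editing meant here; it helps with the words you anticipate, not with the ones you don't.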