Thursday, March 31, 2011

Defining the reading level of a text

How does one estimate the accessibility of a text? What leads to labels such as "This is at the 9th grade reading level"?

Apparently one big factor is the distribution of the number of syllables in the words. Another factor is the length of sentences. Here's a typical formula:
You can use a formula to calculate Flesch-Kincaid reading level on your own. This is a good tool to determine whether a book is going to challenge you.
1. Select a few paragraphs to use as your base.
2. Calculate the average number of words per sentence. Multiply the result by 0.39.
3. Calculate the average number of syllables per word (count the syllables and divide by the number of words). Multiply the result by 11.8.
4. Add the two results together.
5. Subtract 15.59.
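
The steps above can be sketched in a few lines of Python. The arithmetic follows the quoted recipe directly; the sentence splitter and the syllable counter, on the other hand, are crude assumptions of mine (splitting on ., ! and ?, and counting runs of vowels), and the function names are made up for illustration. Real tools presumably count syllables more carefully.

```python
import re


def flesch_kincaid_grade(num_sentences, num_words, num_syllables):
    """Apply the quoted recipe to raw counts."""
    words_per_sentence = num_words / num_sentences
    syllables_per_word = num_syllables / num_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def count_syllables(word):
    """Assumption: approximate syllables by runs of consecutive vowels."""
    vowel_groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(vowel_groups))


def estimate_grade(text):
    """Assumption: sentences end at '.', '!' or '?'; words are runs of letters."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_kincaid_grade(len(sentences), len(words), syllables)
```

For instance, a passage with 10 words per sentence and 1.5 syllables per word comes out at 0.39 × 10 + 11.8 × 1.5 − 15.59 = 6.01, that is, roughly a 6th grade level.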

More at http://www.micropowerandlight.com/rd.html

I have reservations about these tools, at least as applied to material for adults. A text is difficult if it contains many unknown words, of course. One of the proposed tools maintains a database of words; perhaps the words are ranked by frequency in the English language, in which case one reasonable measure of difficulty would be how often rare words occur in the text.
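
To make the rare-word idea concrete, here is a minimal sketch. It assumes a hypothetical table `word_rank` mapping each word to its frequency rank (1 for the most common word), and a cutoff `rank_threshold` beyond which a word counts as rare; I do not know how the actual tools organize their databases, so both are my own inventions. Words missing from the table are treated as rare.

```python
import re


def rare_word_fraction(text, word_rank, rank_threshold):
    """Fraction of words in `text` whose rank exceeds `rank_threshold`.

    `word_rank` is a hypothetical dict: word -> frequency rank (1 = most common).
    Words absent from the dict are counted as rare.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    rare = sum(
        1 for w in words
        if word_rank.get(w, rank_threshold + 1) > rank_threshold
    )
    return rare / len(words)
```

With a toy table that knows only "the", "cat" and "sat", the text "The cat sat perspicaciously" has one rare word out of four, so the measure is 0.25.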

But a text is also difficult if its structure is convoluted. Doing a grammatical analysis and parenthesizing relative clauses, one could compute the levels of nested parentheses, thus measuring the number of partial sentences that the reader must simultaneously keep in mind as he or she is reading. That should be an important factor in readability.
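
Once the clauses have been parenthesized, computing the depth is the easy part. The sketch below assumes the grammatical analysis has already been done (by hand or by some parser, which is the hard part and not shown) and that nested clauses are marked with plain ( and ) characters; the function name is mine.

```python
def max_nesting_depth(parenthesized):
    """Return the deepest level of nested parentheses in the string."""
    depth = 0
    deepest = 0
    for ch in parenthesized:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("closing parenthesis without a matching opening one")
    if depth != 0:
        raise ValueError("unclosed parenthesis")
    return deepest
```

For example, "The rat (that the cat (that the dog chased) killed) ate the malt." has depth 2: at the word "chased" the reader is holding two unfinished clauses in mind on top of the main sentence.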

Examining vocabulary is similar to examining the number of notations that the reader of a scientific paper must learn and keep in memory while reading. Studying the nested structure is similar to studying the number of assumptions that the reader must keep in mind while processing a proof ("This is a proof by contradiction, with a case-by-case analysis, and we are in the 'else' part of the 'if' statement of case 2"...). Shouldn't there be a rigorous way to measure and analyze those parameters in readability?