
EssayTagger is a web-based tool to help teachers grade essays faster.
But it is not an auto-grader.

This blog will cover EssayTagger's latest feature updates as well as musings on
education, policy, innovation, and preserving teachers' sanity.

Thursday, September 13, 2012

On teacher accountability, pt2: Possible compromises

In part 1 I laid out the case against the current method of teacher accountability via value-added analysis. Here I offer what I think are reasonable compromises.

This focus on quantifiable standardized test scores is not going to go away. Some form of accountability linked to test scores is unavoidable. Period. I leave it to the statisticians to refine the analysis and reduce that 53% margin of error.

But here are some practical solutions to incorporate this data while controlling for its flaws:

Agree upon an error margin threshold
I would feel comfortable with a 15% margin of error in the value-added analysis data. If the data has a margin of error of 15% or less, then it should be fair game to apply it to my performance evaluation. But if the data exceeds the 15% threshold, it should be rendered invalid because it is too inaccurate to be used in such a high-stakes evaluation.

I could even compromise on a 20% threshold. A 20% error margin means it's approximately 80% right; if we were grading their work, the statisticians would get a B-/C+ for their accuracy. Not bad. Respectable.

But the statisticians' current "grade" for the accuracy of their estimates of Math teachers' impact is a D (65%). They're straight-up failing -- and badly -- in their calculations of English teachers' impact (47% accurate).

The statisticians need to improve their accuracy for their work to be taken seriously. A "D" is rarely acceptable and a 47% "F" is a damn disaster. Let's demand rigorous standards of our statisticians, just like we demand rigorous standards for our teachers. They need to get their methodology up to at least a C+ before we lend them any credibility or influence over salary and job security.

Error margin thresholds are simple, obviously fair, and hard for anyone to argue against. Can we make this part of the "value-added" accountability discussion?


Scale the impact of the data by its error margin
This one is simple: the less accurate the data, the less weight it carries in my performance evaluation. I think this should work in conjunction with the above error threshold.
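Putting the two ideas together, the rule is easy to express. Here's a minimal sketch, assuming a hypothetical 15% threshold and a simple linear weighting -- the function name and numbers are illustrative, not anything CPS actually uses:

```python
# Hypothetical sketch: gate value-added data by an agreed error margin
# threshold, then scale its weight by how accurate it actually is.

MAX_ERROR_MARGIN = 0.15  # the agreed-upon threshold (illustrative)


def value_added_weight(error_margin):
    """Return the weight (0.0 to 1.0) a value-added score should carry."""
    if error_margin > MAX_ERROR_MARGIN:
        return 0.0  # too inaccurate for a high-stakes evaluation
    return 1.0 - error_margin  # e.g. a 10% error margin carries 90% weight


print(value_added_weight(0.53))  # 0.0  -- the English-teacher data gets tossed
print(value_added_weight(0.12))  # 0.88 -- usable, but discounted
```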



Only use 3-year running averages
Short anecdote: I was working on creative problem-solving skills with my seniors and gave them this exact scenario -- tying teacher job security and pay to students' test scores -- and asked them to produce recommendations. Teachers will be glad to know that most of my students thought this linkage to test scores was "the dumbest idea, ever." At the end one student asked me, "They're not really doing this, are they?"

But one group came up with a nice solution: average the value-added calculations for each teacher over a three year period. Though the resulting data is still flawed, this would at least partially address the wild year-to-year variations inherent in the value-added approach. It would also give new teachers a little breathing room to get their feet under them as their first data-based evaluation would not occur until after their third year on the job.
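Their suggestion is simple enough to sketch out. The scores below are made up; the point is just that single-year swings get smoothed and a new teacher isn't evaluated on this data until year three:

```python
# Hypothetical sketch of the three-year running average idea.

def three_year_average(yearly_scores):
    """Average the most recent three years of value-added scores.

    Returns None until a teacher has three years of data, so the first
    data-based evaluation doesn't happen until after year three.
    """
    if len(yearly_scores) < 3:
        return None
    return sum(yearly_scores[-3:]) / 3.0


print(three_year_average([0.8]))            # None -- first-year teacher
print(three_year_average([0.8, 0.3, 0.6]))  # ~0.57 -- the swings average out
```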


If CPS and other districts across the country added these three compromises to the statistical portion of the teacher evaluation process, I think we'd see a lot less frustration and resistance from teachers. And of the three, I think the error margin threshold is a must-have. Ridiculously inaccurate data has no place whatsoever in a professional evaluation.

And note that none of this speaks to the broader problem with this sort of evaluation system: namely, the increased classroom focus on standardized test results. In my opinion, standardized tests largely measure a baseline level of competence in a subject. Certainly you need the skills being tested to succeed in school, but the tests don't measure the far more valuable outcomes like critical thinking, brainstorming, debate, and innovation.

We are choosing to make education all about optimizing standardized test scores. But the 21st century demands an economy based on innovation, on the value of our brains and not our brawn. I fail to see how getting good at filling in bubbles with number two pencils will keep us relevant and dominant in the modern world.

If we're not focusing on critical thinking, we're not preparing our kids for the future.