Evaluate: The Summative Assessment Quest

posted May 24, 2016, 5:51 AM by Emily Kroutil
For this blog entry, I am going to use a paper test I created for my students as an example.  I will just show the first page of the test, so as not to give away all the questions in case other physics teachers in my school are using similar questions/exams.  I will also explain how I would adjust the administration of the exam in order to ensure validity, reliability, and security in an online environment.

This particular test is from our Electricity unit.  I would have liked to use my projectiles unit for continuity, but I was on maternity leave during that unit this year, and I'd like to use a unit that I taught both years so that the data from each year can be compared.  The following is the first page from the electricity test:
The first thing you should notice is that each question is tied to a particular learning objective in that unit.  This ensures that the students are being tested over the material they have learned.  If a question does not specifically relate to a learning objective from that unit, it is thrown out.  If the question seems to cover an important concept in that unit, but isn't tied to a learning objective, then I know I need to go back and adjust my learning objectives.  This is one way I ensure that my test is valid.

I also look closely at the test to make sure it doesn't contain any unintended cultural biases.  For example, my first year teaching, a physical science test had a question that basically asked, "Why would motorists put sandbags in the trunk of their car when driving on icy roads?"  Someone who lives in a place where the roads ice over would know the answer.  However, my students did not know the answer because they'd lived in the South their whole lives.  They were getting the question wrong, not because they didn't understand the concept being tested, but because the question had an unintended cultural bias.  When teaching online school, I would make sure to check the assessment questions against the learning objectives for my course and make sure that they were aligned.
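In an online course, the alignment check described above could be partly automated.  The following is only a rough sketch: the objective codes (E1, E2, ...) and question IDs are made up for illustration and are not from my actual test.  It flags questions that aren't tied to an objective from the unit, and objectives that no question covers.

```python
def find_unaligned(questions, objectives):
    """Return question IDs whose tagged objective is missing or not in this unit."""
    valid = set(objectives)
    return [qid for qid, obj in questions.items() if obj not in valid]


def find_untested(questions, objectives):
    """Return objectives from the unit that no question is tied to."""
    tested = set(questions.values())
    return [obj for obj in objectives if obj not in tested]


# Hypothetical unit objectives and question tags (illustrative only).
objectives = ["E1", "E2", "E3"]
questions = {"Q1": "E1", "Q2": "E2", "Q3": None, "Q4": "E5"}

unaligned = find_unaligned(questions, objectives)  # candidates to throw out or re-tag
untested = find_untested(questions, objectives)    # objectives with no question yet

print("Unaligned questions:", unaligned)
print("Untested objectives:", untested)
```

A question that shows up as unaligned is either a candidate to throw out or a sign that the learning objectives need adjusting, which is a judgment call the script can't make for you.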

I don't often give the same test to the same student twice, so it is difficult for me to check reliability that way.  However, over the past two years, I've given this test to over 200 students.  When I look at a histogram of scores from both years I get the following:

These scores are very similar, with only a 2% change in average score.  One noticeable difference, however, is that the second year had more very low scores and more very high scores.  I can only guess that the very low scores came from students who knew it was the end of the year and that this was the last test before the final.  I let the students' final exam grade replace any test grade that quarter, so some students blow off the last test because they have so much going on with AP exams and other state-mandated tests, and then buckle down and study for the final exam, knowing the final will replace this test grade.  I also told the students that I would not make them take the final if their year grade was over 95%, as a thank you for working hard all year.  Some students were very close to this average and studied very hard for the last test, trying to bring up their year grade enough that they didn't have to take the final.  Even with the lower grades, the average score was a little higher this year.  I'd like to say this was because I improved as a teacher, but it probably wasn't that at all. :/

When teaching online classes, I would look at similar data to make sure that my tests were reliable.  If the average scores were drastically different from year to year, that could suggest that my test is not reliable.  Of course, I would need to be teaching the course long enough to be able to have some of that data.  I could also look at the histograms and compare sections with each other.  If the sections have drastically different scores, this could also indicate a lack of reliability.
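As a rough sketch of what that comparison might look like with scores exported from an online gradebook, here is a small Python example.  The score lists are invented purely for illustration (they are not my students' scores), and the 10-point bin width is just an assumption.  It compares the average score between two years (or two sections) and builds simple histogram counts for each.

```python
from statistics import mean


def score_histogram(scores, bin_width=10):
    """Count scores (0-100) per bin; a score of 100 is placed in the top bin."""
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


def mean_difference(scores_a, scores_b):
    """Average of scores_b minus average of scores_a, in percentage points."""
    return mean(scores_b) - mean(scores_a)


# Made-up percentage scores for two years of the same test (illustrative only).
year_1 = [55, 62, 70, 74, 78, 81, 85, 88, 92, 95]
year_2 = [40, 60, 72, 75, 80, 84, 88, 93, 97, 100]

hist_1 = score_histogram(year_1)
hist_2 = score_histogram(year_2)
diff = mean_difference(year_1, year_2)

print("Year 1 histogram:", hist_1)
print("Year 2 histogram:", hist_2)
print(f"Change in average score: {diff:.1f} points")
```

How big a change counts as "drastically different" is left to the teacher; the sketch only puts the numbers side by side.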

When administering a test in my classroom, I work hard at maintaining test security.  Tests are kept in a locked cabinet, students are only allowed to work on test corrections in my classroom under supervision, and I try to watch my students closely to make sure they are each doing their own work.  In the online classroom, I would set a time limit for my test, not allow students to look over their scores and answer choices until everyone had completed the test (if at all), and try to have a large test bank to choose questions from.  I especially like software that lets you pick the questions and then scrambles them and gives each student a subset, so each student has a different question set.

Getting to know your students and their work is also crucial.  For example, when I was in college, my mom wanted me to write an essay for my brother, who was in high school (no judgement, please).  I told her that I couldn't write the essay for my brother because he had been writing his own essays all year.  His teacher would know his style, common grammar mistakes, etc.  If I wrote an essay and he submitted it as his, the teacher would know that it wasn't written by my brother because it would be completely different stylistically.  In that same way, I get to know my students and can tell if they have simply improved a great deal or if something else is going on.
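To illustrate the test-bank idea mentioned above, here is a minimal sketch of how software might give each student a scrambled subset of questions.  This is not how any particular testing platform works; the function name, the seed scheme, and the question bank are all assumptions made for the example.  Seeding the random generator with the student's ID means the same student always gets the same set, which is handy if a test has to be reopened.

```python
import random


def build_exam(question_bank, num_questions, student_id, exam_seed="electricity"):
    """Pick a scrambled subset of questions that is repeatable for a given student."""
    # Hypothetical seed scheme: exam name plus student ID.
    rng = random.Random(f"{exam_seed}:{student_id}")
    # sample() picks without repeats and returns the picks in random order.
    return rng.sample(question_bank, num_questions)


# Made-up bank of 20 question IDs (illustrative only).
question_bank = [f"Q{n}" for n in range(1, 21)]

exam_a = build_exam(question_bank, 10, student_id="student_a")
exam_b = build_exam(question_bank, 10, student_id="student_b")

print("Student A:", exam_a)
print("Student B:", exam_b)
```

The larger the bank is relative to the number of questions on each test, the less overlap there will be between any two students' question sets.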