When I arrived at my current school a decade ago, there was no definitive measure of a student’s ability to read. That may be hard to conceive in this data-drives-instruction landscape (although in education you can find plenty of instances where a lot of data will not tell you the most basic things about students), but the feeling was that teachers knew the students, and the word was passed along as they moved from grade to grade.
In November of that first year, I found that I had a student who did not know alphabetical order. I had asked her to look something up in the dictionary; she faked it for a few minutes and then threw a fit that distracted us from the whole enterprise. Fortunately, my aide noticed the faking and followed up. This student was a known non-reader, but no one knew she was reading at a second grade level. How, after all, could anyone have guessed she was reading so far below grade level? She carried around grade-level-appropriate books and sat quietly during SSR. When pressed, she created a scene of distraction. In trying to fit in, she slipped through the cracks. She is exactly why we have and use data today.
What is the DRP? (Skip Down for Discussion on Validity)
At a previous job, we had used the Degree of Reading Power, or DRP, on 10th graders. Created by Questar, the DRP measures reading comprehension. In the assessment, a passage is provided with certain words removed. Students are asked to fill in each blank from a selection of five words. The 7th grade test is 63 questions, while the 10th grade test was 110 when I gave it years ago (I doubt it’s changed). The questions start out easy and get progressively harder as the student goes on.
We use bubble sheets–it’s that dull. But, in adopting the DRP, we have a screening tool that lets us question who has mastered the basics of reading. From that baseline, we ask follow-up questions, plus have students write reading journals and answer prompts to measure understanding and deeper meaning throughout the year. For the hour we put into it, we get what we need out of it.
The DRP also provides some good data. From the raw score, Questar gives you an Independent Reading Score, or I90. The I90 indicates the level of a book a student can independently read without problems or assistance. So, Harry Potter and the Sorcerer’s Stone is ranked a 56, meaning a student who scores an I90 of 56 should be able to read it unassisted (this does not take into account cultural literacy or maturity, which is why caution should be used on Of Mice and Men’s “easier” score of 53). It also offers I80 and I70 scores, which indicate increasing support offered for understanding, plus a “frustration level”–the point where a student might throw the textbook across the room. Questar also ranks a student’s score against the nation, providing national percentiles and stanines. I’ve never asked what database they get this information from (is it against other users of the DRP, or larger pools?), or if it updates every year, but it’s a larger sample size nonetheless.
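The book-matching idea is just a threshold lookup: a title is an independent read when its DRP value is at or below the student’s I90. A minimal sketch, using the two DRP values mentioned above (the filtering rule and function name are my own invention, not Questar’s):

```python
# Hypothetical sketch of matching students to books by I90.
# The two DRP values come from the post; everything else is illustrative.

book_drp = {
    "Harry Potter and the Sorcerer's Stone": 56,
    "Of Mice and Men": 53,
}

def independent_reads(i90, catalog):
    """Books a student should be able to read unassisted (DRP <= I90)."""
    return sorted(title for title, drp in catalog.items() if drp <= i90)

print(independent_reads(54, book_drp))  # ['Of Mice and Men']
```

The same lookup against an I80 or I70 threshold would list books the student could handle with support, which is roughly what Questar’s printed directory let you do by hand.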
When I first did this, there was a booklet filled with tables that converted all of this for you, but they later came out with a database for the computer. The company also provided a directory of popular classroom texts and their DRP values, so you could match students with books. None of these CD-ROMs ever really worked on our computers. Questar seemed locked in the 1970s. The online information today smells like a dying company or division being run out of habit, where each year someone has an idea to update stuff but never to revamp the entire test for the NCLB age. Even the name Questar sounds like one of the lesser computers of the early ’80s competing with Tandy and Commodore. I think they know what they have and keep plugging.
The neatest thing about the DRP, though, is that the I90 score measures across grades. You can compare an I90 score from the 2nd grade test with an I90 from the 7th grade test. So, if a student scores a 43 in 2nd grade and a 45 in 7th grade, you know they have not progressed over five years of schooling. You can also give a 7th grader who reads poorly a 5th grade test, and they will not hit their frustration point until much deeper in, providing a more accurate result. In the end, I like to measure growth. The DRP is great for that.
What Does the DRP Really Measure?
Every September we give the DRP to our 7th grade, and every May we give it again. Because of the design, we can measure I90 growth over the year. We can also measure it against their 6th grade result. If we set the 6th grade spring results against the 7th grade fall, we are able to measure gain (or loss) over the summer. We can do the same thing when we measure the same kids in 8th grade.
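Because the I90 scale is the same across administrations, the growth math is simple subtraction between sittings. A minimal sketch with invented scores for one hypothetical student:

```python
# Hypothetical sketch: measuring I90 change across administrations.
# Scores are invented for illustration; the scale is shared across grades,
# so differences between sittings are directly comparable.

scores = {
    ("6th", "spring"): 48,
    ("7th", "fall"): 45,
    ("7th", "spring"): 52,
}

# Fall score minus the previous spring = change over the summer
summer_change = scores[("7th", "fall")] - scores[("6th", "spring")]

# Spring score minus fall = growth over the school year
year_growth = scores[("7th", "spring")] - scores[("7th", "fall")]

print(f"Summer change: {summer_change:+d}")       # Summer change: -3
print(f"School-year growth: {year_growth:+d}")    # School-year growth: +7
```

This hypothetical student slid three points over the summer and gained seven over the year, which is exactly the kind of pattern the September/May pairing is meant to surface.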
But the DRP is dull. And, remember, the questions start out easy and get progressively harder as the student goes on. For some students, the first hard question throws them. Then, they just color in dots. One way Questar makes money is that they sell their bubble sheets and then correct them for you (and put the results on a disk, ready for manipulation). Instead of paying for that, we took our answer key and made an overlay (an overhead transparency sent through the photocopier). By correcting them ourselves, we can see where kids give up from a series of wrong answers.
Which leads us to the question we’ve wondered about for a while: Does the DRP measure reading comprehension or stamina?
To answer that question, last September I broke up the 63-question DRP I usually give my 7th graders into three parts of 21 questions each. Then, I measured growth (or not) against their 6th grade scores from the previous May. In the end, nothing significant showed up except that one group got better at reading over the summer: over-scheduled kids. I had read that high-achieving middle-class kids who participate in a lot of activities–soccer, music, the school play–cannot find time to read during the school year. My data showed that, but nothing about stamina. In fact, the ups and downs over the summer made little sense.
But discoveries often happen by accident. This May, I went back to the old administration of the test–63 questions in one sitting. Our 8th graders were taking an NCLB-mandated Science assessment, so I used that time to give the 7th graders the DRP. Because the 8th graders were monopolizing our aides and classrooms, I set the 7th graders up in the cafeteria while the kitchen staff were whipping up lunch. My hope was that the blowers and bacon smell would be white noise and calming as the students and DRP assessments spread out across the antiseptic tables in the grey room. Some finished quickly, while others lingered over an hour.
The results were not inspiring. I had been unhappy with my reading program–I’m unhappy every year with both my reading and writing programs, but this year I now had weak data to prove it. I uploaded my scores into a spreadsheet, looked at growth, ranked and sorted. The high kids stayed high, and the middle kids stayed in the middle, with a few growing or dropping a bit. Even that assessment is a bit inflated, if I’m honest. It was not a good year.
Then there were the kids at the bottom. About ten students had dropped between ten and thirty points over the school year (on a scale topping out at 80). This was significant. Our entire Tier II intervention placement was based on these scores. Several students who had moved out of Tier II were looking at returning in 8th grade. Those receiving Tier II were seeing regression. What, I wondered, was I doing wrong? (I had ideas, and they started with sacrificing SSR time for any distraction that came down the pipe.)
In looking at the names of the students, I realized that those who either had a diagnosis of ADD or ADHD, or whom we suspected of having ADD or ADHD, had tanked. Our literacy group had often wondered to what extent the DRP was a test of stamina as much as it tested reading.
In looking at their answer sheets, I noticed that around the 20th question these students began to get questions wrong. Not just a few as the questions got harder, but a string wrong and then another string wrong. The strings, I suspected, were the tell: even a student who is purely guessing will get some answers correct by chance (about one in five, with five choices), so a long unbroken run of wrong answers meant they had given up and were just filling in bubbles. Bad data.
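The probability argument can be made concrete. With five choices, a pure guesser misses any one question with probability 4/5, so the chance of missing n in a row is (4/5)^n–small for long runs. A sketch of that give-up signal, with an invented answer pattern (1 = right, 0 = wrong):

```python
# Hypothetical sketch of the give-up signal seen on the answer sheets.
# With five answer choices, a pure guesser still gets ~1 in 5 right,
# so a long unbroken run of wrong answers looks less like guessing
# and more like quitting. The answer pattern is invented for illustration.

def longest_wrong_run(pattern):
    """Length of the longest consecutive run of wrong answers."""
    longest = current = 0
    for answered_right in pattern:
        current = 0 if answered_right else current + 1
        longest = max(longest, current)
    return longest

def p_all_wrong(n):
    """Chance a pure guesser misses n questions in a row: (4/5)^n."""
    return (4 / 5) ** n

# Strong start around question 20, then a wall of wrong answers.
pattern = [1, 1, 1, 0, 1, 1, 0, 1] + [0] * 12
run = longest_wrong_run(pattern)
print(run, round(p_all_wrong(run), 3))  # 12 0.069
```

A run of twelve straight misses would happen by chance less than 7% of the time even for a student guessing blindly, which is why those strings read as “stopped trying” rather than “couldn’t read.”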
The next week, I had a few of these students redo questions 22 through 42. They were placed in a quiet room in two groups of four. I explained my belief in them and personally appealed to their sense of pride and their control over the environment. In short, I was trying to get them to focus on the task while setting up an environment that fostered focus. Six of the eight did significantly better, from 5 to 12 questions better. When I had them redo the last twenty questions, I saw the same results. Five of the students went from the 4th or 5th stanine for reading nationally to the 8th.
Of the two students who showed little improvement, one is not ADD or ADHD. The other is suspected of ADHD and was even more hyperactive than during the first administration, and openly hostile to the retake. Either the rest had learned to read in a week, or I had been measuring stamina before.
Why does it matter beyond the one assessment? Our school uses the DRP data to decide who gets Tier II help and who has “graduated” to Tier I. Tier II instruction happens against World Language, so it can be a reward or punishment depending on the family. There is some pressure for students to be taking a World Language (often from their parents), or a desire by students to “flunk” into Tier II so they can a) avoid the hard work of learning French and b) be with their Tier II friends. These numbers weigh heavily in the court of “what’s best for the child”.
It also matters for how we take other, higher stakes assessments. For their NCLB assessment, Vermont uses the SBAC. Entirely online, students have a lot of control–if they choose to use it. Those who click through quickly and take a long break find those answers locked when they return. They can, though, slowly work through a small number, break, then return for a few more. This is different from past assessments, which means we need to retrain and empower students. These results tell us that we need to instruct some students in how to take a test–an instruction that is tailored to individual students and different from just attacking the questions themselves. The results also tell us we need to create a different environment–one in which students can move about without disturbing others, and are less tied to a clock.
All of this leads me to a more outlandish proposition that I am still thinking about: Our school uses the DRP to measure where students are, but I’d like assessment to be more predictive about potential. Why? Because when an assessment just measures, I find the school’s reaction is to address what they think it measures. So, those who tank the DRP get put into the standard Tier II reading program. But if we can measure elements that go into that measure–like stamina–it gives us a better idea of what to address. The potential is there. The fix, then, might be more around Habits of Mind than more phonics. At present, we are not sure.
Of course, our support services respond by offering more assessments. But that is often guesswork and time-consuming. If a cafeteria with bacon wafting through the room is not conducive to results, I cannot imagine the forty minutes a special educator can give me to do a “quick” BES is much better. And the coordinator who battered kids with an AIMS-Web in a noisy hallway (the only space available) produced little that was useful. And, if anything is found, the student is often dumped into a program with a promise that “we’ll work on that” when they have time after the reading instruction is done. No one has time. In identifying causes, we might find the solution can be had with greater efficiency.
My hope is for assessments that are more predictive, and that can be done by empowering students. When students value the assessment and understand the consequences of their choices, they own it. When we give them the tools to do their best work, they use them. In the end, the measure becomes about reading.
Then we’ll have to find another test for stamina.