syo_astro wrote:
I'll preface this by saying that I didn't take the test, so I'm not saying strictly "you're wrong" or aiming my issues directly at you (or whoever wrote the test, sorry I'm not sure >.>), but I see room for some healthy argument based on what you've said. Point by point:
-Just because teams don't finish the nationals exam doesn't mean teams at an invite should have difficulty finishing. They're different tournaments. Say an invite is practice for a regional or state competition for some (even many) teams. Does that mean they should get floored with a national-level exam? It wouldn't make the exam useless (I don't think / hope nobody thinks that), but it does limit the test's usefulness.
-Just because a test is long doesn't mean it helps separate scores. Again, I'm not trying to knock you specifically; I'm sure you put in a variety of question difficulties and such. But some teams definitely get floored if they see a bunch of difficult questions about material they've never seen. Let me put it this way: if students don't even answer questions, then you can't really evaluate them on those questions, which is part of why sheer length doesn't help separate scores so much as being "long enough" (accounting for question difficulty, of course). Science Olympiad test taking is also different from regular test taking because you have a partner, so I don't see that as a relevant goal to shoot for... I always saw it more as tackling a big project and learning to split the work; it's definitely non-standard.
-So this is an interesting point and one I've been thinking about more recently (looking at tests from the "learning side" and not just as "diagnostics")! I have read that difficult tests do help learning (that's right, I learn about learning, I'm a nerd :P), but I think it depends on A LOT of factors. One of the main reasons I've read is that students have to prepare more broadly for such tests. I've also read that it's important to "make mistakes and learn from them". Usually, students are supposed to give a serious effort on a question or a test first and then learn from those mistakes, but I'm not sure how that applies if you can't even attempt the questions in the first place. This part is definitely an opinion, but I don't think getting students to make mistakes needs to be taken TOO far, or even done all the time, to aid their learning. It's quite the balancing act; it's been interesting to read about!
As for the last point, no comment, good to hear. Anyway, probably a different thread for this, apologies for the interruption.
Edit: One extra point I see a lot, based on the last post, is that tests should involve challenges or critical analysis similar to doing "real science". While I agree that would be great, practically it can be difficult to execute on every question, and I have yet to read about how those questions actually do on the diagnostic / learning side of things (I'll get to it soon enough :P). It can sometimes even be difficult to see how a question on a test relates to... well, "real science". And again, there's a diversity of teams competing anyway, and some don't compete as intensely... just stuff to keep in mind.
Sorry to quote so much...
The SOUP disease exam was not incredibly difficult; it was actually filled with typical regional/state-level material and only some national-level material... especially given that in disease, the big difference between nationals and states is the amount of common-sense understanding of public health in general that you need to have. I know you weren't directing comments at this exam, but let me defend it because I appreciated the test. For the record, I agree with a lot of what you said in general. However, your comments were relatively abstract, which makes them hard to apply to specific tests... I'm trying to break them down a bit in this post.
There are four reasons that I believe people go to invites: (1) for fun, (2) for practice taking an exam in realistic settings, (3) for the content of the practice exam itself, and (4) for evaluation of where you are. I think the first one is largely independent of the actual tests. The disease test at SOUP was certainly good practice at taking a realistic exam; it was a mix of case studies and general knowledge, and the length made test-taking strategies important. I agree with you, though, that if a test is super hard and 'floors' competitors, it's not so realistic/useful for regionals/states. But much of the test was not that difficult. I think that if you moved at the speed expected of someone who couldn't answer the hardest questions, you should have had no problem finding enough easy questions to fill the exam period, with basic test-taking strategies.
Moving on to (3): as you said, there's a lot to be said for an exam you can take home and learn from. The knowledge needed for the test was not super difficult, except for 3-5 scattered questions worth a small fraction of total points. This is a little specific to the event of Disease Detectives, but most hard points don't come from obscure knowledge; they come from conceptual questions. Tests like this one asked a lot of conceptual questions that you learn to answer easily only by having been in similar situations many times before. For example (not specifically on this test, but questions like this come up on regionals, states, nationals, and SOUP tests): "What measures would you first take to stop the spread of outbreak X in this restaurant?" This is the 'hard' part of disease; that's why preparation for the event emphasizes so much practice as opposed to textbook knowledge.
That raises the question of evaluation. Is it fair to evaluate all teams on a test so difficult that they don't finish? I agree that saying 'you won't finish nationals' is not a fair argument, because many teams aren't planning to go to nationals (although many are, and SOUP has some discretion here... they used nationals rules for Mousetrap, which I did not agree with at all because most vehicles weren't prepared). My defense of the length, though, is centered on the idea of fluency. The event is designed to evaluate how proficient competitors are at really understanding public health and its challenges and approaches; it's a test of your fluency with the concepts at hand. Behavioral studies tend to show that fluent behavior is not only accurate but also fast. Someone is fluent in public health knowledge if they take 5 seconds to recognize that mosquito nets should be used to prevent malaria. If they take 5 minutes to come to the same conclusion, then they probably haven't thought about the situation before seeing it on this test, and a lot of public health really is about pattern recognition, i.e., fluency. While a team moving slower will still get points for their accuracy, I believe an ideal test should be just barely finishable by a top national team, because that way not all the really good teams will finish, and their fluency with the material can be compared. In my experience with the exam, an ideal set of partners could finish it... we came tantalizingly close.
Best,
Sam
EDIT: I said nothing about the quality of the case studies themselves, which I think make or break a disease test; a good test has realistic, plausible outbreak investigations. This test was very good quality, although the first section maybe could have handled some of the ecological study questions better (9/10 for content).