Strongly agree with East here. I think that consistency is needed amongst all teams, because sometimes that point could make a difference in team standings, so I think that it is incredibly important to be consistent with either option, breaking ties or not breaking ties.
EastStroudsburg13 wrote: ↑Tue Jan 12, 2021 10:07 am
If you simplify down to this, you get my position. In my opinion, you can either not break ties and have spare medals just in case, or you break all ties. I really don't like the idea of treating medalists any differently than all of the other teams competing in the event.
knightmoves wrote: ↑Mon Jan 11, 2021 7:55 pm
But unless events are going to have spare medals on hand (so you can award 2nd place to 2 teams) you have to break the tie
EastStroudsburg13 wrote: ↑Mon Jan 11, 2021 6:00 pm
I guess it's technically a choice to break the 3/4 tie and not break the 15/16 one. I just think it's a really bad one.
Musings on Test Length
- BennyTheJett
- Exalted Member 
- Posts: 454
- Joined: Thu Feb 21, 2019 2:05 pm
- Division: Grad
- Pronouns: He/Him/His
- Has thanked: 95 times
- Been thanked: 276 times
Re: Musings on Test Length
Menomonie '21 UW-Platteville '25
Division D and proud. If you want a Geology tutor hmu.
- knightmoves
- Member 
- Posts: 589
- Joined: Thu Apr 26, 2018 6:40 pm
- Has thanked: 4 times
- Been thanked: 102 times
Re: Musings on Test Length
Another thought on test length - what is the effect of Monkey noise?
Monkey noise is what I call the effect of random guessing (as by an infinite number of monkeys) on multiple choice tests. I imagine most people make sure to spend the last 30s of the test filling in answers for the questions they didn't get to (whether you do all C, or random, or whatever). So on average, if you have N left-over questions at the end of the test, you expect to score N/5 (assuming 5 answers on the multiple choice), with sigma = sqrt (4N/25). So if your very-long test has 100 extra multiple choice questions that teams guess at, you're adding random Monkey noise of +/- 4 points to the score. Which means that if two teams score within about 4 points of each other, you can't really say which one did better.
If the test isn't multiple choice, monkey noise isn't an issue, because nobody is likely to randomly guess the right answer.
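To put numbers on the monkey noise estimate above, here is a minimal Python sketch. It assumes every left-over question is an independent blind guess among equally likely choices, so the number of lucky hits is binomial. The function names `guess_noise` and `simulate_guessing` are made up for illustration.

```python
import math
import random
import statistics


def guess_noise(n_guessed, n_choices=5):
    """Mean and standard deviation of the points gained by blind guessing.

    Assumes one point per correct answer and no penalty for wrong answers.
    """
    p = 1 / n_choices
    mean = n_guessed * p                         # N/5 for five choices
    sigma = math.sqrt(n_guessed * p * (1 - p))   # sqrt(4N/25) for five choices
    return mean, sigma


def simulate_guessing(n_guessed, n_choices=5, trials=5000, seed=0):
    """Monte Carlo check: have many 'monkeys' guess and measure the spread."""
    rng = random.Random(seed)
    scores = [
        # Treat choice 0 as the correct answer for every question.
        sum(rng.randrange(n_choices) == 0 for _ in range(n_guessed))
        for _ in range(trials)
    ]
    return statistics.mean(scores), statistics.pstdev(scores)


if __name__ == "__main__":
    for n in (20, 100):
        mean, sigma = guess_noise(n)
        print(f"N={n}: expect {mean:.1f} +/- {sigma:.2f} points from guessing")
```

For N = 100 the formula gives a mean of 20 and a sigma of 4, matching the +/- 4 points quoted above; the simulation should land close to the same values.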
- Unome
- Moderator 
- Posts: 4319
- Joined: Sun Jan 26, 2014 12:48 pm
- Division: Grad
- State: GA
- Has thanked: 223 times
- Been thanked: 82 times
Re: Musings on Test Length
That's part of the reason why I phased out of writing multiple choice for the most part on my tests (that and multiple choice takes an absurd amount of time to write).
knightmoves wrote: ↑Thu Jan 14, 2021 11:42 am
Another thought on test length - what is the effect of Monkey noise?
- BennyTheJett
Re: Musings on Test Length
I just never wrote MC to begin with. If I need something like that, I've just written Fill in the Blanks.
Unome wrote: ↑Fri Jan 15, 2021 12:18 pm
That's part of the reason why I phased out of writing multiple choice for the most part on my tests (that and multiple choice takes an absurd amount of time to write).
knightmoves wrote: ↑Thu Jan 14, 2021 11:42 am
Another thought on test length - what is the effect of Monkey noise?
- knightmoves
Re: Musings on Test Length
I am told that Scilympiad encourages multiple choice questions (by auto-grading them, but not successfully auto-grading any other kind of question). It seems as though I've seen more multiple choice this year than normal.
In a paper competition, multiple choice has the advantage of being gradable by non-experts, whereas even fill-in-the-blank questions often have answers with reasonable synonyms. My preference is for multi-step calculation questions, but those are basically impossible to mark for people who aren't subject experts.
- PM2017
- Member 
- Posts: 524
- Joined: Fri Jan 20, 2017 5:02 pm
- Division: Grad
- State: CA
- Has thanked: 23 times
- Been thanked: 13 times
Re: Musings on Test Length
One solution for the MC section would be to implement a random-guess penalty, so that on average random guessing yields a score of 0. This has its own issues, though.
I think regardless that MCQs definitely have a place in scioly exams (I tend to make my tests 20-30% MC, and a smaller percentage when you weight by point values). I think especially for casual teams, MCQs are a lot more encouraging than other forms of questions. I say this despite absolutely despising writing MCQs, and being ambivalent about actually doing MCQs on tests.
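As a rough illustration of the guess penalty idea, the sketch below works out how many points a wrong answer would have to cost so that a blind guess averages zero. It assumes equally likely choices and a fixed reward per correct answer; `zero_expectation_penalty` and `expected_guess_score` are hypothetical helper names, not anything from an actual scoring system.

```python
from fractions import Fraction


def zero_expectation_penalty(n_choices, reward=1):
    """Points to deduct per wrong answer so a blind guess averages zero.

    Solves (1/k) * reward - ((k - 1)/k) * penalty = 0 for the penalty.
    """
    return Fraction(reward, n_choices - 1)


def expected_guess_score(n_choices, reward, penalty):
    """Average points from one blind guess under the given scoring."""
    p = Fraction(1, n_choices)
    return p * reward - (1 - p) * penalty


if __name__ == "__main__":
    penalty = zero_expectation_penalty(5)
    print("penalty per wrong answer:", penalty)
    print("expected score per guess:", expected_guess_score(5, 1, penalty))
```

With five choices and one point per question, the penalty comes out to a quarter point per wrong answer. This only moves the average to zero; it says nothing about how much the score still scatters.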
I'm pretty confident you will almost never see a test where there are 100 MCQs that people will randomly guess on. And if you only have maybe 20 MCQs on your exam, I think other random factors that we cannot control will be a bigger influence here. (The biggest being the choice of the specific subject matter on the exam. I know the counter-argument is to simply prepare for anything, but this is a) unrealistic for casual teams and b) still open to random chance, because it is almost certain that a competitor will not be equally comfortable with each subtopic, the exception being 0% familiarity lol.)
knightmoves wrote: ↑Thu Jan 14, 2021 11:42 am
So if your very-long test has 100 extra multiple choice questions that teams guess at, you're adding random Monkey noise of +/- 4 points to the score. Which means that if two teams score within about 4 points of each other, you can't really say which one did better.
West High '19
UC Berkeley '23
Go Bears!
- knightmoves
Re: Musings on Test Length
That doesn't reduce the noise. There are tests that do this (so you expect a monkey to score 0, rather than N/5). Typically you score 4 for a correct answer and -1 for a wrong one, but you're just scaling the binomial distribution (by a factor of 5, since the score becomes 5 × correct - N) and offsetting it - you don't reduce the width of the distribution.
You can introduce a harsher guess penalty (so you expect monkeys to get a negative score) to persuade people not to guess, which would reduce the noise because people wouldn't guess, but I was fairly sure I'd found somewhere that negative scores were against SO policy.
In some of the "very long test" discussions, we were getting close to that. And I agree that there's an element of luck in whether the ES chooses to test topics that you're good at or less good at, but that's a different kind of random. If you scored well because the questions were on your pet topics, you really did do well on that test. If you scored well because you threw seven sixes in a row at the end of the test, you were the beneficiary of pure random chance.
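The point that a zero-expectation penalty shifts the average without narrowing the spread can be checked with a short Python sketch. It assumes independent blind guesses with equally likely choices; `guess_score_stats` and its parameters are names invented for this example.

```python
import math


def guess_score_stats(n_guessed, n_choices=5, reward=1.0, penalty=0.0):
    """Mean and sigma of the total from blind guessing under a scoring rule.

    Each guess earns `reward` with probability 1/k and loses `penalty`
    otherwise, so the total is (reward + penalty) * correct - penalty * N.
    """
    p = 1 / n_choices
    mean = n_guessed * (p * reward - (1 - p) * penalty)
    # Scaling a binomial count by (reward + penalty) scales sigma the same way.
    sigma = (reward + penalty) * math.sqrt(n_guessed * p * (1 - p))
    return mean, sigma


if __name__ == "__main__":
    plain = guess_score_stats(100)                       # +1 / 0 scoring
    penalized = guess_score_stats(100, reward=4, penalty=1)  # +4 / -1 scoring
    print("plain     mean, sigma:", plain)
    print("penalized mean, sigma:", penalized)
    # Express each sigma in units of one correct answer for a fair comparison.
    print("sigma in questions:", plain[1] / 1, "vs", penalized[1] / 4)
```

For 100 guessed questions, +1/0 scoring gives a mean of 20 and a sigma of 4, while +4/-1 scoring gives a mean of 0 and a sigma of 20. Measured in units of one correct answer, that is 5 questions' worth of scatter instead of 4, so under these assumptions the penalty does not make the noise any smaller.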
- BennyTheJett
Re: Musings on Test Length
tHiS Is WhY wE nEeD VeNn DiAgRaMs
knightmoves wrote: ↑Fri Jan 15, 2021 5:54 pm
That doesn't reduce the noise. There are tests that do this (so you expect a monkey to score 0, rather than N/5). Typically you score 4 for a correct answer and -1 for a wrong one, but you're just scaling the binomial distribution (by a factor of 5, since the score becomes 5 × correct - N) and offsetting it - you don't reduce the width of the distribution.