For each team, take the absolute difference between its overall team placement and its placement in a given event. Average those differences over every team competing in that event, and you obtain one value per event: 23 values for the 23 events.
Example: Troy HS gets 10th in Anat, 3rd in Boomi, and 1st overall; Cumberland Valley gets 20th in Anat, 1st in Boomi, and 2nd overall.
|10-1| (Troy's Anat) and |20-2| (CV's Anat) would be averaged together (and with more teams' scores for this event, if more teams were present)
|3-1| (Troy's Boomi) and |1-2| (CV's Boomi) would be averaged together (and with more teams...)
And this would be repeated for all 23 events.
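For anyone who wants to reproduce this, here is a minimal Python sketch of the calculation using the two-team example above; the dictionary layout (overall placement plus per-event placements) is just an assumed format for illustration.
[code]
# Minimal sketch of the metric: for each event, average the absolute difference
# between a team's event placement and its overall team placement.
# The data layout is an assumption, filled with the example numbers above.
results = {
    "Troy HS":           {"team": 1, "events": {"Anatomy": 10, "Boomi": 3}},
    "Cumberland Valley": {"team": 2, "events": {"Anatomy": 20, "Boomi": 1}},
}

def average_differences(results):
    """Average |event placement - team placement| over all teams, per event."""
    diffs = {}
    for data in results.values():
        for event, place in data["events"].items():
            diffs.setdefault(event, []).append(abs(place - data["team"]))
    return {event: sum(d) / len(d) for event, d in diffs.items()}

print(average_differences(results))
# {'Anatomy': 13.5, 'Boomi': 1.5}
[/code]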
You end up with a set of values similar to this:
Anatomy 6.9844
Astronomy 8.4375
Boomi 9.5000
Chem Lab 9.1094
Circuit 8.1875
Code 6.4531
Designer 8.9219
Detector 11.3281
Disease 8.2344
Dynamic 8.1563
Exdesi 7.3281
Forensics 8.1250
Fossils 7.9375
Geomapping 8.5625
Gravity 12.6250
Machines 8.5625
Orni 6.9063
PPP 11.3125
Protein 8.0625
Sounds 9.2656
Water 6.0625
Wright 10.4375
WIDI 11.2031
This shows how closely event placement tracks team placement - a lower value means a stronger relationship - and, by extension, how well the test is written or how well the build event is run.
(Shown above are values for MIT).
Unsurprisingly, values such as WIDI and the builds are high - these are events people treat as a joke, or events that rely heavily on RNG. Events such as Anat, however, track team placement much more closely: on average, the difference between a team's Anat place and its team place is only about 7.
You could divide these values by the number of teams present (in this case, 66) to obtain percentages, so they can be compared with other competitions' values.
Then go through and average the Anat, Astro, Boomi... values across all competitions, obtaining an average and standard deviation for each event.
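A rough sketch of those two steps (normalizing by team count, then averaging across competitions) might look like the following; the MIT numbers come from the list above, but the Solon and GGSO team counts and values are invented placeholders.
[code]
# Sketch of the normalization and cross-competition aggregation.
# Only the MIT values are real; the other entries are made-up placeholders.
from statistics import mean, stdev

# Per-competition data: (number of teams, {event: average placement difference})
competitions = {
    "MIT":   (66, {"Anatomy": 6.9844, "WIDI": 11.2031}),
    "Solon": (60, {"Anatomy": 7.3,    "WIDI": 10.8}),    # placeholder numbers
    "GGSO":  (55, {"Anatomy": 6.6,    "WIDI": 9.9}),     # placeholder numbers
}

# Divide by the number of teams so competitions of different sizes are comparable.
normalized = {}
for n_teams, event_avgs in competitions.values():
    for event, avg in event_avgs.items():
        normalized.setdefault(event, []).append(avg / n_teams)

# Average and standard deviation of each event across competitions.
summary = {event: (mean(vals), stdev(vals)) for event, vals in normalized.items()}
for event, (m, s) in sorted(summary.items()):
    print(f"{event}: mean = {m:.4f}, stdev = {s:.4f}")
[/code]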
Data Sheet: https://docs.google.com/spreadsheets/d/ ... sp=sharing
I have highlighted in red the events that were more than one standard deviation above the average, and in green the events that were run best out of the 9 competitions analyzed.
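If I am reading the highlighting rule correctly (an assumption on my part), flagging red amounts to checking whether a competition's normalized value for an event sits more than one standard deviation above that event's cross-competition average - roughly like this, with made-up numbers:
[code]
# Hypothetical flagging rule: a value more than one standard deviation above the
# event's cross-competition average gets highlighted red. All numbers are made up.
summary = {"Anatomy": (0.110, 0.010), "WIDI": (0.170, 0.015)}  # event: (mean, stdev)
one_competition = {"Anatomy": 0.106, "WIDI": 0.190}            # normalized values

flagged_red = [event for event, value in one_competition.items()
               if value > summary[event][0] + summary[event][1]]
print(flagged_red)  # ['WIDI']
[/code]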
I chose these 9 competitions because I wanted to compile and parse the data for all of the 50+ team competitions. However, the Mentor spreadsheet was in a poor format, Wisconsin lets teams drop events (they drop 5), and Fairfax had a great deal of scoring issues, so I dropped those as well.
Takeaways from the data (https://docs.google.com/spreadsheets/d/ ... sp=sharing):
Early-season competitions tend to show weaker correlations (higher values). Obviously, this can be attributed to new events and a lack of time to prepare.
Harvard and Palatine either had weak teams or bad tests. At Harvard, 42 of the 59 teams that attended no-showed at least one event, so the poor values can be attributed to not-so-great teams. Palatine was similar - 28/56 (half) of the teams no-showed at least one event.
MIT tests are very, very high quality. Many of them had extremely low average differences between team placement and event placement, even before considering that it is one of the largest invitationals in the nation, with 66 teams. Even in builds such as Boomi, the low numbers can be attributed to a well-run event and setup (@bernard). As often as people argue over whether GGSO or MIT is the better competition, it's clear that MIT is a lot better team-wise and test-wise. (East coast beast coast) /s
By no means is this an accurate representation of invitational quality - this just gives a basic idea of how well written invitational tests are, and how good the teams are.
The 2nd tab, a sheet named "All Events", ranks the 23 events by their averages across the 9 selected competitions.
As expected, PPP, WIDI, Gravity, Wright, Boomi, and Detector are all at the bottom, with a significant gap from the events above them. Pure study events, such as Dynamic, are clumped together at the top. It's easy to spot whose tests are good (I wonder who wrote Fossils for Solon?)

Thoughts?