For each team, take the absolute difference between its overall team placement and its placement in a given event. Average those differences over every team competing in that event, and you obtain one value per event: 23 values for the 23 events.
Example: Troy HS gets 10th in Anat, 3rd in Boomi, and 1st overall; Cumberland Valley gets 20th in Anat, 1st in Boomi, and 2nd overall.
|10-1| (Troy's Anat) and |20-2| (CV's Anat) would be averaged together (and with more teams' scores for this event, if more teams were present)
|3-1| (Troy's Boomi) and |1-2| (CV's Boomi) would be averaged together (and with more teams...)
And this would be repeated for all 23 events.
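For anyone who wants to reproduce this, here is a minimal Python sketch of the calculation using the two-team example above; the dictionary layout (overall placement plus per-event placements) is just an assumed format for illustration.
[code]
# Minimal sketch of the metric: for each event, average the absolute difference
# between a team's event placement and its overall team placement.
# The data layout is an assumption, filled with the example numbers above.
results = {
    "Troy HS":           {"team": 1, "events": {"Anatomy": 10, "Boomi": 3}},
    "Cumberland Valley": {"team": 2, "events": {"Anatomy": 20, "Boomi": 1}},
}

def average_differences(results):
    """Average |event placement - team placement| over all teams, per event."""
    diffs = {}
    for data in results.values():
        for event, place in data["events"].items():
            diffs.setdefault(event, []).append(abs(place - data["team"]))
    return {event: sum(d) / len(d) for event, d in diffs.items()}

print(average_differences(results))
# {'Anatomy': 13.5, 'Boomi': 1.5}
[/code]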
You end up with a set of values similar to this:
Anatomy 6.9844
Astronomy 8.4375
Boomi 9.5000
Chem Lab 9.1094
Circuit 8.1875
Code 6.4531
Designer 8.9219
Detector 11.3281
Disease 8.2344
Dynamic 8.1563
Exdesi 7.3281
Forensics 8.1250
Fossils 7.9375
Geomapping 8.5625
Gravity 12.6250
Machines 8.5625
Orni 6.9063
PPP 11.3125
Protein 8.0625
Sounds 9.2656
Water 6.0625
Wright 10.4375
WIDI 11.2031
This shows how closely event placement tracks team placement - a lower value means a stronger relationship - and, by extension, how well the test is written or how well the build event is run.
(Shown above are values for MIT).
Unsurprisingly, values such as WIDI and the builds are high - these are events people treat as a joke, or events that rely heavily on RNG. Events such as Anat, however, track team placement much more closely: on average, the difference between a team's Anat place and its team place is only about 7.
You could divide these values by the number of teams present (in this case, 66) to obtain percentages, so they can be compared with other competitions' values.
Then go through and average the Anat, Astro, Boomi... values across all competitions, obtaining an average and standard deviation for each event.
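A rough sketch of those two steps (normalizing by team count, then averaging across competitions) might look like the following; the MIT numbers come from the list above, but the Solon and GGSO team counts and values are invented placeholders.
[code]
# Sketch of the normalization and cross-competition aggregation.
# Only the MIT values are real; the other entries are made-up placeholders.
from statistics import mean, stdev

# Per-competition data: (number of teams, {event: average placement difference})
competitions = {
    "MIT":   (66, {"Anatomy": 6.9844, "WIDI": 11.2031}),
    "Solon": (60, {"Anatomy": 7.3,    "WIDI": 10.8}),    # placeholder numbers
    "GGSO":  (55, {"Anatomy": 6.6,    "WIDI": 9.9}),     # placeholder numbers
}

# Divide by the number of teams so competitions of different sizes are comparable.
normalized = {}
for n_teams, event_avgs in competitions.values():
    for event, avg in event_avgs.items():
        normalized.setdefault(event, []).append(avg / n_teams)

# Average and standard deviation of each event across competitions.
summary = {event: (mean(vals), stdev(vals)) for event, vals in normalized.items()}
for event, (m, s) in sorted(summary.items()):
    print(f"{event}: mean = {m:.4f}, stdev = {s:.4f}")
[/code]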
Data Sheet: https://docs.google.com/spreadsheets/d/ ... sp=sharing
I have highlighted in red the events that were more than one standard deviation above the average, and in green the events that were run best out of the 9 competitions analyzed.
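If I am reading the highlighting rule correctly (an assumption on my part), flagging red amounts to checking whether a competition's normalized value for an event sits more than one standard deviation above that event's cross-competition average - roughly like this, with made-up numbers:
[code]
# Hypothetical flagging rule: a value more than one standard deviation above the
# event's cross-competition average gets highlighted red. All numbers are made up.
summary = {"Anatomy": (0.110, 0.010), "WIDI": (0.170, 0.015)}  # event: (mean, stdev)
one_competition = {"Anatomy": 0.106, "WIDI": 0.190}            # normalized values

flagged_red = [event for event, value in one_competition.items()
               if value > summary[event][0] + summary[event][1]]
print(flagged_red)  # ['WIDI']
[/code]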
I chose these 9 competitions because I wanted to compile and parse the data for all of the 50+ team competitions. However, the Mentor spreadsheet was in a poor format, Wisconsin lets teams drop events (they drop 5), and Fairfax had a great deal of scoring issues, so I dropped those as well.
Takeaways from the data (https://docs.google.com/spreadsheets/d/ ... sp=sharing):
Early-season competitions tend to show weaker correlations (higher values). Obviously, this can be attributed to new events and a lack of time to prepare.
Harvard and Palatine either had weak teams or bad tests. At Harvard, 42 of the 59 teams that attended no-showed at least one event, so the poor values can be attributed to not-so-great teams. Palatine was similar - 28/56 (half) of the teams no-showed at least one event.
MIT tests are very, very high quality. Many of them had extremely low average differences between team placement and event placement, even before considering that it is one of the largest invitationals in the nation, with 66 teams. Even in builds such as Boomi, the low numbers can be attributed to a well-run event and setup (@bernard). As often as people argue over whether GGSO or MIT is the better competition, it's clear that MIT is a lot better team-wise and test-wise. (East coast beast coast) /s
By no means is this an accurate representation of invitational quality - this just gives a basic idea of how well written invitational tests are, and how good the teams are.
The 2nd tab, a sheet named "All Events", ranks the 23 events by their averages across the 9 selected competitions.
As expected, PPP, WIDI, Gravity, Wright, Boomi, and Detector are all at the bottom, with a significant gap from the events above them. Pure study events, such as Dynamic, are clumped together at the top. It's easy to spot whose tests are good (I wonder who wrote Fossils for Solon?)

Thoughts?