Two Score Sheet



For a long time now, I've wondered whether a two-score sheet makes sense. Now I'm pretty sure I can say that I'm disturbed by it. Why? Because it doesn't accomplish what it sets out to do: separate the technical from the ethereal.

Why do I believe it fails? I've been watching scores for a long time, but I couldn't put my finger on it. Now it seems clear. Judges do not separate the two portions of the sheets.

That's a pretty sweeping statement, huh. Well, I thought about it, then I examined the numbers from the D1 recaps this weekend. This includes Centerville, Denver, and Battle Creek. When you pull up the recaps, you'll find that the numbers are generally "as expected". But if you analyze them differently, you'll notice a striking pattern. At least to me, it's striking.

Each judge gets A + B = C for a score. There were 26 judges this weekend in those shows. Across all the scores, I compared, for each judge, whether A >= B or B >= A. When you go top to bottom, you'll find a trend: 14 out of 26 judges were either A >= B or B >= A for every corps, and several others deviated only once. EVERY corps. So four judges in Battle Creek each put down scores for twelve corps that night, and all twelve received the same ordering, A >= B or vice versa.
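The check I ran amounts to this (a sketch with made-up scores, not the actual recap numbers):

```python
# Sketch: for each judge, test whether the A/B ordering is identical
# across every corps they scored. All scores here are hypothetical.

def consistent_ordering(scores):
    """scores: list of (a, b) pairs, one per corps, for a single judge.
    True if A >= B for every corps, or B >= A for every corps."""
    return all(a >= b for a, b in scores) or all(b >= a for a, b in scores)

# One judge's (A, B) pairs across several corps (made-up numbers):
judge = [(8.5, 8.1), (7.9, 7.6), (9.0, 8.8), (6.5, 6.5)]
print(consistent_ordering(judge))  # True: A >= B every time
```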

My question is: why? Statistically, you might say the two numbers aren't related to one another, but this correlation suggests otherwise. You might say it's apples and oranges, but if that were true, the halves are disproportionately weighted. In other words, if the technical scores always come in lower, the judge is effectively applying a higher standard to technical, and disproportionately weighting the other side...

Either way, it shows me one thing. It shows me that many judges cannot separate two categories. Notice that both GE categories in all three shows had every "Rep" over every "Show". That's 25 different performances: 25 drum corps went out across those three shows, and all 25 had a higher "Rep" in Visual GE and a higher "Rep" in Music GE. That's 50 GE scores. 50 out of 50. No exceptions.

The current system seems problematic. I don't have time to dig further, but it scares me. I did check, and the last time a D1 corps had a higher Show than Rep was 11 July; Spirit went 7.8/7.9. It's hard to see how this type of overwhelming bias isn't showing a flaw of sorts. CAVEAT!! Do I think judges are doing a poor job? No, not at all. But I do think that this shows a systemic problem.

The essential problem is that the GE category, at 40% of the total points, isn't being subcategorized effectively.

Thoughts?


Well, someone is going to point out that "dirty drill, for instance, makes the design seem weaker," so the two subcaptions are related.

But I think you're right. Judges should try to stick to the sheets a lot more. There should be less cross-captioning and more focus on the actual demands of the sheets.


AGREED ON GE!

GE scores aren't averaged anywhere. The number given in both captions goes directly into the final score, giving those judges direct score weight (unlike all Performance and Ensemble captions, which are averaged).
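Assuming that's how the captions combine (averaged panels for Performance/Ensemble, GE numbers counted at face value), the difference in per-judge weight looks like this — caption names and values are hypothetical, not the actual sheet breakdown:

```python
# Sketch of the per-judge weight difference claimed above.
# All numbers are hypothetical, not an actual recap.

# Averaged caption: two judges' numbers are averaged, so each judge
# controls only half of the caption's contribution.
brass_judge_1, brass_judge_2 = 9.0, 8.5
brass_contrib = (brass_judge_1 + brass_judge_2) / 2  # 8.75

# GE caption: the single judge's number goes straight into the total,
# so that judge controls the full contribution.
ge_music_judge = 8.4
ge_contrib = ge_music_judge  # 8.4, unaveraged

print(brass_contrib, ge_contrib)  # 8.75 8.4
```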


Judges should try to stick to the sheets a lot more. There should be less cross-captioning and more focus on the actual demands of the sheets.

Sure seems that way. The distinctions, while often co-dependent, are many times nonexistent in the end result.


Well, someone is going to point out that "dirty drill, for instance, makes the design seem weaker," so the two subcaptions are related.

But I think you're right. Judges should try to stick to the sheets a lot more. There should be less cross-captioning and more focus on the actual demands of the sheets.

But then every corps was like that? To me, the problem is saying that the "dirty drill" in this example was dirty for *every* corps. I guess what I'm saying is that a GE judge gets 10% for this and 10% for that... but they end up assigning numbers that are suggestive of one another.

In other words, based on the behavior of the scores, you might conclude that judges have a habit of scoring one subcategory and then relating the next one to it... as in: OK, in A you're an 8.5, and in B I think you're about four tenths behind your potential, so B is an 8.1, giving 16.6. Or the reverse: you're executing at an 8.1, but you showed 0.4 better pizzazz, so 8.5, for 16.6.

That supposition is not conclusive; the postulate is only supported by the numbers. It's not proof, nor is it conspiracy. However, if this behavior is in fact what's happening, and the subcategory scores are related in this way, it means the box criteria are not independent: judges relate the B score to the A score so that a dependency exists. And if one score is dependent on another, the system doesn't work as designed.
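The dependency is easy to see with made-up numbers: if B is derived from A by a fixed offset, the two scores rank every corps identically, so B adds no independent discrimination.

```python
# If a judge scores B as a fixed offset from A, the two numbers carry
# only one piece of information. Made-up A scores for five corps:
a_scores = [8.5, 8.1, 7.7, 9.0, 6.9]
b_scores = [a - 0.4 for a in a_scores]  # B derived from A, not judged independently

# The ranking of corps by A and by B is then always identical:
rank_a = sorted(range(len(a_scores)), key=lambda i: a_scores[i])
rank_b = sorted(range(len(b_scores)), key=lambda i: b_scores[i])
print(rank_a == rank_b)  # True: B never reorders anyone
```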


Either way, it shows me one thing. It shows me that many judges cannot separate two categories. Notice that both GE categories in all three shows had every "Rep" over every "Show". That's 25 different performances: 25 drum corps went out across those three shows, and all 25 had a higher "Rep" in Visual GE and a higher "Rep" in Music GE. That's 50 GE scores. 50 out of 50. No exceptions.

Well, considering "Rep" evaluates what they are being asked to do, and "Show" evaluates how well they are doing it, you have to expect things to shake out that way... unless you have a poorly designed / non-challenging show being performed very well.


Quote from Drumcat: That supposition is not conclusive; the postulate is only supported by the numbers. It's not proof, nor is it conspiracy. However, if this behavior is in fact what's happening, and the subcategory scores are related in this way, it means the box criteria are not independent: judges relate the B score to the A score so that a dependency exists. And if one score is dependent on another, the system doesn't work as designed.

well said!

Edited by Storkysr

Well, considering "Rep" evaluates what they are being asked to do, and "Show" evaluates how well they are doing it, you have to expect things to shake out that way... unless you have a poorly designed / non-challenging show being performed very well.

Exactly. With my band, if I see the showmanship (well, the ISSMA equivalent) numbers higher than the rep numbers before the last week of the season, I get worried. It's a sign that the show is being "maxed out."

Exactly. With my band, if I see the showmanship (well, the ISSMA equivalent) numbers higher than the rep numbers before the last week of the season, I get worried. It's a sign that the show is being "maxed out."

I'm not sure I understand... does this mean that if one number is higher than another, it indicates total potential? In other words, it sounds like you have a repertoire score, basically a judgment of how much a judge likes the show the director picked, and then the other number is assigned based on some gauge of how well the kids are performing that repertoire?

The problem is that if the director picks a turd and the kids "polish" the turd, they should be rewarded accordingly... what you're saying is that every show's showmanship score is capped by its repertoire score. That seems counterintuitive, since showmanship shouldn't be a complete corollary of repertoire.

For example, a group should be able to go out and play the holy #### out of a Celine Dion show. The director should still be tarred and feathered, but that shouldn't affect the potential of the showmanship score.

Right?


For example, a group should be able to go out and play the holy #### out of a Celine Dion show. The director should still be tarred and feathered, but that shouldn't affect the potential of the showmanship score.

Not quite.

Don't think of it as "the show the director picked", but more like, "what the performers are being asked to perform". Theme itself doesn't much matter, except in the GE captions. Even there, it's not so much what the theme actually is, but if and how it is being conveyed.

