Tales from the Testing Cycle: On Testing Methodology.

Everyone’s got their own particular approach to the process of list creation and list testing. Thinking about it today I realised mine is (shock! surprise!) heavily influenced by how I do research. I have a strong thread of Radical Behaviourism* in my research past, and as such, I’m a little distrustful of theory. I’m a data guy.


I indulge in Theorymachine all the time. It’s one of the joys of the hobby, after all. But when we’re talking tournament prep (and I usually am), I focus entirely on what works. In research and Warmachine both, I’m a functional contextualist – Truth is not absolute. Truth is simply what works, in context. The goal of science (and list testing) is predicting and influencing outcomes. In testing lists for tournaments, your desired outcome is Winning All The Games. The point of data is to inform your efforts to do just that.

This philosophy leads me to my core principles of list building/testing:

1) Be clear about your goals and gameplan.

The long-term goal of The Testing Cycle is to produce a list (or lists) that will win you all of your games at a tournament. The short-term goal, however, is to produce actionable information that serves that long-term goal. Knowing a list “doesn’t quite work” is slightly useful, but what you really want is specific information to guide your decisions about what to change in the list and in the way you play it. Also important in this phase: figuring out which matchups your list needs to take.

2) Start simple.

By “simple” I mean “have a plan, and execute that plan in as straightforward a fashion as possible”. Especially in the early stages of testing, every component of a list should contribute directly towards your primary gameplan. Put another way, in the early stages, your primary gameplan should be the only gameplan (planning to jam in and win on scenario with eDeneghra? Bring double raiders and go all in on it).

Will this win all of your games? Very rarely. That’s not the point of this phase of testing. If you want good, clean, actionable data, then you need to start with fewer possible confounds. Want to find out if Mockery of Life restoring Vengeance units is a viable gameplan? Bring all of the Bane Knights. If you lose all your games, this tells you that that plan alone isn’t the way to make Goreshade3 sing. If you win games due to overwhelming attrition advantage and the attacks generated from Vengeance? Then you’re onto something. Most likely, you’ll win some and lose some, which allows you to identify contexts in which that plan is weaker, and begin to plan list changes based on that.

3) Don’t lean on theory. Lean on data.

This approach will mean you start with very one dimensional lists. This is intentional. You can often look at a one dimensional list and see immediately, based on theory, what problems it will have (see file: Gunlines crossref: scenario). However, theoretical assumptions can be blinkering. If you’ve got the time to test (I know we don’t all always have that, and that’s where solid theory comes in), then it’s preferable to actually test it and get data. See where that one dimensional list runs into problems in the wild. Exactly how those weaknesses play out on the table can be enlightening (and can give you interesting and informative lessons in playing around weaknesses).

4) Change one variable at a time.

Then, you take your one dimensional, extreme build list and tack towards a more balanced build. Introduce some elements that support different win conditions, or that shore up the weaknesses of the gameplan you’ve focused on.

An important thing here is to change in one direction at a time. If you bring in two changes at once, then you’ve muddied your data as to why your results are changing. Slow and steady wins the testing race.
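If you keep your list versions written down, a few lines of Python can flag when you have accidentally changed more than one thing between test rounds. This is only a rough sketch: the entry names other than Bane Knights are placeholders, the function names are made up for illustration, and "one change" is assumed here to mean at most one entry out and one entry in, which you may well want to define differently.

```python
from collections import Counter


def list_changes(old_list, new_list):
    """Return (removed, added) Counters describing how new_list differs from old_list."""
    old, new = Counter(old_list), Counter(new_list)
    removed = old - new  # entries that appear fewer times in the new version
    added = new - old    # entries that appear more times in the new version
    return removed, added


def is_single_change(old_list, new_list):
    """True if at most one entry left the list and at most one entered it.

    Assumption: this is one possible reading of "one variable at a time".
    """
    removed, added = list_changes(old_list, new_list)
    changed = bool(removed or added)
    return changed and sum(removed.values()) <= 1 and sum(added.values()) <= 1


# Placeholder list versions, purely for illustration.
version_1 = ["Bane Knights", "Bane Knights", "Placeholder Unit A"]
version_2 = ["Bane Knights", "Placeholder Solo B", "Placeholder Unit A"]

removed, added = list_changes(version_1, version_2)
print("Out:", dict(removed))
print("In:", dict(added))
print("Single change?", is_single_change(version_1, version_2))
```

The check does not know about points costs, so a swap that forces a second adjustment to stay legal will be flagged; that is arguably the right outcome, since it is exactly the case where your data gets muddier.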

5) Replicate.

Replicate your results. Five is a good number for each matchup.
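Keeping a simple game log makes it easy to see which matchups still need more reps. The sketch below assumes a log of (matchup, won) pairs and uses five games per matchup as the target; the matchup names and function names are hypothetical.

```python
from collections import defaultdict

TARGET_GAMES = 5  # games per matchup suggested above


def summarise(games):
    """Tally a log of (matchup, won) pairs into {matchup: (wins, played)}."""
    tally = defaultdict(lambda: [0, 0])
    for matchup, won in games:
        if won:
            tally[matchup][0] += 1
        tally[matchup][1] += 1
    return {matchup: (wins, played) for matchup, (wins, played) in tally.items()}


def games_still_needed(games, target=TARGET_GAMES):
    """Return {matchup: games remaining} for matchups played fewer than target times."""
    return {
        matchup: target - played
        for matchup, (wins, played) in summarise(games).items()
        if played < target
    }


# Placeholder log, purely for illustration.
game_log = [
    ("Matchup A", True), ("Matchup A", False), ("Matchup A", True),
    ("Matchup A", True), ("Matchup A", False),
    ("Matchup B", False), ("Matchup B", True),
]

print(summarise(game_log))
print(games_still_needed(game_log))
```

A tally like this only tells you where the gaps are; what the games actually taught you still has to come from your own notes.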

6) Peer review.

Talk your games over with your opponents. What had them worried or frustrated? What weren’t they worried about? What gameplan did they adopt on seeing your list, and how does that impact your own gameplan?

Know Yourself, and Go In Swinging

I_Avian


3 thoughts on “Tales from the Testing Cycle: On Testing Methodology.”

  1. I agree with the theory above, but I’d be interested in what you do in practice.
    I approach with a similar theory, but I find that generally I have to try to extract a lot more data from one or two games than is really ‘there’ in order to get sufficient testing in.

    What I mean is, rather than playing a matchup 5 times, I really get to play a similar matchup a couple of times, and try to extrapolate from the games played into meaningful theory.
    If you combine the ‘not enough games’ problem with the numerous confounding things that can happen during any game (who got to go first, what the terrain was like, which scenario you played, how good the opponent was)... I find that really most of my ‘testing’ is a melange of theorymachine, extrapolation of results and extrapolation of “how the game might have played out differently”.

    So in that world (which I suspect is the ‘real’ world for the majority of players), if I want to try to get three lists ready for masters in a month and a half (which I do), using a faction that I’ve only been playing for a month or two (which I am)... then what should I do?

    I only know one person who plays anywhere near enough games to get the 5 games at every stage of the testing cycle against the specific things he wants to test against.

    Admittedly, he does perform very well at tournaments lol.

    • Admittedly, the above methodology is difficult to execute in practice. It’s why I went into “Masters Mode” long before the actual event! I generally manage 3-4 games a week on the table, and try to get one or two on Vassal on top of that (though I’ve been too busy for Vassal recently)

      Most of the time you end up doing just what you describe and applying some level of theory and extrapolation. I’ll probably write about that next week.

  2. I end up doing something similar to Colin (with less success, based on pretty compelling evidence). I only manage 2-3 games a week on average, though I go to a lot fewer large events than some so I have more time to test. It does end up being a variation on this method, though rather than aiming for specific matchups, I aim for broad categories (and a lot of theorymachine).

    I suppose if I knew a good alternative method of testing and practice to “play all the games,” I’d do better in tournaments. 🙂 Looking forward to more of your thoughts on the topic!
