The Zarfian Cruelty Scale, Revisited

The Zarfian Cruelty Scale is 23 years old as of last month. That’s not a numerically interesting anniversary, but it’s a respectable age for an offhand scrap of critical theory that still gets mentioned regularly.

I am pleased and amused by the Cruelty Scale’s continuing currency. But I also worry that people might apply it more broadly or rigidly than it deserves. The Cruelty Scale has had caveats almost from the beginning; it embodies many 1996-ish assumptions about IF and game design. I think it’s time to, at minimum, dig into those assumptions. We should make sure we know what we’re thinking when we use it.


What is this thing?

In 1996, when releasing So Far, I included a comment in the ABOUT text:

This is a shortish-to-medium-sized game. More than a short story, less than a full-length novel. Its rating is “cruel.” It is possible to make mistakes which will prevent you from winning. Sometimes common sense will serve to avoid such mistakes; sometimes insight is necessary; sometimes neither will help. Save often, and keep your old saved games.

(Note the unsubtle nudge that IF should be considered on par with prose fiction! I thought this was a wagon worth hauling, back then.)

I must have had a set of ratings laid out somewhere, but I don’t think I tried to publicize them until someone mentioned the comment on rec.games.int-fiction. I then laid out the scheme:

My clarification, a few days later:

This all looks simple, but the edges are curiously hard to pin down. Even the ordering can be questioned. Is Tough (“it’s obvious that you’re stuck”) fundamentally different from Nasty (“it’s obvious when you’ve become stuck”)? I hope so. I intended to distinguish two mindsets: saving to checkpoint concrete progress (Polite), and saving because you’re worried you’re about to lose progress (Tough). And then Nasty is the additional strain where you always have to worry about losing progress, because the game won’t clue you in that trouble is ahead.

My aim was to refine the notion of “losable” versus “unlosable” adventure games. Those terms were proposed in 1994 by Mike Threepoint; the concept had been floating around for years. Myst was a well-recognized example of a game designed to be “unlosable”. (At least until its endgame! But let’s not get distracted.)

The ambiguity of “losable” is between a game you can lose and a game in which you can fail to win. Losing is only one kind of not-winning: the explicit, you-have-died, game-over kind. One can also not-win by failing to solve puzzles, or by being stuck – locked out of victory by past actions. The Cruelty Scale doggedly refuses to concern itself with puzzle difficulty. (A weighty subject which we will not tackle today.) Being locked out of victory, in contrast, is a central concern. It is the essence of the Cruel rating. Tough and Nasty are lock-outs mitigated by, respectively, player foresight and player recovery.

What does the thing mean?

The labels have a slant. Merciful, Polite, and Tough games sound “fair”, in some sense. Nasty and Cruel, in contrast, attach moral censure. I didn’t mean this judgementally, of course. (Nearly all of my games were Cruel until The Dreamhold in 2004!) The moral bent is intended to correspond to that sense of player frustration. “I should have saved, but I didn’t, and now the game is punishing me by forcing me to replay this part.” This is not the only way games can be “unfair”, but it does capture one sense of unfairness that players feel.

From its beginning, the Cruelty Scale was about the player’s responsibility to manage game state. The second list above makes this explicit. As with any contract, one party’s responsibility is the other party’s freedom, and vice versa. If the player is responsible for saving before trying anything clearly dangerous, then the author is free to create clearly dangerous regions that wreck the player’s goals. If the author is responsible for making those regions clearly dangerous, then the player is free to explore tamer regions without bothering to save all the time.

I said nothing about interpreter capabilities. However, we must read the scale with an eye to how parser IF interpreters worked. I phrased everything in terms of save files. This already imports a host of unspoken assumptions! Saving the game requires a conscious decision; the player can keep many save states; the player has to remember where each state is and what game actions it represents. The implicit argument of the Cruelty Scale is that this is work – tedious work, distracting from the fun of gameplay.

The tedium has improved over the years, mind you. In the “modern” IF era (1996 counts), each save state is a relatively small named file on your hard drive. You can keep an effectively unlimited number; you might keep track with an ad-hoc naming scheme. However, my “golden age” of IF was the 8-bit era. On the Apple 2, one needed a stock of formatted floppies to save games on. (At a cost of a dollar or so each!) A floppy held three to eight saves, depending on the game’s complexity. And woe was your lot if you had only one disk drive and thus had to swap disks every time you saved.

The flip side is the cost of replaying if you failed to keep a necessary save file. That, too, has improved over the years. Modern interpreters can keep the entire game in RAM, thus avoiding the disk lag of the 8-bit era. Modern conveniences such as command history and copy-paste permit speedier typing. But then we’ve all gotten older and more impatient. The motivation to avoid replaying remains tangible.

Would you like to restart, restore, or undo?

The original posts did not mention the UNDO command at all. This is a surprising omission from today’s perspective. In 1996, I was using (and indeed supporting) Z-machine interpreters with undo, but it must not have seemed meaningfully different from a save file.

It certainly feels different when we think about our assumptions, though, doesn’t it? You only had (have) a few undo slots available; they are filled automatically; you can undo with a single command, without having to make any selection or decision beyond “let’s back up.” The uncomfortable user experience which the Cruelty Scale implicitly bemoans is nearly erased.

Given this, it is very tempting to recast the scale in terms of undo: categorizing games where you can use undo instead of save files.

Even I am prone to remember the Cruelty Scale this way if I haven’t revisited the original wording! However, this phrasing changes the intent in subtle ways. For one thing, the distinction between Tough and Nasty becomes blurred. (Undo requires no intentional save, so you don’t need to judge situations in advance for possible danger.)

Further, the question of move count enters the picture. The original scale did not mention move count, except in the definition of Polite, which presumed a single move between “you can still win” and “you are dead”. But now we must be concerned with how many moves you need to back up to reverse your errors.

This is complicated by the unevenness – and opacity – of undo support across different games and interpreters. Until 2004, games written in Inform 6 only supported single-turn undo. This limitation was enforced by the Inform 6 library (through release 6/10). The distinction between Polite and Tough/Nasty was therefore very tangible to players!

With the 6/11 release of Inform 6, and then Inform 7, games allowed as much undo as the interpreter offered. Nearly all interpreters offered several turns of undo, and authors and players generally assumed that at least one turn was guaranteed. But in fact this was not true. When Counterfeit Monkey was initially released, some players found that undo was entirely unavailable. This was due to an interpreter which put a hard limit on the amount of memory used by undo records. CM was so complex that even a single undo record exceeded the limit, so the interpreter kept zero records. (Interpreter patches were hastily released.)
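The failure mode is easy to reproduce in miniature. What follows is a rough sketch only, not the code of any real interpreter: the class name, the byte budget, and the snapshot sizes are all hypothetical. It assumes an undo store that evicts its oldest records until the total fits under a fixed memory cap.

```python
from collections import deque


class ByteBudgetUndo:
    """Hypothetical undo store that caps the total bytes held in undo records."""

    def __init__(self, budget_bytes: int):
        self.budget_bytes = budget_bytes
        self.records = deque()  # oldest record on the left
        self.used = 0

    def push(self, snapshot: bytes) -> None:
        """Store a snapshot, then evict the oldest records until under budget."""
        self.records.append(snapshot)
        self.used += len(snapshot)
        # If a single snapshot exceeds the budget, this loop evicts it too.
        while self.used > self.budget_bytes and self.records:
            dropped = self.records.popleft()
            self.used -= len(dropped)

    def undo(self):
        """Return the most recent snapshot, or None if nothing is stored."""
        if not self.records:
            return None
        snapshot = self.records.pop()
        self.used -= len(snapshot)
        return snapshot


# A small game: many 100-byte snapshots fit under a 1000-byte cap.
small_game = ByteBudgetUndo(budget_bytes=1000)
for _ in range(15):
    small_game.push(bytes(100))

# A large game: one 1500-byte snapshot already exceeds the same cap.
large_game = ByteBudgetUndo(budget_bytes=1000)
large_game.push(bytes(1500))
```

Under these assumptions the small game keeps its ten most recent turns, while the large game keeps none at all, so its very first UNDO has nothing to restore.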

Other interpreters put a fixed limit on the number of undo records kept. This was simpler for users to understand, but it nevertheless led to its own brand of confusion. In the collaboratively-authored game Cragne Manor, part of the authoring guidelines asked that unwinnable states end the game “within the UNDO window (eight turns)”. This was due to the misconception that one interpreter’s eight-turn limit was a universal rule. Some authors created puzzles gauged to this guideline, and that led to problems in the editing stage.
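To see why a guideline tuned to one interpreter's window breaks on another, consider this minimal sketch. The function name and the slot counts are illustrative assumptions, not measurements of any actual interpreter. It models a fixed-size undo buffer and asks whether the state just before a fatal mistake is still held when the game finally ends.

```python
from collections import deque


def recoverable(undo_slots: int, turns_to_game_over: int) -> bool:
    """Report whether UNDO can still reach the state before the mistake.

    Turn 0 is the mistaken move. A snapshot is stored before each turn, and
    the game ends after `turns_to_game_over` turns, counting the mistake.
    """
    history = deque(maxlen=undo_slots)  # oldest snapshots fall off the front
    for turn in range(turns_to_game_over):
        history.append(turn)  # snapshot taken just before this turn
    return 0 in history


# A puzzle written to the "eight turns" guideline:
on_eight_slot_interpreter = recoverable(undo_slots=8, turns_to_game_over=8)
on_five_slot_interpreter = recoverable(undo_slots=5, turns_to_game_over=8)
on_single_undo_interpreter = recoverable(undo_slots=1, turns_to_game_over=8)
```

In this model the puzzle is recoverable only when the interpreter holds at least as many snapshots as the puzzle takes turns to end. The same puzzle that is comfortably undoable with eight slots becomes a lock-out with five, or with one.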

The Cruelty Scale has to bear some responsibility for these events. The Cragne guidelines were, in effect, trying to conform to a notional rating of “Neo-Polite: modern interpreters guarantee more than single-turn undo, so let’s loosen the definition.” A sensible idea, if the interpreter landscape were able to reliably support it.

Beyond failure

To proceed with our interrogation, we must break down the idea of “losing”.

Parser IF in 1996 was still firmly centered on achieving victory. There might be many unsuccessful (losing) endings, or none, but the gameplay was struggling to reach the successful (winning) ending. We imagined and happily discussed games with several winning endings, or games which had no puzzles to contend with. But these were variations on a basic structure which remained unquestioned.

As Twine and other choice-based game formats became prominent in the IF world, they kicked the feet out from under this structure. This wasn’t because of any explicit hostility to win/lose conditions. (The classic CYOA book series, with the simplest kind of branching structure, mostly offered clear death-or-victory endpoints.) Rather, the very question of what do I want from this game? had shifted.

In an archetypical branching narrative, every choice is crucial and irrevocable – in the current situation. It may not map to an explicit victory or defeat, but the player will likely have some sense of preferring or dispreferring the outcome, in contrast with other, unseen outcomes. All those other potentials are then gone forever. Should we label this a Tough game structure? It is an understood convention (thus expected by the player) of irrevocable choices (thus the risk of losing, if you consider any outcome to be defeat). Or perhaps your goal is to see every path; this is a common assumption in visual novels. If so, then every choice is an irrevocable failure by definition!

Alternatively, we may accept the endings – the endpoints of the branches – as defeats and victories on their own terms. But then we quickly discover games in which there are no defeats. The convention of Choice of Games stories is that every choice moves forward and every outcome is a satisfying endpoint. The player never loses or becomes stuck. Should we consider this Merciful by definition? But, on the other hand, these games allow each player to decide their own goals. This too is a convention of the form: a broad range of outcomes to suit every taste. In that sense, there are victories and defeats, different for every player. Each outcome stems from the accumulation of many choices with limited transparency, so the game must be considered Cruel.

Thus the Cruelty Scale, designed for the parser world, maps awkwardly onto choice-based games. It may be interesting to consider this difference in terms of low-level choice structure. A parser IF session is mostly comprised of reversible or stateless actions: LOOK, EXAMINE, picking up inanimate items, moving around an unobstructed area. State-changing actions are exceptional and significant. Therefore, for an action to affect state unexpectedly is a jarring moment.

Choice of Games, in contrast, presumes that every action affects the game state. The player expects this as a matter of course. Some Twine games adopt this posture; others offer a distinction between observational choices (analogous to EXAMINE) and active choices (changing state). The discussion of whether or how to convey this distinction is ongoing; it is clearly related to the Cruelty Scale’s concerns, but cannot be directly described in those terms.

The applications of death

The simple message “you have died” has expanded, over the years, into undiscovered countries. What does it mean when death and failure become part of the game itself?

In the earliest days of IF, players complained about “learning by death”. This referred to a puzzle which could only reasonably be solved by trial-and-error, and in which error meant death. Imagine two identical doors, one of which leads to a fatal drop. A slightly subtler form would be a one-shot magic item which can be tried in several locations, with no clue which is correct. An error here is not fatal, but leaves you stuck in the Nasty or Cruel sense.

Players wanted to believe that a game could be solved in a single session, without needing to restart or undo, if one were preternaturally clever. Consider every clue caught, every implication understood, every puzzle analyzed on sight from observation and first principles. This is an impossible ideal, of course, but it captures the fine line between “I should have known” and “I couldn’t have known.” If a game does not observe this distinction, then failure becomes a necessary part of the play experience. Does this change how we interpret the Cruelty Scale? The answer has traditionally been no, but one can make an argument that no necessary step towards victory is truly defeat.

The unlosable game structure rules out this complaint entirely. But it also rules out many interesting puzzles. Enchanter contains a powerful one-shot magic scroll which can be applied to solve any of several puzzles. All but one of these puzzles has an alternate solution which does not require the scroll. One must explore many combinations before managing to solve every puzzle without wasting the scroll. This is a beloved game, but – as discussed in the original Cruelty Scale thread – it is unquestionably Cruel; it requires a great deal of learning-through-failure.

If an author wants to explore these puzzles without losing (as it were) their audience, they might begin to make affordances for failure. It was not just text adventures which offered an UNDO command at death. Many graphical games put a “try again” button on their artisanally-gory death screen. (Thus making death more of a reward than a punishment – which is fine.)

Visual novels conventionally invite the player to experience every variation of play. They therefore may offer affordances to speed up the repetition of replay. Fast-forward controls allow the player to jump through familiar dialogue sequences to the next unexplored choice. A more ornate feature might be an interactive story chart that enables you to jump around freely.

Time and again

The next step is to invent a narrative excuse for the “try again” cycle, thus bringing death diegetically into the story. Both Adventure and Zork were willing to resurrect the player upon death, albeit with a score penalty (which might be considered worse than dying). Death was, in any case, only the simplest failure on offer. To truly manage all the ways to lose a losable game, one must invent a narrative explanation for reversing death as well as all the player’s mistakes. Inevitably, or at least inevitably after 1993, one is drawn to the time-loop story.

Time loop games are a rich vein, from classics such as Majora’s Mask to the recent Outer Wilds and Elsinore. I essayed it myself in Spider and Web, albeit in the guise of revisiting memories rather than time travel. In all of these games, the looping mechanic is more than a convenience. Having invented such an idea, the author must give it its due and construct the entire game around it. The player will explore many variations of the storyline, comparing and contrasting, to understand the whole. By wrapping all failures in a diegetic storyline, these games mostly adhere to the spirit of the Merciful rating.

(It is interesting to note that you can die outside the loop of Outer Wilds. Such deaths are mechanically identical to in-game loop resets. The only difference is that you must push a “try again” button from a black screen instead of watching the evocative flashback effect. This ambivalence cuts against the narrative basis of the game, but it is strong enough to survive.)

The loop reset mechanic may be augmented with some notion of progression – typically the protagonist’s knowledge or skills, carried across loops. These elements naturally mirror the player’s own understanding in that they are additive. You only ever gain skills, not lose them, so the Merciful structure is not imperiled.

Since the time-loop model asks the player to run through the game repeatedly, it inclines towards “bushy” game structures. The player’s reward-vs-slog ratio is highest if each run-through is relatively short and the variations cover a broad range of possibilities. But this can be improved further if the game offers shortcuts. Like the visual novel affordances mentioned earlier, these offset the inevitable repetition – but diegetically within the story. This was my intent in Hadean Lands, a time-loop game which automates every puzzle-solving action once you discover and enact it. The GO TO command allows you to return to any area after a loop reset, narrating the actions needed to get there. I thus allow the player to explore deeper branching structures without undue tedium.

In conclusion

IF sure has gotten complicated, hasn’t it?

No, let me rephrase that.

The moment of the late 1970s which crystallized parser IF – Adventure, Zork, and their cousins – has lasted a startlingly long time. Infocom’s conception of the parser UI, world model, and game style was effectively extinct by 1990. But those of us who felt its impact then spent years recreating and codifying it.

In 1996, in the rising arc of that age, the Infocom model seemed like an eternal verity. It had been present for my entire aware life, had it not? We were finding ways to use the form for new genres and new kinds of storytelling, but we did not have the breadth to question all of its assumptions.

Within those assumptions, it seemed natural to write down our perceived categories as laws of IF nature. That was my vision of the Cruelty Scale: natural categories which transcended the accidents of the form.

The form still exists. Forty years on, it is recognizable and viable. Myself-age–12 could have sat down in front of Hadean Lands and started playing without a blink. This continuity of form is unmatched within the field of videogames. It is astonishing. It is also extremely weird. To treat it as the normal course of game evolution is foolishly short-sighted.

The continued currency of the Cruelty Scale is equally astonishing; but to rely on it unquestioningly is equally foolish. As we have seen, the ground had already shifted under its assumptions from the 80s to the 90s. That ground has continued to shift, through the revolutions of Twine and choice-based games, to quality-based and storylet forms, and onward.

The concerns of the Scale – fairness of clueing, minimizing the frustrations of play – are of course still with us. The forms in which they are expressed are an unceasing ferment, a negotiation of expectation between author and audience. Remember to watch the storm rather than the raindrop.


Updated September 4, 2019.
