Goto
For this post, I’m going to be referring back to EWD215 (“A Case against the GO TO Statement”, as Dijkstra called it, or “Goto Considered Harmful” as it is perhaps more commonly known).
Or… sort of. Most of what Dijkstra has to say in that post has been
dissected to death, and I don’t have a whole lot original to add. EWD215 came
about in the early days of programming languages: when we were always thinking
consciously about how our code would be compiled because our languages were
essentially (somewhat-)cross-platform assembly macros. I think his core point
is part of a larger idea that I endorse: just because you can make your
program structure reflect its assembly representation doesn’t mean you
should. The looping constructs - for, while, loop, etc. - have direct representations in assembly language, but this isn’t a good way to think about how to use them. (That there are multiple such possible representations is a further complication that we definitely don’t want to consider unless we are writing a compiler.)
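To make that concrete, here’s a quick C sketch (a toy summing function of my own): the loop as we actually think about it, next to a goto-and-label version that mirrors one possible assembly lowering.

/* The loop the way we think about it. */
int sum_below(int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* The same loop, restructured to mirror a typical lowering:
 * a test, a conditional branch out, and an unconditional jump back. */
int sum_below_lowered(int n) {
    int total = 0;
    int i = 0;
check:
    if (i >= n)
        goto done;
    total += i;
    i++;
    goto check;
done:
    return total;
}

Both come out to more or less the same thing after compilation; the first one is the version you want to read.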
Rubin’s Reply
Far more interesting to me is Frank Rubin’s reply, which he gave the obnoxious title ‘“GOTO Considered Harmful” Considered Harmful’. I’m going to try to avoid mentioning tone here, which is difficult when Rubin insists on jabs like “the argument was academic and unconvincing”. (Entertainingly enough, that line turns out to have been entirely incorrect in hindsight.)
Rubin advances two ideas to support his position: first, that no study has indicated that goto is in fact “harmful”, and second, that there are problems for which a solution that uses goto is shorter and simpler.
This first argument is not only superfluous but actively damaging to Rubin’s position; it is in fact an instance of argumentum ex silentio. (This is an invalid move because Rubin is asserting that no studies have shown it to be harmful, which is a subtly but importantly different assertion than that studies have shown it to be non-harmful. To my knowledge there have been no studies conclusively showing either way, but that’s not important to this discussion.) All replies to Rubin’s letter that I have seen ignore this particular argument entirely, which is probably for the best, because EWD215 never uses the word “harmful” at all, except in the ACM title, which Dijkstra himself did not pick.
The second argument is more interesting to me, especially because I think it’s
a position I agree with in the general case. Unfortunately the presentation
of this article does Rubin no favors here, as the code samples presented are
chewed up by the three-column layout the ACM uses. Essentially, he presents a problem (given X, an N by N integer matrix, print the index of the first all-zero row), followed by what he considers optimal goto and goto-less solutions. This is of course all done in Pascal because everything is terrible.
More on that in a moment.
Everyone wants to weigh in now
Way too many people decided they wanted in on this discussion at this point. A number of them were published in the ACM’s letters section. (If you can’t read that I’m really sorry; the US government killed the man who wanted you to be able to.) I’m going to pull out a couple of these replies and ignore the rest. I will also attempt to ignore the title (I assume the ACM put this here) of this unholy mess: “””“‘GOTO Considered Harmful’ Considered Harmful” Considered Harmful?”””.
Donald Moore’s letter emphasizes that Rubin’s goto-less program is more complex, and argues that “Rubin’s letter attempts to ‘prove’ that a GOTO can simplify the program, but instead proves to me that his implementation language is deficient”, which is a line I really like. He’s not wrong, either; in C, to solve the problem, I’d write something like:
for (int i = 0; i < n; i++) {
    int j;
    for (j = 0; j < n; j++) {
        if (X[i][j] != 0) {
            break;
        }
    }
    if (j == n) {
        printf("Row %d\n", i);
        break;
    }
}
and it’s immediately clear what’s missing. (Some types simplified for clarity; please ignore.)
Interestingly, what I’ve written there is in fact a slightly more modern version of Chuck Musciano’s response, not Moore’s. Musciano uses this code for something different though - he seems to take the view that this isn’t idiomatic Pascal, and that the best way to solve the problem is with a sentinel boolean. He doesn’t really articulate any of this, just asserts it, so it’s hard to tell.
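If I had to guess at what he means, a sentinel-boolean version - written in C rather than his Pascal, and very much my sketch rather than his code - would look something like:

int found = 0;
for (int i = 0; i < n && !found; i++) {
    /* all_zero is the inner sentinel: it stays 1 until we see a nonzero element */
    int all_zero = 1;
    for (int j = 0; j < n && all_zero; j++) {
        if (X[i][j] != 0) {
            all_zero = 0;
        }
    }
    if (all_zero) {
        printf("Row %d\n", i);
        found = 1;  /* the outer sentinel: stops the row loop */
    }
}

Every loop exits through its own condition, which is presumably the property he’s after.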
Moore - and this is something that entertains me more than it should - is instead arguing that Pascal is missing a loop construct that resembles Lisp’s. Basically, he wants to make statements like for i := 1 to n while j <> 3. So as a thought exercise, here’s a Lisp (SBCL; this one I had set up already) solution:
(defvar printed-p nil)
(loop for row in X for i from 0 while (not printed-p) do
      (if (eq 0 (reduce #'logior row))
          (setf printed-p (print i))))
And if I make it actually readable to people who didn’t write it (it’s technically not the same, but it’s really close):
(defvar printed-p nil)
(defun all-zero-p (row)
  (eq 0 (reduce #'logior row)))
(loop for row in X
      for i from 0
      while (not printed-p) do
        (if (all-zero-p row)
            (progn (print i)
                   (setf printed-p T))))
It’s okay, I guess? The whole time I was writing that I had to resist a serious urge to go full Haskell and zip [0..] with a reduce and so on.
There are a couple other objections, but I’m not going to go into them because I’d rather talk about literally anything else now.
Math, math everywhere
So Dijkstra replied again in EWD1009. And I wept tears of joy that the word “harmful” appears nowhere in his response.
The first half comes across to me as weirdly petty - Dijkstra divides his response into -five- six points (normal people do not number normal things starting from zero please come on), the first four of which are nits on Rubin’s code (only two of which are actual problems for execution). And that pretty much sets the tone of the rest of this response.
Back when discussing Rubin’s letter, I mentioned that he complained about Dijkstra’s original argument being “academic”. It’s entirely possible that I felt that criticism unfair because I have the hindsight of having read EWD1009. And boy howdy is this thing ivory towering over everything.
Point five (which Dijkstra helpfully calls “4”) explains that relying on the short-circuiting behavior of and and or is a problem. For some reason this is formulated as an anti-Rubin position. I had to stare at Rubin’s code for surprisingly long to even see what he was complaining about (while (j <= n) and (x[i, j] = 0), as it turns out), because this is actually a very important and common pattern in languages with NULLable values (i.e., if (p != NULL && *p == 3) must not perform the dereference without first checking that the dereference is safe).
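To spell that out with a toy C function of my own (nothing from the actual correspondence): && only evaluates its right operand when the left operand is true, so the dereference simply never happens for a NULL pointer.

/* Toy example: returns 1 only if p is non-NULL and points to 3.
 * Because && short-circuits, *p is never evaluated when p is NULL. */
int points_to_three(const int *p) {
    return p != NULL && *p == 3;
}

Pascal’s and makes no such promise, which is why the same habit in Rubin’s while condition is being held against him.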
The final point is borderline unreadable due to notation and condescension. I have no idea what kind of pseudocode Dijkstra is using here; I can make out what’s going on, but by not writing Pascal as all previous respondents have, Dijkstra is refusing to participate in the conversation. (Additionally, his pseudocode is longer than most of the solutions presented to begin with, and longer than all of them when converted into Pascal.)
There’s a great line at the end I wanted to highlight: “The whole correspondence was carried out at a level that vividly reminded me of the intellectual climate of twenty years ago”. It seems history is cyclic, and EWD1009 reminds me of the negative attitudes prevalent in computing today: elitism and hypocrisy.
Enough of that
Alright, I’m tired of talking about angry Dijkstra now. I want to go back to Rubin for a moment, and look specifically at his first example. Let’s ignore that it uses a break statement in Pascal despite Pascal not having one.
It sets up a particularly disgusting construct that looks like this:
for ... do begin
    if ... then
        goto reject;
    ...
reject: end;
and I’d like to point out how horrible that is. We’ve got an empty label being used to point to the end of the block - or if you prefer, a block end keyword as the only contents of a label - with fall-through into the label.
The other thing I wanted to point out is that all of the programs presented were concerned with achieving maximum performance. We can actually achieve a much cleaner result if we discard that constraint. For example, in 10 lines, without a goto, we can do the following:
written := 0;
for i := 1 to n do begin
  fnz := 0;
  for j := 1 to n do
    fnz := fnz or X[i, j];
  if (fnz = 0) and (written = 0) then begin
    writeln('Row: ', i);
    written := 1
  end;
end;
I’m not going to actually set up a Pascal environment, but I’m reasonably confident that even if it isn’t totally valid it is clear enough to convey my meaning. This is one line longer than Rubin’s third example (which he calls nine lines, despite it being clearly ten; is this the zero indexing Dijkstra loves so much?) and I much prefer it, though I really don’t like this language.
goto fail
I was going to make most of this post a screed about how many programmers are now afraid of the goto construct and avoid it at all costs, but I really don’t have much to say about that. Despite what Dijkstra might want, goto is used to paper over accidental language design issues - the lack of break in Pascal, etc.
In particular, one case where it’s important to embrace the goto is for resource management in C. Consider code like this:
int myfunc() {
    int *a = malloc(sizeof(int) * 4);
    int res = subfunc(...);
    if (res) {
        printf("Oh no! An error!\n");
        return -1;
    }
    /* more stuff */
    free(a);
    return 0;
}
This code actually has two problems. The first is that it doesn’t check the return value of malloc, but I’ll get to that in a second. The other problem is that it leaks the memory pointed to by a if subfunc fails. Let’s fix both of those real quick:
int myfunc() {
    int *a = malloc(sizeof(int) * 4);
    if (!a) {
        printf("Malloc failure!\n");
        return -1;
    }
    int res = subfunc(...);
    if (res) {
        printf("Oh no! An error!\n");
        free(a);
        return -1;
    }
    /* more stuff */
    free(a);
    return 0;
}
This very quickly gets out of hand, because we end up having to free(a) on every exit path. We can simplify this by making sure there is only one exit path from our function:
int myfunc() {
    int *a = NULL;
    int res = -1;
    a = malloc(sizeof(int) * 4);
    if (!a) {
        printf("Malloc failure!\n");
        goto done;
    }
    res = subfunc(...);
    if (res) {
        printf("Oh no! An error!\n");
        goto done;
    }
    /* more stuff */
    res = 0;
done:
    free(a);
    return res;
}
which results in something much more manageable.
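For what it’s worth, the pattern also scales nicely once there’s more than one resource to release. Here’s a sketch (the fopen call and the file name are purely illustrative, and I’m leaning on free(NULL) being a no-op):

int myfunc2() {
    int res = -1;
    int *a = NULL;
    FILE *f = NULL;
    a = malloc(sizeof(int) * 4);
    if (!a) {
        printf("Malloc failure!\n");
        goto done;
    }
    f = fopen("example.dat", "rb");   /* illustrative file name */
    if (!f) {
        printf("Couldn't open the file!\n");
        goto done;
    }
    /* more stuff */
    res = 0;
done:
    if (f) {
        fclose(f);
    }
    free(a);   /* free(NULL) is a no-op */
    return res;
}

Every new resource adds one check on the way in and one line of cleanup at the end, instead of multiplying exit paths.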
There are no absolutes.