Saturday, January 19, 2008

Constants on the left are better, but this is often trumped by a preference for English word order.

Typically we all write comparison statements like this: "if (currentValue == 5)".

But the following is just as valid: "if (5 == currentValue)".

The bottom style is a bit safer in languages like C because if you forget the second equals sign and write "if (currentValue = 5)", the compiler will assign 5 to the "currentValue" variable and the result will be the value of the assignment, which is 5. Anything that isn't zero is "truthy" and will cause the "if" branch to be taken. If you didn't intend this and you're lucky enough to have compiler warnings turned all the way up, you'll get a helpful message like "warning C4706: assignment within conditional expression."
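To make the failure mode concrete, here is a minimal C sketch. The function names is_five and is_five_with_typo are made up for illustration; only "currentValue" and the constant 5 come from the discussion above. The typo is wrapped in an extra pair of parentheses purely so that GCC and Clang compile it without complaint; the behavior should be the same as the bare single-equals version.

```c
#include <stdbool.h>

/* Intended comparison: true only when currentValue is 5. */
bool is_five(int currentValue)
{
    return currentValue == 5;
}

/* The typo: a single '=' assigns 5 to the local copy of currentValue.
   The value of the assignment expression is 5, which is nonzero,
   so the branch is taken no matter what was passed in. */
bool is_five_with_typo(int currentValue)
{
    if ((currentValue = 5)) {
        return true;
    }
    return false;
}
```

Calling is_five_with_typo(7) returns true even though 7 is not 5, which is exactly the kind of bug that is easy to miss in a code review.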

Note that the bottom expression can never have this problem. The number 5 is a constant and cannot be assigned to, so writing "if (5 = currentValue)" by mistake gives you a compile-time error.

Therefore, getting in the habit of putting constants on the left-hand side prevents the whole unintended-assignment class of error.
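A small C sketch of the constant-on-the-left habit follows. The function name is_five_constant_left is hypothetical, and the rejected line is left inside a comment because a conforming C compiler should refuse to build it.

```c
#include <stdbool.h>

/* Constant on the left: equality is symmetric, so this is the same
   test as currentValue == 5. */
bool is_five_constant_left(int currentValue)
{
    /* Dropping one '=' here would produce the line below, which does
       not compile because the literal 5 is not an assignable lvalue:

           if (5 = currentValue) { ... }
    */
    return 5 == currentValue;
}
```

The point of the design is that the typo moves from a runtime surprise to a build failure, without relying on any particular warning level being enabled.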

Much less well known is that this type of thinking is also helpful when an "=" sign isn't present at all. Consider this function:


If you're a developer, you'll likely run into a lot of code just like this. Maybe the code you see won't check for bad input like it should, so you'll occasionally get a "NullReferenceException," which makes life no fun.

An astute observer would realize the code could be written:

where the check for null is eliminated altogether and the static version of Equals is called. Since Equals does a null check internally, it'd be superfluous to do it twice.

This is usually where most people stop. Note that we could take advantage of constants on the left and do this:



the result is that you save around 10 characters and still never throw a "NullReferenceException." This takes advantage of the fact that string literals are simply string objects themselves.

Again, putting the constant on the left eliminates the possibility of forgetting about nulls. But let's be honest: I don't do this in production code, and you probably don't either.

Why not?

Well, it just feels wrong. Go ahead and look at any textbook you had in college, or even the latest programming books, and look at their code samples. While some of the more pragmatic ones for embedded systems might recommend putting integer constants on the left-hand side, you'll almost never see the string example. I've only seen it in one book myself.

But again, why?

I think the reason comes down to the fact that English sentences are almost always subject-verb-object, where you say the subject noun before the object noun. When we write code, we probably unconsciously think something along the lines of "if this (subject) thingy (is equal to) this (object) thingy then do such and such." Just as saying "The program is what I wrote" is far and away weirder than saying "I wrote the program," putting constants on the left feels weird, and I don't do it because I want my code to be as easy for others to read as possible.

Having subject-verb-object sentences is not the only way to express yourself. In the "Koine" Greek language that was spoken by most of the known world 2000 years ago, the subject and object could come in any order. The order you chose just let the reader know what you wanted to emphasize. The verb and nouns have endings on them (inflections/case) to let you know what each word means. Likewise, in Japanese you almost always do subject-object-verb. Word order is just one feature of a language, not some universal standard.

It makes me wonder how things like the Sapir-Whorf hypothesis, which states that the language you think in can affect how you understand things and what type of thoughts you can have, might apply to the types of problems programmers face. However, I don't mean it in the usual sense of what your programming language lets you do. It's more a question of "what bad things can be caused by how you think?"

In the end, I love the simplicity and terseness of putting constants on the left. However, given that English is the native language of almost everyone who looks at my code, and that the overwhelming precedent of style has been to put constants on the right, I have no real choice but to put them on the right.

But I can't help but wonder what would have happened had the Fortran or C specs been conceived in Greek or Japanese.

12 comments:

Anonymous said...

interesting idea!

I wonder if this already falls under the category "too clever".

Jeff Moser said...

I thought the "constant".Equals(...) trick was clever when I read it. I wonder how weird the approach feels to people who read right to left (e.g. Hebrew natives)

Brian Lakstins said...

It really seems that something that is this simple and this right should just be done. I speak English, and I don't have a hard time following the code this way.

I also don't know how many times I've debugged the "if (this = 5)" type of code when I meant "if (this == 5)".

Although, instead of "5", it would probably be some kind of enum, which I don't think can be assigned to either so it would work the same.

Jeff Moser said...

Brian,

I wish it was more common practice. I think that if it had been like this from the early days, it would have saved our industry countless hours of grief like you mentioned.

Unfortunately, it's almost too entrenched in our way of thinking. At least C# makes it harder to do the first (value = constant) in a place where a boolean is expected. Unfortunately, it offers no help in the string comparison example. Hopefully Spec#'s ideas will creep into C# and help eliminate those.

However, you could be a revolutionary and use it in your own code :)

Anonymous said...

its seems kinda python to do it that way.
many of the examples ive studies while learning python the past year have been just like this. for example
" ".join(someString)

Jeff Moser said...

Interesting. I should look at more Python code.

Brian Di Croce said...

Beautiful post. Thanks for sharing the history lesson on the Koine Greek language; I had never heard about it before.

In NUnit, whenever you want to assert an expectation, the first parameter is always the "expected" value, then the "actual" value, i.e., Assert.AreEqual(expectedValue, actualValue). I think the same logic should apply in boolean conditions (if expectedValue == actualValue)...

But even though I say that, I myself prefer to write if (actualValue == expectedValue) for the only reason that I'm used to it. But that's not a good reason! Even though changing a programming style is hard, it is not impossible. And like you said, it sure minimizes the possibilities of runtime errors or bugs in the long run.

Jeff Moser said...

Brian: Good point about NUnit's AreEqual(x,y) order. I never made the connection.

I use Visual Studio 2008's built-in unit testing, and they copied NUnit's precedent (from JUnit, probably). It seems that this particular order has now become the de facto standard for assertions.

There is some resemblance to the order of the "source" and "destination" variables in the standard C string libraries (where destination comes first).

In that case, it doesn't make as much sense to me why the subject (source) comes after the object (destination).

Anonymous said...

Of course, string.Equals(s1,s2,a3) does not throw a NullPointerException whereas string.Equals(s1, a3) does...

knucKles said...

This is a very interesting fact but I think it is not very helpful for improved developers. I wrote constants to the right my whole time as a programmer. If I start to watch my fingers not to do so I can also watch my fingers not to forget the second '='. But maybe this idea would be nice in new books for greenhorns. Schoolbooks like "Starting programming in C++" or so. They could start programming this good style.

Whatever... this is a point of view I never had and so it is a very interesting post at all. Nevertheless the fact with the string is useful! :)

Mat Josher said...

Do you know of a lint tool that will enforce this?

Jeff Moser said...

knucKles: Regarding strings on the left. I sometimes do that now to save a few unnecessary characters :)

Mat Josher: None come to mind, but with the upcoming Roslyn compilers, it shouldn't be hard to write one for C#. In languages like C++, at least high warning levels would help highlight some issues (as mentioned in this post).