Saturday, 17 January 2009

The selfish idiom

This is yet another post inspired by neo-Darwinism.

Code duplication happens all the time; no programmer doubts that. Because duplicating a piece of code is easy as hell and initially very cheap, you quickly get yourself into some kind of living mess. Big code bases do have a life of their own, and the forces driving them are most interesting.

What makes a code fragment successful? Like our own genes, it has to be reproduced at a high rate without copying errors. When that happens, the replicating idiom slowly invades the code base. A growing number of similar duplicates increases the probability that one of them will be noticed by a programmer and copy-pasted once more, so there is a positive feedback loop. Even though every programmer will tell you it's bad practice to help a piece of code multiply, when a specific instance of an idiom is all over the place it becomes standard practice. Give it a few more months, and it's a local 'best practice'.

The fragments of code that get copy-pasted are over-represented in our code bases... but that says almost nothing about them. After all, they don't really replicate by themselves; they need our help. All these little code samples must have some kind of mind-controlling power to force us to do their bidding. Successful code knows how to exploit our pride and laziness. I think that one of the super-powers self-duplicating code has is incomprehensibility.

survival of the misunderstood

That was the original title of the post. Control-C control-V is the standard prayer of our cargo cults. It's also the fertility dance that our code gene overlords make us do in those late-night hacking sessions. We get ourselves into a caffeinated trance and dance the dance of life!

Actually, the whole process is celebrated in a long mass. First we have the high priest creating the original code, with the original sin hidden inside of it. Then it's presented as an offering to a less experienced follower, who will a little while later pass the good news on to a colleague. It's very important that the followers don't quite understand what is going on inside the code, or else they might be tempted to factor it into a function and put an end to an evolutionary branch. The use of global state, counter-intuitive interfaces and weird 3rd-party libraries are all on the code's side.
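To make the "factor it into a function" step concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `SETTINGS` global, the two copy-pasted readers and the `int_setting` helper do not come from any real code base. The first two functions are the idiom as it spreads, global state included; the last one is what a follower who understood the idiom might write instead.

```python
# Hypothetical global state that the copy-pasted idiom depends on.
SETTINGS = {"timeout": " 30 ", "retries": "3"}


def timeout_copy():
    # First instance of the idiom.
    raw = SETTINGS.get("timeout", "")
    raw = raw.strip()
    return int(raw) if raw else 0


def retries_copy():
    # Second instance: same lines, only the key changed.
    raw = SETTINGS.get("retries", "")
    raw = raw.strip()
    return int(raw) if raw else 0


def int_setting(settings, key, default=0):
    """Read an integer setting, tolerating whitespace and missing keys.

    The settings mapping is passed in explicitly, so the helper no longer
    relies on the global and there is nothing left to copy-paste.
    """
    raw = str(settings.get(key, "")).strip()
    return int(raw) if raw else default
```

Once the helper exists, each duplicate collapses to a single call such as `int_setting(SETTINGS, "timeout")`, and the branch stops replicating.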

how to kill it

What do we do now? It's very tempting to simply say "write simpler code", but that would be very wrong. Of course you should not write code that's needlessly complicated, but stating the obvious never helps. My point is that there is very often a true need for the complexity we put in code. The complexity in good code simply represents the true complexity of the underlying problem and helps prevent code duplication. Otherwise you get simplexity.

There are two players in this game: the code and the coders. If we can't simply make the code simple, maybe we can make the coders better. After all, the high priest who created the original sin probably understands a good deal of it. He should work more at explaining his creations to his followers, and spend a little less time changing the colors of buttons, documenting, or whatever the boss of his boss thinks is valuable, billable hours.
