Why I Love Reading Other People’s Code And You Should Too

19/05/2010 2212 words 11 min read

Hate It occurs to me, that many programmers hate reading code – c’mon admit it. Just about everyone loves writing code – writing code is fun. Reading code, on the other hand, is hard work. Not only is it hard work, it is boring, cause let’s face it, any code not written by you just sucks (oh we don’t say it, but we’re all thinking it). Even your own code begins to look increasingly sucky mere hours after you’ve finished writing it and the longer you leave it the suckier it seems. So, why should you waste hours looking at other people’s crappy code when you can be spending this time writing awesome code of your own? Weren’t we just over this; give it a couple of hours and then come back and see if your code still looks awesome. You’re never going to become a master of your craft if you don’t absorb the knowledge of the masters that came before you. One way to do that is to find a master in person and get them to teach you everything they know. Is this possible – certainly, is it probable – not so much, you’d have to be extremely lucky. You don’t really need luck though, we’re fortunate to be in a profession where the knowledge and skill of all the masters is right there for us to absorb, embedded in the code they have written. All you have to do is read it, sure it might take you a bit longer without someone sitting there explaining it to you, but it’s eminently possible. To put it in perspective, try becoming a great carpenter just by looking at a bunch of well constructed furniture.

I love reading code, I have always intuitively felt that you get a lot from it, yes it can be annoying and boring, but the payoff is well worth the effort. Consider this, if you wanted to become a great writer, would you focus exclusively on writing? You could try it, but you’re not going to get far. It is an accepted fact that most great writers are also voracious readers. Before you can hope to write anything decent you need to read other great writers, absorb different styles, see what others have tried before you and feed your creative self. Your knowledge will slowly grow and eventually, your own writing will begin to exhibit some maturity, you will develop a ‘feel’ for it. Coding is no different, why would you expect to write anything decent if you’ve never read any great code? The answer is you shouldn’t. Reading great code is just as important for a programmer as reading great books is for a writer (I can’t take credit for this thought, it belongs to Peter Norvig and he is awesome, so take heed).

Even if all of this is unconvincing, there is one fact which is undeniable. Being good at reading code is important to your survival as a professional developer. Any non-trivial project these days will be a team effort and so there will always be large chunks of code you had no hand in which you have to work with, modify and extend. And so, code reading will likely be the most used and most useful skill you can have; better bite the bullet and get good at it – fast.

How To Read Code Like … Some Kind Of Code-Reading Guy

I can’t tell you how many times I’ve seen programmers scroll up and down through an unfamiliar piece of code, for minutes on end, with a sour expression on their face. They would later declare the code to be unreadable crap and why waste the time anyway; we can just work around the problem somehow. I am not sure what the expectation was here, absorb the meaning of the code by osmosis, or maybe intensely stare your way to enlightenment? You don’t read code by just looking at it for ages, you want to understand it and make it your own. Here are some techniques I like to use, it is not an exhaustive list, but I have found these to be particularly helpful.

Stare

Try to build and run it. Often this is a simple one step process, like when you’re looking at work code (as opposed to random code). However this is not always the case and you can learn a lot about the high level structure of the code from getting it to build and execute. And, on the subject of work code, you are intimately familiar with how to build your current project are you not? Builds are often complex, but there is a lot of understanding to be gleaned from knowing how the build does its thing to produce the executable bits.
Don’t focus on the details just yet. The first thing you want to do is get a bit of a feel for the structure and style of the code you’re reading. Start having a browse and try to figure out what the various bits of the code are trying to do. This will familiarise you with the high level structure of the whole codebase as well as give you some idea of the kind of code you’re dealing with (well factored, spaghetti etc.). This is the time where you want to find the entry point (whatever it happens to be, main function, servlet, controller etc.) and see how the code branches out from there. Don’t spend too long on this; it’s a step you can come back to at any time as you gain more familiarity with the code.
Make sure you understand all the constructs. Unless you happen to be the premier expert on your programming language there are probably some things you didn’t know it could do. As you’re having a high level fly-through of the code, note down any constructs you may be unfamiliar with. If there are a lot of these, your next step is obvious. You’re not going to get very far if you have no idea what the code is doing syntactically. Even if there are only a few constructs you’re unfamiliar with, it is probably a good idea to go look them up. You’re now discovering things about your language you didn’t know before, I am happy to trade a few hours of code reading just for that.
Now that you’ve got a good idea about most of the constructs, it is time to do a couple of random deep-dives. Just like in step 2, start flying though the code, but this time, pick some random functions or classes and start looking through them line by line. This is where the hard work really begins, but also where you will start getting the major pay-offs. The idea here is to really get into the mindset (the groove) of the codebase you’re looking at. Once again don’t spend too much time on this, but do try and deeply absorb a few meaty chunks before moving on. This is another step to which you can come back again and again with a bit more context and get more out of it every time.
There were undoubtedly things in the previous step you were confused about, so this is the perfect time to go and read some tests. You will potentially have a lot less trouble following these and gain an understating of the code under test at the same time. I am constantly surprised when developers ignore a well-written and thorough test suite while trying to read and understand some code. Of course, sometimes there are no tests.
No tests you say, sounds like the perfect time to write some. There are many benefits here, you’re aiding your own understanding, you’re improving the codebase, you’re writing code while reading it, which is the best of both worlds and gives you something to do with your hands. Even if there are tests already, you can always write some more for your own benefit. Testing code often requires thinking about it a little differently and concepts that were eluding you before can become clear.
Extract curious bits of code into standalone programs. I find this to be a fun exercise when reading code, even if just for a change of pace. Even if you don’t understand the low level details of the code, you may have some idea of what the code is trying to do at a high level. Why not extract that particular bit of functionality into a separate program. It will make it easier to debug when you can execute a small chunk by itself, which – in turn – may allow you to take that extra step towards the understanding you’ve been looking for.
The code is dirty and smelly? Why not refactor it. I am not suggesting you rewrite the whole codebase, but refactoring even small portions of the code can really take your understanding to the next level. Start pulling out the functionality you do understand into self-contained functions. Before you know it, the original monster function is looking manageable and you can fit it in your head. Refactoring allows you to make the code your own without having to completely rewrite it. It helps to have good tests for this, but even if you don’t have that, just test as you go and only pull out functionality you’re sure of. Even if the tests seem totally inadequate – learn to trust your own skill as a developer, sometimes you just need to go for it (you can always revert if you have to).
If nothing seems to help, get yourself a code reading buddy. You’re probably not the only person out there who can benefit from reading this code, so go grab someone else and try reading it together. Don’t get an expert though, they’ll just explain it all to you at a high level and you will miss all the nuances that you can pick up by going though the code yourself. However, if nothing works, and you just don’t get it, sometimes the best thing you can do is ask. Ask your co-workers or if you’re reading open source code, try and find someone on the interwebs. But remember, this is the last step, not the first one.

If I was pressed for time and needed to understand some code reasonably quickly and could only pick one of the above steps, I would pick refactoring (step 8). You will not be able to get across quite as much, but the stuff you do get across you will know solid. Either way, the thing you need to keep in mind is this. If you’re new to a significant codebase, you will never get across it instantly, or even quickly. It will take days, weeks and months of patient effort – just accept it. Even having an expert right there with you doesn’t significantly cut the time down (this is what the last post in my teaching and learning series will be about). If however, you’re patient and methodical about your code reading (and writing) you can eventually become intimately familiar with all aspects of the project and become the go-to man when it comes to the codebase. Alternatively you can avoid reading code and always be the guy looking for someone to explain something to you. I know which one I would rather be.

Seek Out Code Reading Opportunities – Don’t Avoid Them

Code

We love to write new code, it’s seductive cause this time we’re going to get it just right. Ok, maybe not this time, but next time for sure. The truth is, you’re always evolving in your craft and you’re never going to get it just right. There is always value in writing new code, you get to practice and hone your skills, but there is often just as much (if not more) value in reading and playing with code written by others. You not only get some valuable tech knowledge from it, but often domain knowledge as well (after all, the code is the ultimate form of documentation) which is often more valuable still.

Even code that is written in an esoteric fashion, without following any kind of convention, can be valuable. You know the code I am talking about, it almost looks obfuscated, but it wasn’t intended to be (for some reason it is often Perl code :)). Whenever I see code like that, I think of it this way. Just imagine the kinds of stuff you will learn if you can only decipher this stuff. Yeah, it’s a major pain, but admit it, there is that niggling secret desire to be able to write such naturally obfuscated code yourself (no use denying it, you know it’s true). Well, if you invest some time into reading code like that, you’re much more likely to eventually be able to write it – doesn’t mean you will write it, but you want to be ABLE to. Lastly, attitude is always paramount. If you view code reading as a chore then it will be a chore and you will avoid it, but if you choose to see it as an opportunity – good things will happen.

Images by mafleen, Scott Ableman and LisaThumann

Closures – A Simple Explanation (Using Ruby)

17/05/2010 1948 words 10 min read

Closure There are a great many decent developers out there who don’t know what a closure is. I don’t really have any concrete stats on this matter, it is simply an intuitive assessment based on experience. But, you know what – that’s fair enough, considering that the most popular languages that are in use right now don’t support closures (Java, C++). When the language you use day in day out doesn’t support a concept, that concept is not going to be high on your agenda; infact you may not even be aware of it. And yes, I agree, good developers will know several (or perhaps many) different languages and there are plenty out there that do support closures – you’re bound to stumble across one at some point. Which just goes to show that when most developers learn new programming languages, they go about it completely the wrong way and so when you hear someone say they know a dozen languages, they are likely overstating the situation by quite a margin. But, I’ll leave that discussion for another time – today it is all about closures.

You probably came across closures when you were at uni, you just don’t remember. Over the last 10 years or so, the curriculum of software related degrees has evolved to the point where concepts like closures are not emphasized, it is predominantly a functional concept and functional languages are out – Java is in. They were trying to make the degrees more industry-relevant and as a result there is a generation of programmers who have been professionally crippled (to some extent) through no fault of their own. You trust the education system to do right by you and there isn’t much you can do when this trust is misplaced. I need go no farther than myself for an example. I did Java in the first year of my degree, C in the second. The first time we were exposed to functional programming, was during an AI subject where we had to do some Lisp along with having to learn a whole slew of other concepts. Needless to say, functional programming wasn’t the emphasis in that subject; many people struggled and struggled badly. We certainly had some exposure to closures during that subject, but who had time to think about that, it was all most people could do to try and wrap their heads around a new style of programming when they had just spent two years getting used to a completely different one. By the time I found myself in industry, closures – along with most other functional concepts – were long forgotten. It took me years of self-study before I rediscovered this stuff – stuff that should have been the foundation of my education as a software developer. I once had a go at my CS degree for not teaching some things that would have made it more relevant. Were I to write a similar post these days, I would be a lot harsher – I still might.

The bitter irony is that while Java is still well and truly in, the functional style is coming back with a vengeance. Fully functional languages (Clojure) and functional hybrids (Scala) are in vogue and so the concepts that underpin them are once again relevant. But, the damage has already been done and will continue to be done – academia does not adjust itself in a hurry. You can’t afford to be ignorant of functional concepts any more, even JavaScript – that Web 2.0 language that we all love to hate – is a functional hybrid. And so developers scramble to get their heads around these concepts and as they scramble they are presented with stuff like this:

“In computer science, a closure is a first-class function with free variables that are bound in the lexical environment.”

What the hell does that even mean! Wikipedia does go on to explain it a little bit better, but without some functional context, it is still tough going. Here is another one you commonly hear:

“A closure is a function that is said to be “closed over” it’s free variables”

Really! A ‘closure’ closes over, why, that makes it all much clearer. Seriously, the next question is always bound to be a plea for clarification. An explanation like that is worse than no explanation at all, although it certainly does make someone sound smart. But, when you’re trying to teach someone something it is not about self-aggrandisement – we can do better. Let’s give it a go.

A Concept Explained Simply

A closure is basically a function/method that has the following two properties:

You can pass it around like an object (to be called later)
It remembers the values of all the variables that were in scope when the function was created. It is then able to access those variables when it is called even though they may no longer be in scope.

Let’s fill in some more details. As you may have guessed, you don’t get closures for free; they must be explicitly supported by the language. In order for the language to be able to support closures, it must support first-class functions. A first class function is a function that can be treated like an object in that you can store it in collections and pass it as a parameter to other functions. As I said, the ability to be passed around is the first property of a closure.

A normal function is defined in a particular scope (i.e. in a class) and can only be called within that scope. This function has access to all the variables in the scope that it is defined, like the parameters that are passed into it as well as class variables. A closure on the other hand may be defined in one scope and be called in a completely different scope (since we can pass it around before calling it). Because of this, when a closure is created, it retains the values of all the variables that were in scope when the closure was defined. Even if the variables are generally no longer in scope when the closure is called, within the closure they still are. In other words, the closure retains knowledge of its lexical environment at the time it was defined.

Hopefully that “lexical environment” sentence above is starting to make a little bit more sense now, but I am sure it would make a lot more sense if we had an example.

An Example In Ruby

No explanation is complete without some examples, that is usually what it takes to make things start to fall into place. We will use Ruby since it supports closures and I like it.

In Ruby, closures are supported through procs and lambdas. These constructs are very similar, but there are some subtle differences (I might do a post about that soon). Let’s create a closure and see how it fulfils the two properties described above.

```ruby class SomeClass def initialize(value1) @value1 = value1 end

def value_printer(value2) lambda {puts “Value1: #{@value1}, Value2: #{value2}“} end end

def caller(some_closure) some_closure.call end

some_class = SomeClass.new(5) printer = some_class.value_printer(“some value”)

caller(printer)```

When we execute we get the following output:

alan@alan-ubuntu-vm:~/tmp$ ruby closures.rb
Value1: 5, Value2: some value

As you can see, the _valueprinter function creates a closure, using the lambda construct, and then returns it. We then assign our closure to a variable and pass that variable to another function, which then calls our closure. This satisfies the first property of a closure – we can pass it around. Notice also that when we called our closure, we printed out “5” and “some value”. Even though both the @value1 and value2 variables were both well and truly out of scope in the rest of the program when we finally called the closure; inside the closure they were still in scope as it retained the state of all the variables that were in scope when it was defined. And so, our lambda satisfies the second property also which makes it a closure. I hope the example has made things a bit clearer.

Of course, we can look into closures a little bit more deeply. For example, how do they retain the values of the variables that were in scope when the closure was defined? This must be supported by the language and there are two ways to do that.

The closure will create a copy of all the variables that it needs when it is defined. The copies of the variables will therefore come along for the ride as the closure gets passed around.
The closure will actually extend the lifetime of all the variables that it needs. It will not copy them, but will retain a reference to them and the variables themselves will not be eligible for garbage collection (if the language has garbage collection) while the closure is around.

If the language supports the first way, then if we create two or more closures which access the same variables, each closure will have its own distinct copy of those variables when it is called. If a language supports the second way, then all closures will reference the same variables, i.e. they will in effect be dealing with exactly the same variable. This is how Ruby does things. Here is an example:

```ruby class SomeClass def initialize(value1) @value1 = value1 end

def value_incrementer lambda {@value1 += 1} end

def value_printer lambda {puts “value1: #{@value1}“} end end

some_class = SomeClass.new(2)

incrementer = some_class.value_incrementer printer = some_class.value_printer

(1..3).each do incrementer.call printer.call end```

This produces the following output:

alan@alan-ubuntu-vm:~/tmp$ ruby closures.rb
value1: 3
value1: 4
value1: 5

We create two closures this time, one to increment a value and the other to print it out. When we then call both of the closures three times, we can see that both are operating on the same variable as the value is incremented on every iteration. If Ruby handled retaining variables for closures by copying them, we would have had the original value of 2 printed out on every iteration since our incrementer and printer closures would each have had a distinct copy of the @value1 variable.

Why Are Closures Useful

Closure

Now that we understand closures a bit better, what’s the big deal about them anyway? Well, it depends on what kind of language you’re using. In a functional language they are a very big deal. Functional languages are inherently stateless, but we can use closures to essentially store some state which will persist as long as our closure lives on (i.e. if the closure changes the value of a variable it will retain the new value the next time the closure is invoked). I hope it is reasonably self-evident how this can be useful. The existence of constructs such as closures along with several others, allow functional languages to be very terse in expressing logic which means you can do more with less code.

With non-functional languages things are a little murkier. There are better ways to represent state when it comes to imperative languages, so the only thing that makes closures compelling in that situation is the fact that you can use them to write terser code while still taking advantage of the imperative style. The existence of closures is partly the reason why people can often do more with less code in languages like Ruby (which support closures) than they can in languages like Java (which do not). If you have other things to add about the awesomeness of closures in any context, then do leave a comment, the more information we share the better off everyone will be.

Images by Tattooed JJ and ~Oryctes~ (Off)

Who Deserves The Credit For Software Craftsmanship and Great Design?

12/05/2010 1664 words 8 min read

Common Sense How did people design great software before OO and how did we ever manage to run a successful project before Agile came along? Those are the questions young programmers undoubtedly ask themselves at some point in their career (I know I did, early on). You hear about and read horror story after horror story from the dark ages of software development (sometimes the dark ages are years ago, other times they include the project before this one) and then you hear about how things like OO or Agile are transforming (or have transformed) the industry. The truth is, projects didn’t fail due to lack of Agile or OO (or any other ‘revolutionary’ practice) and there were plenty of successful projects even in the bad old days (it’s just that you don’t tend to hear about the good ones). Don’t get me wrong, I believe in what Agile stands for and I haven’t even seen non-OO software in my career :), but I don’t subscribe to the buzzwords, I subscribe to what these concepts stand for, the core ideals if you like. Sometimes, you need to take a step back and consider that while the moniker may be shiny and new, the concepts it tries to package are often time honoured and fundamental (or as time honoured and fundamental as you can get considering the relative youth of our industry).

I was thinking about the Unix Philosophy the other day and it struck me that pretty much all of the ideas seem really familiar. So, I went and cracked open a copy of “The Art Of Unix Programming”, just to make sure I wasn’t getting myself mixed up. And what do you know, it was just as I remembered, but this time, I had a bit more experience to put it all in perspective for me. Here are a couple of quotes by Doug McIlroy:

“Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.”

Why, he is simply espousing making your software modular as well as giving a nod to separation of concerns. I dunno about you, but I keep something similar in mind when designing my classes, small, self-contained, I thought it was all OO principles :).

“Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them”

Release early, release often. Refactor and leave the code better than you found it.

“Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them”

Automate everything. Does it smell like agile in here or is it just me.

Let’s keep going we don’t even need to turn the page. Here is what Rob Pike had to say

“Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you’ve proven that’s where the bottleneck is.

Rule 2. Measure. Don’t tune for speed until you’ve measured, and even then don’t unless one part of the code overwhelms the rest.”

Hmmm, avoid premature optimisation.

“Rule 4. Fancy algorithms are buggier than simple ones, and they’re much harder to implement. Use simple algorithms as well as simple data structures.”

YAGNI. Even more agile wisdom. You have to remember that this stuff is from the 70s, years before OO got any traction and decades before the agile movement was even on the horizon. And these quotes are a distillation of thought, after these people had time to mull it over; they were actually doing this stuff years before.

Eric S. Raymond, the author of “The Art Of Unix Programming”, summarised the Unix philosophy in the following 17 tenets.

Rule of Modularity: Write simple parts connected by clean interfaces.

Rule of Clarity: Clarity is better than cleverness.

Rule of Composition: Design programs to be connected to other programs.

Rule of Separation: Separate policy from mechanism; separate interfaces from engines.

Rule of Simplicity: Design for simplicity; add complexity only where you must.

Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.

Rule of Transparency: Design for visibility to make inspection and debugging easier.

Rule of Robustness: Robustness is the child of transparency and simplicity.

Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

Rule of Least Surprise: In interface design, always do the least surprising thing.

Rule of Silence: When a program has nothing surprising to say, it should say nothing.

Rule of Repair: When you must fail, fail noisily and as soon as possible.

Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.

Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.

Rule of Optimization: Prototype before polishing. Get it working before you optimize it.

Rule of Diversity: Distrust all claims for “one true way”.

Rule of Extensibility: Design for the future, because it will be here sooner than you think.

Let’s pick out some choice tidbits. How about number 4, separate interface from implementation, I think we’re all well drilled in that one these days :). I am really fond of number 8, this is the stuff that most agile developers live by, robust, simple and clear code (you can take robust to mean well tested if that is your preference). Number 15 is a good one, anyone do spikes when they code – I know I do? Number 16 is my favourite, I interpret it as “if something doesn’t work, stop doing it and do something else”, reminds me of Scrum (or as Conan the barbarian would say it “SCrom” :)). Much of it is about keeping it simple, i.e. the KISS principle, just like many of the concepts behind agile. But of course, just because we’re keeping it simple, doesn’t mean we are foolish about it (it’s keep it simple stupid not keep it stupid stupid), which is where number 17 comes in.

So where am I going with all of this, did the dudes who wrote Unix invent agile (or at least the underlying ideals), or software craftsmanship, or great software design? It’s unlikely, after all Don Knuth said:

“Premature optimization is the root of all evil.”

He said this in 1974 (the link is pdf) – but was likely thinking it for years and there were probably people before him as well. It all had to start somewhere, of course, as I said; software is a relatively young industry. But, it doesn’t really matter who the first person was that came up with these ideas, the fact that the ideas continue to figure prominently in newer concepts tells me that they are just basic common sense. When programming was born as a career, smart people came along, did the first project or two and quickly twigged on to what seemed to make sense and what didn’t. All the nonsense, confusion and failure was mostly from a bunch of other people trying to fit software development into moulds it wasn’t suited for – refusing to realise that this was an industry fundamentally unlike any that has come before and must have its own “rules” (rather than trying to inherit some from manufacturing or random other industries that “looked” similar). The saddest thing was that these people had power and could impose their will and make their ideas stick, which is why the whole industry went to hell for quite a while (even to this day), with signs of recovery only now beginning to emerge.

So Does Agile Get Credit For Anything?

Christianity and Islam are some of the most popular religions in the world. This should surprise you, since they are much younger than many other religions (especially Islam). Do you know why these religions are so popular? Because they were the first missionary religions, the first to actively seek out new converts for their faith. Most of the religions that came before, either did not actively welcome outsiders or were actively hostile to them. These religions are a testament to the fact that if you actively try to seek out converts and don’t give up, even due to major setbacks, you will slowly gain more and more followers and will eventually spread your message far and wide.

The agile movement was the beginning of the “missionary work” for these common sense software development concepts that we have now all come to know and love. Up until then, there was no focus, no organisation, and even when there was organisation, there was no attempt to seek out converts (other developers) and actively spread the message about how software should be built. The agile movement changed all that, which just goes to show that developers are just people and are willing to buy into something that seems sensible and hits close to home. But, the message itself was never revolutionary, just like the religious messages were never revolutionary. It was simply a bunch of common sense concepts that many good developers have followed for years.

What I am trying to say (besides some curious factoids and a history lesson :)), is this. Don’t faddishly follow “best practices”, remember rule 16 – there is no “one true way”. It doesn’t matter what tech you’re using, it doesn’t matter what the configuration of your task board is. The key is to surround yourself with good people (this really is key and deserves a separate post) and to always remain aware of the situation – then figure out what makes sense and just do that (even if what makes sense is to follow best practice or to do the exact opposite :)). Good programmers have always known the score.

Image by RobertBasil

««
«
2
3
4
5
6
7
8
9
10
11
12
13
14
»
»»