Sunday, March 06, 2005

Encapsulation, Performance, and "DUH!"

I've been teaching a fellow programmer the ways of the square bracket knights (I am but a mere squire, I know; please bear with me!), and he showed me some code that looked like the following:
    | sorted |
    sorted := SortedCollection sortBlock: [:a :b | a greatAttribute <= b greatAttribute].
    someHugeFile linesDo: [:each | sorted add: (CoolObject from: each)].
    sorted inspect

Nothing wrong with that code, right? Well, my gut instinct was to rewrite the code thusly:
    | toBeSorted |
    toBeSorted := OrderedCollection new: someHugeCollection size.
    someHugeFile linesDo: [:each | toBeSorted add: (CoolObject from: each)].
    (toBeSorted asSortedCollection: [:a :b | a greatAttribute <= b greatAttribute]) inspect

The only real difference here is that I collect everything into an OrderedCollection first and sort once at the end, instead of letting the SortedCollection find the right slot on every add:. As I started to mention this to my friend, I heard the quote, "Premature optimization is the root of all evil!" I stopped myself cold. I knew the only reason for my change was performance.

The next thought I had was, "Why does SortedCollection keep itself sorted all of the time?" I mean, it is an object. Why not sort only when it is needed (when you send do:, collect:, reject:, etc.)? Then I could use a SortedCollection without having to worry about the cost of sorting a large collection piecemeal. It turns out to be a trivial change: simply override do:, add a flag to mark when sorting is needed, and that's it! We get the performance improvement and the code above stays like it was originally, the way it should be. The user of the object doesn't need to worry about performance; the object does that work for them.
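
Here is a minimal sketch of that idea, not the actual change to SortedCollection. To keep it self-contained it wraps an OrderedCollection instead of patching the system class. The class name LazySortedCollection, the helper setSortBlock:, and the instance variables are all made up for illustration, and it assumes a Pharo-style class definition and a dialect where OrderedCollection understands sort:. Only add: and do: are shown; collect:, reject:, first, and friends would need the same "sort if stale" check.

    "Hypothetical class: defers sorting until the elements are enumerated."
    Object subclass: #LazySortedCollection
        instanceVariableNames: 'items sortBlock needsSort'
        classVariableNames: ''
        package: 'Sketch'

    LazySortedCollection class >> sortBlock: aBlock
        "Mirror the SortedCollection creation message used above."
        ^ self new setSortBlock: aBlock

    LazySortedCollection >> setSortBlock: aBlock
        items := OrderedCollection new.
        sortBlock := aBlock.
        needsSort := false

    LazySortedCollection >> add: anObject
        "Append at the end and mark the order as stale. No sorting happens here."
        items addLast: anObject.
        needsSort := true.
        ^ anObject

    LazySortedCollection >> do: aBlock
        "Sort once, only if something was added since the last sort, then enumerate."
        needsSort ifTrue: [
            items sort: sortBlock.
            needsSort := false].
        items do: aBlock

With that in place, the calling code keeps the shape of the original snippet:

    | sorted |
    sorted := LazySortedCollection sortBlock: [:a :b | a <= b].
    #(3 1 2) do: [:each | sorted add: each].
    sorted do: [:each | Transcript show: each printString; cr]

The three add: sends only append, and the single sort happens on the first do:.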
