Monday, March 28, 2011

More Bullshit Code!

This code is actually from another project. I'll eventually use it in the troll project. It's kind of simple but it's just so damn awesome to me from a geek perspective:

What does it do?

Answer: Anyone who has been around long enough to have suffered through writing a report in COBOL may remember the ancient concept of a "control break": simply the point where, upon reading a record, you test whether the condition has been met to print a sub-total, accumulate a grand total, etc. In this code, I've got a bunch of files in a directory. I select the files only (excluding anything else, like subdirectories) and sort those files by their date/time stamp. Then I produce a report tallying the files by the day they were placed in this directory. The control break happens right in the middle of the code snippet above, in the "if" clause. Could you ever "push" something in COBOL? The end result of this code is an array of hashes. Any number of table formatters (like the one I use for the troll project) can spit out something halfway readable. Like so:
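For anyone who can't make out the picture, here's a minimal sketch of the same idea. It is not the original code: the method name files_per_day, the :date and :count keys, and the directory argument are all made up for illustration.

```ruby
# Tally files per day using an explicit "control break".
# `dir` is a hypothetical directory path; :date and :count are assumed key names.
def files_per_day(dir)
  files = Dir.children(dir)
             .map { |name| File.join(dir, name) }
             .select { |path| File.file?(path) }   # files only, no subdirectories
             .sort_by { |path| File.mtime(path) }  # oldest first

  files.inject([]) do |report, path|
    day = File.mtime(path).strftime("%Y-%m-%d")
    if report.empty? || report.last[:date] != day  # control break: a new day starts
      report.push({ date: day, count: 1 })
    else
      report.last[:count] += 1                     # same day: bump the running tally
    end
    report                                         # inject needs the accumulator back
  end
end
```

Something like files_per_day("/some/dir").each { |row| puts "#{row[:date]}  #{row[:count]}" } would then print one line per day, assuming that directory exists.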


The geek value is in how, with so few lines of code these days, you can express something that required so much more sweat in the bad old days. What is even funnier is that you could have done something like this even back then - with Lisp. I wonder if some shop somewhere in those days used Lisp for mundane data processing. Probably only a handful - memory cost a huge amount of money in those days.

I was watching a talk by Ruby superstar Aaron Patterson and he said this is much faster:

Inject is expressive but apparently slow. I haven't had a chance to benchmark this particular code.
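I can't promise this matches what was in the picture, but one common way to drop inject is a plain each loop that fills a local array. Here's a rough sketch that runs the control break both ways over in-memory data and times them with Ruby's standard Benchmark module. The method names and the sample data are invented, and I'm not claiming any particular timings.

```ruby
require "benchmark"

# Same control break written two ways, over an already-sorted list of day
# strings so the file system doesn't get in the way of the timing.
def tally_with_inject(days)
  days.inject([]) do |report, day|
    if report.empty? || report.last[:date] != day
      report.push({ date: day, count: 1 })
    else
      report.last[:count] += 1
    end
    report
  end
end

def tally_with_each(days)
  report = []
  days.each do |day|
    if report.empty? || report.last[:date] != day
      report.push({ date: day, count: 1 })
    else
      report.last[:count] += 1
    end
  end
  report
end

# Made-up sample data: 365 "days" with 1,000 entries each.
days = (1..365).flat_map { |n| [format("day-%03d", n)] * 1_000 }

Benchmark.bm(8) do |bm|
  bm.report("inject:") { tally_with_inject(days) }
  bm.report("each:")   { tally_with_each(days) }
end
```

Both methods return the same array of hashes, so any difference in the report comes down to the iteration style.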

Inevitable humbling update: I was watching another videocast and what I'm doing above has a simple name: a histogram. The vidcaster of course had a butt simple way of doing the above. No fancy array of hashes, pushes, maps or injects - just one hash:

Hashes keep their insertion order in Ruby 1.9.2, so the end result comes out OK. Older Rubies take a bit more work. You learn something new every day.
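Here's a minimal sketch of that one-hash approach. It is not the vidcaster's actual code; the method name files_histogram and the directory argument are my own placeholders.

```ruby
# Histogram of files per day using a single hash with a default value of 0.
# `dir` is a hypothetical directory path.
def files_histogram(dir)
  counts = Hash.new(0)                             # missing keys start at zero
  Dir.children(dir)
     .map { |name| File.join(dir, name) }
     .select { |path| File.file?(path) }           # files only
     .sort_by { |path| File.mtime(path) }          # oldest first, so keys land in date order
     .each { |path| counts[File.mtime(path).strftime("%Y-%m-%d")] += 1 }
  counts
end
```

Because the files are sorted before counting, the hash keys get inserted in date order, which is what makes the ordered-hash behavior useful here.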