Eugene Ciurana Official Site

UNIX: When Sloppy Works Better

Sloppy UNIX command usage can yield better results than mastery.  This could be a huge win in DevOps situations.  The best way to evaluate an implementation is to check that it works, does the job fast, is easy to understand and maintain, and is fast to implement.  This post is about how a sloppy solution worked better than the elegant, guru-level solution that I got from someone with better UNIX kung f00 than mine.

I needed to count every RTF file on my Mac workstation's mounted file systems.  I've been using UNIX-like operating systems since 1986 (full time since 1990), but some commands I can just never remember.  Therefore, I rolled with this:

sudo find / | \
    awk '/\.rtf$/ { n++; } END { printf("%d\n", n); }'

Simple, it does the job...  and it made a UNIX guru friend cringe.  "Dude, just use the find command switches and count the lines.  That's horrible."  After a little man page digging, I came up with this:

sudo find / -type f -name "*.rtf" | wc -l

The command seemed to take a lot longer to complete.  I blamed some inefficiency in wc, and this time I timed the commands:

time sudo find / -type f -name "*.rtf" | wc -l
    4140
real 2m12.820s

Against the sloppy version:

time sudo find / | awk '/\.rtf$/' | wc -l
    4140
real 1m0.758s

The "sloppy" version is more than twice as fast, even with the extra pipe through wc!
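One caveat with a single pair of timings: whichever command runs first may pay for reading directory metadata from disk, while the next one can benefit from the file system cache. A minimal sketch of a fairer comparison follows, assuming a POSIX shell. The function names and the warm-up pass are illustrative assumptions, not part of the original experiment, and date +%s only gives whole-second resolution.

```shell
#!/bin/sh
# Directory to scan. Defaults to the current directory; pass / (and run the
# script under sudo) to repeat the experiment from the post.
scan_root="${1:-.}"

# The two pipelines from the post, wrapped in functions (names are made up).
count_with_awk()  { find "$scan_root" 2>/dev/null | awk '/\.rtf$/' | wc -l; }
count_with_find() { find "$scan_root" -type f -name "*.rtf" 2>/dev/null | wc -l; }

# Run a command, discard its output, and print the elapsed whole seconds.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Warm-up pass: walk the tree once so both timed runs see a similar cache state.
find "$scan_root" >/dev/null 2>&1

awk_secs=$(elapsed count_with_awk)
find_secs=$(elapsed count_with_find)
echo "awk filter: ${awk_secs}s, find filter: ${find_secs}s"
```

Running it a few times, and swapping the order of the two timed calls, gives a better picture than one run of each.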

Conclusion

I always hated using find, and this little experiment gave me two more reasons for not bothering to learn all its command line switches:

  • Speed and correctness of execution trump elegance - I don't give a shit if find's switches are "better" - they are hard to remember, and in this test they made the search take more than twice as long
  • Speed of delivery trumps elegance - The sloppy command is easier to remember, easier to type, and it gave me the results I was looking for in less than half the time

This is just one example of how sloppy UNIX usage can work better than expert level mastery.  Keep this in mind next time some purist tells you, "but that's not the right way!"  Test which way works better, then decide if you want to continue writing sloppy commands or if you need to learn "the best way".  Your yardstick, in both cases, is which gives the most accurate result in the shortest possible time.
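Checking accuracy can be as cheap as running both pipelines against a small scratch tree whose contents are known. The sketch below assumes a POSIX shell with mktemp available; the directory and file names are made up for illustration. It also shows one case where the two commands disagree: a directory whose name ends in .rtf.

```shell
#!/bin/sh
# Scratch tree so the comparison does not touch the real file system.
# All names below are hypothetical.
scratch=$(mktemp -d)
mkdir -p "$scratch/docs/archive.rtf"    # a directory whose name ends in .rtf
touch "$scratch/docs/a.rtf" "$scratch/docs/b.rtf" "$scratch/docs/notes.txt"
touch "$scratch/docs/archive.rtf/c.rtf"

# Pattern match on every path find prints (files and directories alike).
sloppy=$(find "$scratch" | awk '/\.rtf$/ { n++ } END { print n + 0 }')

# Let find itself filter on entry type and name; strip wc's padding.
strict=$(find "$scratch" -type f -name "*.rtf" | wc -l | tr -d '[:space:]')

echo "sloppy=$sloppy strict=$strict"    # prints: sloppy=4 strict=3
rm -rf "$scratch"
```

On the workstation in this post both commands reported 4140, so the difference did not matter there, but it is worth a quick check before trusting either count elsewhere.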

By the way:  the O'Reilly sed & awk book features slender lorises on the cover.

Cheers!
