matt spitz

Thoughts on software development, New York tech, music, and various permutations of the aforementioned.

I scream, you scream, we all scream for… unit tests?

I’m going to go out and say it.  I love unit tests.  Hopefully, by the end of this post, you will, too.  I’m not going to cover the pros and cons of various frameworks in each language, the best way to mock out data dependencies, or even the secret to writing impeccable unit tests.  There are enough resources for that, and I hope that you’re inspired to seek them out.  In case you aren’t convinced that unit tests are the best thing since sliced bread, I’ve brought my old friends from Office 97, the Screen Beans people, to help me.

What’s a unit test, anyway?

A unit test is a little snippet of code that one writes to verify that some functionality one just implemented works as expected.  This test is never run in production, just when you want to confirm that some code you wrote before still works.

As an example, say you wrote this awesome function in Python.

def isAnApple(thing):
    return isinstance(thing, Apple)

You could write a corresponding test for that function (using Python’s unittest module).

def testAppleChecker(self):
    myApple = Apple()
    myOrange = Orange()
    self.assertTrue(isAnApple(myApple))
    self.assertFalse(isAnApple(myOrange))

If your isAnApple() function works, testAppleChecker() passes.  If it’s broken, testAppleChecker() fails.  Easy peasy, lemon squeezy.  You can even write a nice little script that runs all of your unit tests.  Or even a build rule that won’t deploy your code unless all unit tests pass!  But I get ahead of myself.
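For anyone who wants to run the example rather than squint at it, here’s a minimal, self-contained sketch.  The empty Apple and Orange classes, the AppleCheckerTest class name, and the runTests() helper are assumptions added just to make it runnable; they aren’t part of the snippets above.

```python
import unittest


class Apple:
    # Hypothetical stand-in; the original snippet never defines Apple.
    pass


class Orange:
    # Hypothetical stand-in; the original snippet never defines Orange.
    pass


def isAnApple(thing):
    return isinstance(thing, Apple)


class AppleCheckerTest(unittest.TestCase):
    def testAppleChecker(self):
        myApple = Apple()
        myOrange = Orange()
        self.assertTrue(isAnApple(myApple))
        self.assertFalse(isAnApple(myOrange))


def runTests():
    # Load every test method on AppleCheckerTest and run it.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AppleCheckerTest)
    return unittest.TextTestRunner(verbosity=1).run(suite)
```

Calling runTests() prints the usual unittest report and returns a result object whose wasSuccessful() method says whether everything passed.  In a real project you’d more likely keep the test class in its own file and run it with python -m unittest.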

But I just wrote all this code.  Why do I need to write a stupid test for it?  It’s such a waste of time.

Whoa there, buddy.  Calm down.  You’re right.  In my example above, I used 4 lines of testing code for the 1 line of actual code I wrote.  And what if I decide I don’t actually want isAnApple() anymore?  Or if I refactor it away?  Not only did I just waste all that time writing my unit test, but I’m going to waste even more time getting rid of it!

The truth is that the extra few minutes you spend keeping your unit tests up-to-date aren’t going to impact your productivity all that much.  As it turns out, for most developers, the majority of their time is spent thinking about the problem and designing a solution, not actually typing it out.  In general, writing tests for a chunk of code requires less thought than writing the code in the first place, and you should know how that code is meant to behave, so the tests should almost write themselves.  But in case you still think that your time is better spent on Reddit, I’ll list the Six Awesome Benefits of Unit Testing.

Benefit #1: Forcing yourself to write testable code

Decomposition, breaking your problem into small pieces, is one of the first lessons in an introductory programming course.  Incidentally, it’s one of the first lessons forgotten.  You write a function, and, as it needs to do more, you keep bolting on functionality to it until it gets unwieldy.  For example (again in Python), you start with something like this:

def f(parameters):
    doSomethingGreat(parameters)  # placeholder for the real work

And before too long, it ends up like this:

def f():
    cfg = getConfig()
    if cfg.setting:
        doSomethingAwesome()  # placeholder helpers throughout
    elif cfg.othersetting:
        doSomethingElse()
        return
    else:
        okOneLastThing()
    if cfg.shouldcleanup:
        cleanUp()

Your unit test, covering all possible execution paths of f(), gets even more noodly, and you can’t quite be sure if it even covers all the cases it needs to.  But because you’re Captain Unit Test, you break up f() into more manageable, testable pieces.  Your code is more readable and easier to test for correctness.  Everybody wins.
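Here’s one way that breakup might look; treat it as a sketch, not the one true refactoring.  The Config dataclass, chooseAction(), needsCleanup(), and the actions dictionary are all hypothetical names I’m making up for illustration.  Only the three config fields come from the sprawling f() above.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for whatever getConfig() returns.
    setting: bool = False
    othersetting: bool = False
    shouldcleanup: bool = False


def chooseAction(cfg):
    # Pure decision logic: no side effects, so every branch is easy to test.
    if cfg.setting:
        return "awesome"
    if cfg.othersetting:
        return "else"
    return "last"


def needsCleanup(cfg, action):
    # The original returned early from the "else" branch, skipping cleanup.
    return action != "else" and cfg.shouldcleanup


def f(cfg, actions, cleanUp):
    # Thin glue: look up the work to do, do it, then clean up if required.
    action = chooseAction(cfg)
    actions[action]()
    if needsCleanup(cfg, action):
        cleanUp()
```

Now chooseAction() and needsCleanup() can each be tested with a handful of one-line assertions, and f() itself only needs a test or two that pass in fake actions and check which ones were called.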

Benefit #2: Faster development iteration

Without unit tests, a typical development flow on a web application might go something like this:

  1. Work on some piece of your application.
  2. Fire up the server with a new build.
  3. Try to access the application in some way that uses the code you just wrote.
  4. Lather, rinse, repeat as desired.

Depending on the extent of your application, compiling and running your code may take some time, and it’s not always easy to prod your application in just the right way to be sure that it’s correct.  In some cases, testing each use case may involve restarting your server to give it a clean slate.

In contrast, you can write a unit test or two to cover each use case.  Instead of running the whole server, you can just run those tests and be guaranteed that it’s tested exactly as you specified.  And even better, when you write a new feature, you don’t have to go back and test every previous feature.  Assuming you write good unit tests, you can save time and just run the test suite.  So, all that extra time you spend writing tests in fact pays off as you iterate on your application!

Benefit #3: Built-in examples

So, you have a new developer on your team.  Or someone else wants to use that hot new library you just wrote.  Either way, you need to find some way to explain your code.  Your unit tests are simple examples of using the code you’ve written and are a great place to start for those who’re getting to know your codebase.  While you still may need to help out your new admirers, you’ve lowered the grade of the learning curve.

Benefit #4: Once-and-for-all bugfixes

OK, so a bug slipped through your unit tests.  Or perhaps it wasn’t a bug, but “unexpected behavior”.  Assuming it’s not bringing down your entire site this instant, don’t fix it yet.  Instead, write a unit test that exposes the bug.  Make sure that the unit test fails.  Then, fix the bug and ensure that the unit test passes.  This idea, borrowed from the process of test-driven development, is awesome for a few reasons.

First, that specific bug isn’t going to happen again.  You wrote a unit test that fails when that bug occurs and passes when it doesn’t.  Donezo.

Second, writing a unit test to expose the bug has taught you exactly what triggers it, perhaps revealing a better solution to the underlying problem.  For example, if the program chokes on the input “Gruß Gott!”, instead of simply disallowing Unicode (which may bother your international users), you might instead find out the proper way to handle Unicode.  You’ll have a long-term fix and be done with the bug for good!
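To make that concrete, here’s a hedged sketch of what the regression test might look like.  The encodeMessage() function and the ASCII-only bug are assumptions invented for this example; your actual Unicode bug will have its own shape.

```python
import unittest


def encodeMessage(text):
    # Hypothetical function under test.  Assume the buggy version called
    # text.encode("ascii"), which raises UnicodeEncodeError on the "ß".
    return text.encode("utf-8")


class EncodeMessageRegressionTest(unittest.TestCase):
    def testHandlesGermanGreeting(self):
        # Written first, while the bug still existed, and confirmed to fail.
        encoded = encodeMessage("Gruß Gott!")
        self.assertEqual(encoded.decode("utf-8"), "Gruß Gott!")

    def testStillHandlesPlainAscii(self):
        # Make sure the fix didn't break the inputs that already worked.
        self.assertEqual(encodeMessage("Hello!"), b"Hello!")
```

The first test stays in the suite forever.  If someone ever “simplifies” encodeMessage() back to ASCII, it fails immediately.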

Benefit #5: Developing good internal APIs

One true downside of having a comprehensive test suite is that a new parameter to the critical path can have you updating a lot of your tests.  This is, indeed, a huge pain.  But once you have to do it the first time, you’re much more careful about having well-designed internal interactions among the components of your application.  Ideally, your application is decomposed to the point where rewriting the interface to a particular component doesn’t have a cascading effect on the codebase.  Investing in writing thorough unit tests for each component pushes you to design it so that its interface rarely needs to change.

Benefit #6: Comprehensive regression tests

Sometimes, you have to make that sweeping change to rearrange the components of your application.  Perhaps your codebase represents many developer-months of work, and you’re not sure if you just re-introduced all of those weird corner cases you’ve fixed and forgotten.  If you’ve been diligent about covering every nook and cranny of your application with unit tests, you can be reasonably confident that when those unit tests pass, your application is as correct as it was before the change.

That doesn’t just go for large changes.  Every incremental change can affect other pieces of the application, no matter how well-decomposed it is.  Particularly when working with a team, having a complete regression test to run with every new feature is a great comfort.  Even better, if the entire suite is run automatically before allowing every code check-in, you can rest assured that whatever’s in the repository is clean and ready for release.
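A “run everything” script can be quite small.  The sketch below leans on the standard unittest discovery mechanism; the runAllTests() and main() names, and the assumption that your test files are named test*.py, are mine.

```python
import sys
import unittest


def runAllTests(startDir="."):
    # Collect every test in files named test*.py under startDir.
    suite = unittest.defaultTestLoader.discover(startDir, pattern="test*.py")
    result = unittest.TextTestRunner(verbosity=1).run(suite)
    return result.wasSuccessful()


def main(startDir="."):
    # A nonzero exit code is what lets a hook or build rule refuse the change.
    sys.exit(0 if runAllTests(startDir) else 1)
```

Call main() from whatever runs before a check-in or a deploy, and have that step bail out when the exit code isn’t zero.  How you wire it in depends on your version control and build tools.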

Beware!

Don’t get too cocky.  Your unit tests are only as good as their author.  If your tests aren’t written correctly or don’t cover all possible execution paths, you can still have bugs in your code.  Unit tests are not a replacement for end-to-end tests or your QA team.  Weird things can happen when all the pieces of your application are running together with production data, and you can’t write a unit test for everything.  But you, and the long-term development of your application, are much better off for having them!

On the value of recommendations

Recommendations have been all the rage for the last couple of years now, and since I’ve been working for Hunch (now eBay), I’ve been thinking about why they’ve become necessary and the value good recommendations bring to the piles of content on the Internet.  Searching for information we’re looking for is basically a solved problem, and it’s relatively easy to carve up existing datasets such that they’re easily consumed.  Surfacing interesting content proves to be a problem, though.  Take Yelp, for example.  If I want to find out when Wildwood BBQ (a fine watering hole, if I do say so myself) is open, I search for Wildwood BBQ, and it’s the first result.  I learn not only when they’re open, but what kind of parking is available, whether they are wheelchair accessible, the nearest subways, and all sorts of wonderful things.  A+.  If I didn’t know about Wildwood BBQ but instead were interested in all of the places near Union Square in the $$ price range that served beer and had a TV, Wildwood BBQ would show up.  But here comes the kicker.  This is New York.  There are a ton of relatively cheap places to drink and watch TV.

Now the problem becomes deciding among the available options, and for the most part, this task is up to me.  Yelp is a very popular service, but unfortunately, this means that all kinds of people leave reviews, many of whom I probably disagree with.  So, the average Yelp rating isn’t all that helpful.  At this point, if I really care about where I go, I have to dig through the reviews for each place, deciding, based on the user’s photo, name, and review text, whether I trust that person and his/her rating for each venue.  It’s rather tiring and has greatly reduced the value of Yelp for me.  In reality, most of these places would serve my needs just fine, so I probably could safely choose a bar at random, but on the off-chance that it doesn’t work out, I have only myself to blame.  I don’t mean to pick on Yelp, as they see this issue and have the beginnings of a decision-making feature, but it’s a pretty good illustration of the need for recommendations.

There are basically two ways to surface interesting content for users: by subscription and algorithmically.

Subscription-based content has been around for a while, and there are basically two ways to go about it.  Social services like Facebook and Twitter allow me to subscribe to content posted by certain people.  The Yelp analogy would be subscribing to restaurant reviews by a particular person, which exists in the form of user profiles.  This assumes that I trust everything a particular user likes.  To continue the Yelp analogy, while I may enjoy the sushi recommendations from a particular user, I may not be interested in their Indian food adventures in Murray Hill or their recommendations outside of New York City.  Such is the story of my Twitter feed.

The other subscription-based approach is what sites like Reddit utilize.  Rather than subscribing to people, I subscribe to particular topics.  When articles are posted to the “programming” subreddit, fellow users curate them by voting them up and down, and the “good” articles bubble up to the top.  This puts the burden on the user to specify what they like.  An analogy on Yelp would be subscribing to reviews for all sushi restaurants in the New York City area.  This is a good start, but we now have a problem of specificity.  The more specific the interest, the better the recommendations are, but the fewer people there are that have that particular interest.  For example, if I’m interested only in sushi restaurants that serve Sapporo on draft in the West Village, either I find myself in a very small group of people with no recommendations, or I find myself in a more general group, filtering out all the restaurants I’m not interested in by hand.  Still, not ideal.

The new hotness in the last few years has been algorithmic recommendations, as evidenced by the number of companies actively implementing them.  This approach uses implicitly-expressed preferences (in the form of article views, item purchases, reviews written, etc.) to predict my specific interests.  Amazon, for example, will recommend similar or complementary items based on other users’ purchases.  The users themselves don’t explicitly tell Amazon what they like.  Instead, their actions dictate what Amazon thinks about them.  Consider that the next time you purchase the new Miley Cyrus album “for your niece”.  Amazon isn’t alone here.  Nearly every company with data is trying to do this with their data to surface good products, articles, advertisements, etc.

Even with algorithmic recommendations, there are two ways to go about it.  Amazon, it appears, focuses primarily on the item similarities.  That is, if I’m looking at a camera, they might recommend lenses because the people who’ve bought that camera tend to buy lenses.  Users who are new to Amazon get the same value out of these recommendations as those who’ve actively used the service for years.  The alternative is to use personalized recommendations, which I don’t believe Amazon focuses on to the same extent.  For example, if I’m looking at a particular camera, a personalized system might show me red cases with little pictures of guitars on the sides because I buy red things and guitar accessories.
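To give a flavor of the item-similarity idea, here’s a toy sketch that counts which items show up in the same order.  This is emphatically not Amazon’s algorithm, which I can only guess at; the function names and the sample orders are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def buildCoPurchaseCounts(orders):
    # For each item, count how often every other item appears in the same order.
    counts = {}
    for items in orders:
        for a, b in permutations(set(items), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommendFor(item, counts, limit=3):
    # Suggest the items most often bought together with the given one.
    return [other for other, _ in counts.get(item, Counter()).most_common(limit)]


# Made-up purchase history: each inner list is one customer's order.
orders = [
    ["camera", "lens"],
    ["camera", "lens", "tripod"],
    ["camera", "case"],
    ["guitar strings", "case"],
]

counts = buildCoPurchaseCounts(orders)
print(recommendFor("camera", counts, limit=1))  # prints ['lens']
```

Notice that nothing here depends on who is looking at the camera, which is why a brand-new user gets the same suggestions as a longtime one.  A personalized approach would also need some record of my own history to re-rank those candidates.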

To me, the biggest value-add of algorithmic recommendations is that they minimize the amount of work that I have to do to make a decision.  Deciding among bars in Union Square is something that Yelp compels me to do, but it isn’t necessarily the best use of my time or brain space, particularly for such a low-importance adventure.  Similarly, I don’t necessarily care what brand or thickness of socks that I wear, so I’m not likely to second-guess a recommendation given to me.  It’s not always the “best for me”, but sometimes, not having to think makes it all worth it.

The flipside of algorithmic recommendations is that it’s often hard to explain why certain things are recommended.  For example, Amazon may recommend a certain pair of headphones over another because people from New York who also browse Amazon on Thursday evenings and have similar click-streams to me have bought them, but that’s particularly difficult to explain, and depending on the algorithm Amazon is using, it may not even be able to do so.  In contrast, if I pick whose recommendations I follow or the interest groups to which I’ve subscribed, why I see a particular recommendation is obvious.

So, as with many things in life, there are a ton of options when it comes to recommendations, and each has its own advantages and disadvantages.  But the value of recommendations is very clear, given how much companies are invested in producing quality recommendations.  Where search defined Web 1.0 and social has defined Web 2.0, discovery is likely to be remembered as the next focus of the Internet.  I’d call it Web 3.0, but that’s hackneyed, to say the least.