Thursday, November 6, 2014

When simple programming tasks become epic trials simply because you don't know what you're doing

I want to get some data from one part of my web program to another.  What I thought would be a trivial little task has turned into 7 and a half hours of work over the last week (counting blogging about it), which is not much shorter than the entire Lord of the Rings trilogy on Blu-Ray.  I'm going to tell you this saga because it is a very common happening with computer programming: you think you are (metaphorically) popping out for five minutes to drop your Netflix in the mailbox and grab some Mountain Dew and realize, days later, as you dodge orcs and hide from Nazgul, that you are guiding two hobbits with some Fingerhut ring and a stalker problem across Middle Earth and into Mordor.  This phenomenon plays hell with estimation, drives non-programmers batty, and goes a long way to explaining why there can be a 10x or 100x difference in productivity between two programmers.  We're going to go through my sad little example to try and give you, a non-programmer, a helpful peek into the abyss.  Don't worry, I'll blur all of the computer code.

Let's begin with a trailer featuring a scene from the third movie, when Sam and Frodo are dodging Shelob, the giant spider.  In my version, Frodo is played by the number 28, and Sam by 29.
<QueryDict: {u'notes': [u''], u'csrfmiddlewaretoken': [u'JJUhgrMzA3gtsMPnmzUsJYn4J6doONt4'], u'name': [u''], u'tags': [u'28', u'29']}>
See them stuck in Shelob's web?  I try to get them out of there using this magic incantation:
(Pdb) print self.request.GET["tags"]
And all I get is:
29
Obviously Sam can't go on without Frodo.  Our story cannot proceed.  Disaster!  Or, to abandon the increasingly tortured and confusing metaphor, I'm trying to complete Task 154, which visually is this:


This is a task from WhatNext, the nihilistic task management website that I've been programming as a hobby, in between the knitting.  It's a task that has completely stalled progress for, as I've said, a week.

Our story begins with this innocent-seeming web page.  This is an example of the WhatNext task list:


See the little up and down arrows?  When you click, the task should move up or down.  And that works.  But.  I also have these nice tags, so that you can look at only a little bit of your list:

And I need to keep track of those tags for almost everything, or else things go wonky.  Which means, in the grossest terms, that that bit highlighted in the red box needs to go down where that red arrow is pointing.  AND IT WON'T GO THERE.

There are two kinds of programming: programming that does something never done before, and the other kind, re-inventing the wheel.  In a sense all programming is the first kind, because if something already does what you need, why not just use that?  There should never be a need for the second kind.  But in practice there are obstacles large and small.  For example, the program you need may be very expensive and you only need a fraction of what it does, so you take a cheaper program and try to extend it a little bit.  Or you have a Mac and the program that does what you need is only for Commodore 64.  This dilemma applies when you choose a program to use, and it applies when you are programming.  All programming is done on a foundation of what has come before; you don't typically rewrite an operating system, or email, or the alphabet.  And when you are creating a data-backed website, you are writing the same kind of program that everybody's been writing for two decades, which is only a slight change to client-server programs for decades before that, all the way back to mainframes and terminals.  So most of the idioms and patterns and best practices have been codified long since, in the form of standards but particularly in the form of languages and platforms and libraries.  One need only choose and master a platform in order to start out, in theory, 90% finished, with only the particulars of your situation remaining to be programmed.  So programming is typically not inventing a new solution to a problem; it's figuring out the most efficient ways to apply existing solutions to old problems for new customers.

Which is why it's so unbelievably frustrating to get stuck trying to do something in Django so obvious and common as taking some parameters from the incoming URL and sticking them in the outgoing URLs so that the user does not lose their metaphorical place when they click on things.  There are some best practices for handling URL parameters, to prevent problems like mangling the spaces between words, or allowing a Ukrainian rebel hacker to steal your credit card and denounce the West on your home page, but those just make it more likely that Django (the platform I've chosen for my program) would have this functionality built in. Which means that I've spent most of the 7+ hours googling for examples that are close to my problem, solve a piece of my problem, but don't quite solve all of it.  Django provides many, many ways to do anything.  Which means it doesn't provide a right way to do something, only ways with differing degrees of appropriateness.

Let's get specific.  First, I have to capture any data the user provides in the URL.  That is, if the incoming URL says that tags 28 and 29 are in use, I need to store "28" and "29" somewhere.  Then, I have to make sure they are really plain numbers, and not some cryptic bit of code that tricks my computer into divulging the launch codes.  Then I have to prepare them for the outgoing URLs, which in this case means turning them into the sequence of characters &tags=28&tags=29.  And then I have to stick that sequence of characters at the end of all of the URLs on the page, such as the "Move Down" button and the "Move Up" button.
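For readers who would like those four steps unblurred, here is a minimal sketch in plain Python 3 using only the standard library. It is not Django code and not the WhatNext code; the function name carry_tags and the example URLs are made up for illustration, and the "is it a plain number" check is the simplest one that could work.

```python
from urllib.parse import parse_qs, urlencode


def carry_tags(incoming_query, outgoing_url):
    """Copy the numeric `tags` values from an incoming query string onto an outgoing URL.

    Hypothetical helper for illustration only.
    """
    # Step 1: capture every value supplied for "tags" (repeats are kept as a list).
    raw_tags = parse_qs(incoming_query).get("tags", [])

    # Step 2: keep only values made entirely of ASCII digits; drop anything else.
    safe_tags = [tag for tag in raw_tags if tag.isascii() and tag.isdigit()]
    if not safe_tags:
        return outgoing_url

    # Step 3: turn the list back into the characters "tags=28&tags=29".
    encoded = urlencode({"tags": safe_tags}, doseq=True)

    # Step 4: stick those characters on the end of the outgoing URL.
    separator = "&" if "?" in outgoing_url else "?"
    return outgoing_url + separator + encoded


# Made-up URL for a "Move Down" button:
move_down_url = carry_tags("tags=28&tags=29", "/task/154/down/?page=1")
```

Anything that is not a plain number is silently dropped in this sketch; a real site might prefer to reject the whole request instead.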

I know how to do almost all of this. I've done it before in this program, everything but sanitizing the input, and I've glimpsed bits of that so I'm confident I'll find tools for it in the platform.  Nothing here should take more than five minutes, so I'll round up and say the whole task ought to be done in an hour.

First, I have to catch the data. The standard place to capture incoming variables in Django is urls.py.  However, this makes it clear that I can't use it because the technique is not appropriate for parameters that "chang[e] the way the resource is displayed".  Another programmer might decide to go ahead and use this technique anyway, and that might work for them, or they might hit a different snag later.  Or it might work now but then cause a problem for the next programmer in a few years, when something else changes.  I'm going to play it straight and try to find the preferred way to do this.

So I look for another way to catch my 28 and 29, and find it, but when I put that code into my program, nothing happens.  Nothing happens is probably the second-most common outcome of any change to a computer program, beaten only by error messages.  Why doesn't anything happen?  I Google, and find another question along the same topic, and the answer should clarify some things for you:
... they are at the class level and are class variables. As for the NameError, where are you trying to do year = self.kwargs['year']? You should be doing it in a method, you can't do it at the class level. 

I think that that means that you can't use these particular commands in the main body of a "class", which is a particular type of mini-program within my program (which, let's recall, is itself inside of another program, called Django, which is effectively inside of another program, called Python, which ... reminds me of a movie called eXistenZ.

Do you remember that movie? SPOILERS! It came out the same year as the Matrix and The Thirteenth Floor. In the Matrix, it turns out that SPOILERS our reality is just a computer simulation. In The Thirteenth Floor, researchers get stuck in their computer simulation, but when they get back to reality it turns out that they in turn are just part of SPOILERS a computer simulation. So that's three layers to the Matrix's two. Add a layer of complexity, shrink the box office by an order of magnitude. In eXistenZ, you simply lose count of the layers of reality, so it didn't make much money at all. In Python, each bit of program tends to run inside of another bit of program, and you can interrupt the program and run a command to show you all of the layers of program that are in effect, so that you can track down problems. Here's what you get with my program at the point of error:
(Pdb) where
  /usr/lib/python2.7/threading.py(783)__bootstrap()
-> self.__bootstrap_inner()
  /usr/lib/python2.7/threading.py(810)__bootstrap_inner()
-> self.run()
  /usr/lib/python2.7/threading.py(763)run()
-> self.__target(*self.__args, **self.__kwargs)
  /usr/lib/python2.7/SocketServer.py(593)process_request_thread()
-> self.finish_request(request, client_address)
  /usr/lib/python2.7/SocketServer.py(334)finish_request()
-> self.RequestHandlerClass(request, client_address, self)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/core/servers/basehttp.py(126)__init__()
-> super(WSGIRequestHandler, self).__init__(*args, **kwargs)
  /usr/lib/python2.7/SocketServer.py(649)__init__()
-> self.handle()
  /usr/lib/python2.7/wsgiref/simple_server.py(124)handle()
-> handler.run(self.server.get_app())
  /usr/lib/python2.7/wsgiref/handlers.py(85)run()
-> self.result = application(self.environ, self.start_response)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/contrib/staticfiles/handlers.py(67)__call__()
-> return self.application(environ, start_response)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py(206)__call__()
-> response = self.get_response(request)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/core/handlers/base.py(112)get_response()
-> response = wrapped_callback(request, *callback_args, **callback_kwargs)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/views/generic/base.py(69)view()
-> return self.dispatch(request, *args, **kwargs)
  /home/ao3/ao3/local/lib/python2.7/site-packages/braces/views/_access.py(64)dispatch()
-> request, *args, **kwargs)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/views/generic/base.py(87)dispatch()
-> return handler(request, *args, **kwargs)
  /home/ao3/ao3/local/lib/python2.7/site-packages/django/views/generic/list.py(152)get()
-> context = self.get_context_data()
> /home/ao3/ao3/whatnext/views.py(96)get_context_data()
-> return context

Each -> is another layer of context, a bit of previously existing program that is part of the platform of Django and does some essential bit of the process of serving up a database-backed web page. So part of my time is spent checking the most superficial layer, and maybe one or two layers below that, to see what's going on and look around for where 28 and 29 might be. But I also spent a few hours (not included in the 7.5) getting the development site working on my laptop, so I could work on this in coffee shops and airports, and then on my laptop this particular debugging tool wasn't working right, so that was another thirty minutes fussing. And as part of getting the site running on my laptop, I went to use a backup copy of the data to put on my laptop and realized that I didn't have automatic backups running, so that was another thirty minutes to get that going, and if I'm conscientious, another 30 minutes in a week or so to make sure backups are actually happening.)

So, three hours of parenthetical work later, I figured out from reading the internet that the command that retrieves 28 and 29 has to be used inside of a method, which is a mini-program inside of the mini-program. So, that's a hurdle well cleared. Now, I have to figure out which mini-program in my mini-program is the right kind.  That is to say, of the many mini-programs that Django developers have added and made idiomatic, I have to figure out which one they intended to be used in this circumstance.  In this case, I *think* it's get_context_data, because of yet another helpful question.
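The "inside of a method" rule can be shown without Django at all. What follows is a small sketch in plain Python 3, assuming a made-up class called TaskListView that only imitates the shape of a view; the attribute name request_params is hypothetical too. The point is only that the class body runs once, when the class is defined, before any self exists.

```python
# Hypothetical stand-in for a view class; plain Python, not a real Django view.
class TaskListView:
    def __init__(self, request_params):
        # Per-request data only exists once an object has been created.
        self.request_params = request_params

    def get_context_data(self):
        # Inside a method, `self` is available, so the lookup works.
        return {"tags": self.request_params["tags"]}


# The same lookup written directly in the class body fails, because the body
# runs at definition time, when there is no `self` to ask.
try:
    class BrokenTaskListView:
        tags = self.request_params["tags"]
except NameError as error:
    class_level_error = str(error)

view = TaskListView({"tags": ["28", "29"]})
context = view.get_context_data()
```

Whether get_context_data is really the method Django intends for this job is a separate question, and the sketch does not settle it.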


Also, at this point one is generally tense, bored, and frustrated, and a little stupid, so something like "Don't mix *args and **kwargs in call to reverse()!" becomes very funny: I guess although *args and **kwargs get along fine in normal conditions, in reverse() they fight and, since the **kwargs are bigger and have more asterisks, they tend to eat all of the *args.  (That's a kwarg to the right, although it may actually be Kwarg itself.)

So I think I know which mini-program I should put my commands in to get 28 and 29, but then I read that Django’s generic class based views now automatically include a view variable in the context, so maybe I don't need that first step at all, that's already done automatically and I can skip to the second step, which if you'll recall was to make sure they are safe, innocent numbers? Except that, before I can move on to that, I get stuck because when I ask for my data in exactly the right place, using what I think is the right magic command, I just get one number, not both. You may remember this scene from the trailer.

(And, to be completely clear, 28 and 29 are just stand-ins for many possible numbers, depending on what a user enters.  I'm just assuming that, if I can get 28 and 29 through to where they need to be, it will work for other numbers.  But I'm lucky that I picked two numbers, not one, to be my sample data, because if I'd picked just 28, for example, then I never would have noticed that a hypothetical second, third, or later number would just disappear.  That is, I wouldn't notice until weeks later, when I had already been using my program and any task that had one tag would move just fine, but any task with two or more tags would, when moved, go to the wrong place, and how many times would that happen before I noticed, much less diagnosed the problem?  What other potential problems am I not uncovering now because of my choice of 28 and 29 as my working test numbers?)

The reality check that keeps creeping into my brain is, surely many people have solved this exact problem before me, using this exact program?  I'm not doing anything special.  Why can't I find any giant's shoulder to stand on?  So this becomes not a programming problem, but a problem of finding the right query to lead me to a previous example.  Finally, I try "django get list from querydict", which leads to a clue (emphasis added):
QueryDict.iteritems() uses the same last-value logic as QueryDict.__getitem__(), which means that if the key has more than one value, __getitem__() returns the last value.

Which is helpful, because it's talking about exactly the place where I'm stuck. And, more importantly, it's clued me in to the fact that the tool I'm looking for, the thing built in to Django to make this problem easy, has been right under my nose the whole time. It's QueryDict. I haven't solved my problem, but I've found the right tool to use, so I guess I'll just go and read up on that now. And this, amigas and amigos, is what may be happening when your programmer goes off to do something that even they think is simple but, days later, when you ask how it's going, they just snarl and throw cheetos at you. 
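For the curious, the last-value behavior in that quote can be imitated in a few lines of plain Python 3. This is a toy stand-in written for this post, not Django's QueryDict: the class name LastValueDict is invented, and it copies only the two behaviors that matter here. The real QueryDict does document a getlist() method for fetching every value of a key, which is presumably where the reading-up will lead.

```python
from urllib.parse import parse_qs


class LastValueDict(dict):
    """Toy imitation of last-value lookup; NOT Django's QueryDict."""

    def __getitem__(self, key):
        # Square-bracket lookup hands back only the last value stored for the key.
        return super().__getitem__(key)[-1]

    def getlist(self, key):
        # Asking for the list hands back every value, in the order received.
        return list(super().get(key, []))


params = LastValueDict(parse_qs("tags=28&tags=29"))

only_sam = params["tags"]              # Frodo goes missing
both_hobbits = params.getlist("tags")  # nobody left in the web
```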

This story also illuminates the discrepancy in performance between different programmers.  I know in fairly specific terms what I want to happen, but have no idea how to make it happen in this particular platform, so I'm spending all of my time learning the platform, painfully.  A "1-hour" task is shaping up to be 10 hours or more.  Another programmer might not know this platform but might have developed much stronger instincts for data-backed web idioms, and would find the tools they needed much more quickly.  When I complete Task 154 and commit the new code changes, I expect there to be maybe 10 or 20 new lines of code.  Someone completely familiar with Django might have solved this task in literally the time it takes to type out those lines, no extra thought or research required.  Someone fairly familiar with Django but sloppy might have solved the task in 5 minutes but also have added in the bug we spotted, where moving a task with two or more tags puts it (silently) in the wrong place, because 28 and 29 turns into just 29 but they don't notice.  All I know at this point of Task 154 is that I'm too ignorant even to estimate how much more time it's going to take me.

Good rappin' at you, amigos and amigas.  If you take nothing else from this blog post, I hope you take with you a wonder at why you read this far.
