Tuesday, September 30, 2014

Programming is understanding the user's problem and deciding how thoroughly to solve it.

I got stalled working on my program.  So I want to write about the nuts and bolts of programming and how I got stuck, and maybe that will unstick things.  I'm writing as a hobbyist programmer but also as a professional project manager, and therefore as a professional go-between for users and programmers.  If you are a user struggling to understand how the sausage is made, or, more likely, why it takes so long, this may shed some light.

Programming is writing a program that spans the gap between what a person wants a computer to do and the underlying mechanical functions of the computer.  Our electronic computers are really built from very simple operations, such as: receive two signals, and, if either or both are one, output one, otherwise output zero.  But in the decades since electronic computers were invented, programmers have built up a vast library of programs and conventions that abstract away the ones and zeros, so that the building blocks of my program are concepts like form and field length and web page and password.

So when one programs in 2014, the specific program to write is driven by two questions: "what is the user's problem?" and "how thoroughly should I solve their problem?"  Let's analyze my stall with these questions in mind.

I'm working on Task 155, Integrate "Add Task" into table row.  In Scrum terms, implementing this story: "As someone looking at my list of things to do, I want to add a new thing directly to the list, so that I don't forget my new thing or lose my train of thought."  I've already put 11 hours of programming into this project, so I have some stuff I can build on top of.


You can see six tasks in the list, and then I've already programmed another row with a few boxes, in which I've already typed the name of the next task, "buy some milk", and the estimate, "ten minutes".

So the user need is pretty clear: in abstract terms, add a new task quickly.  In concrete website vocabulary, add a new task by typing just the name and maybe the estimate, clicking a button, and then seeing the list updated with the new task in place.

How thoroughly am I solving this?  Well, I've already completed Task 129, Basic Tasks, which included features to create, read, update, and delete tasks.  So I'm hoping to re-use that work to make Task 155 very easy.  A thorough solution with very little new work.  How is this possible?

Entering data into a computer has a number of common complications: security, validation, error handling, and the like.  I've built the whole web program on top of another program called Django, which provides all of these features automatically.  I already defined all of my data for Task 129, so Django knows which fields are required, which fields have special rules ("Wait until" must be a date), and so forth.  This is called the data model.
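
To make the idea of a data model concrete, here is a minimal sketch in plain Python rather than Django.  The field names are the ones that appear in the task form; the `FIELDS` table, the `validate` function, the choice of which fields are required (other than task order), and the YYYY-MM-DD date format are all assumptions made up for illustration, not how Django works internally.

```python
from datetime import datetime

def parse_date(text):
    # Assumed date format; raises ValueError on anything else.
    return datetime.strptime(text, "%Y-%m-%d").date()

# Hypothetical stand-in for a data model: field name -> (required?, parser).
# Only task_order is known from the post to be required; the rest is assumed.
FIELDS = {
    "name": (True, str),
    "task_order": (True, int),
    "estimate": (False, str),
    "wait_until": (False, parse_date),
}

def validate(data):
    """Return (cleaned, errors) for a dict of raw form strings."""
    cleaned, errors = {}, {}
    for field, (required, parse) in FIELDS.items():
        raw = data.get(field, "").strip()
        if not raw:
            if required:
                errors[field] = "This field is required."
            continue
        try:
            cleaned[field] = parse(raw)
        except ValueError:
            errors[field] = "Enter a valid value."
    return cleaned, errors

cleaned, errors = validate({"name": "buy some milk", "estimate": "ten minutes"})
print(errors)  # {'task_order': 'This field is required.'}
```

Because the rules live in one table, every form that goes through `validate` gets the same checks without repeating them.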

The "Create" portion of Task 129 is basically "As a person with a task, I want to record everything I know about this task so that I can track and manage the task later."  In concrete web terms, this means a form.  So that's one web page that displays all of the fields to be entered, some code to handle receiving and processing the input, a page to display any problems and let the user try to fix them, and something to show after the input is accepted.

With Django, most of that can be done thusly:
<form action="" method="post">
    {% csrf_token %}
    {{ form.as_p }}
    <input type="submit" value="Done">
</form>
which produces this form (shown partially filled out):


This code handles, or rather invokes much more Django code which handles, most of the complexity of full-featured web data entry.  There's a hidden token to prevent certain kinds of security breaches; each different type of data is being collected in the appropriate kind of web input field, from text to choices to paragraph boxes.  And validation and error handling come along with it.


There are a few other bits of code here and there, to do things such as determine the URL of this page, make sure the user is logged in, and so forth.  But using the features Django calls class-based views and ModelForms, I can write what would otherwise be hundreds of lines of detailed code in maybe a dozen or two lines.  So, I'll just re-use that for Task 155.

Or so I thought.  The problem with using a program someone else has created is that it may not solve your exact problem.  The best programs are flexible and handle many variations on the problem.  This code is really good at displaying a form on one page, and handling form submission.  But if you go back to the first screenshot, you can see that there is already a form on that page.  And, I don't want the whole form, I only want two fields; and I don't want them line after line, I want them fitted into one row on the table, matching the column headings.

Okay, so I can't use the command {{ form.as_p }} to display the form, so I'll have to write out the form display code myself.  Then I can have the form go to the form submission code, and if it works, come right back to this page with the new task inserted.  It will require a full page refresh, rather than magic web 2.0 stuff where the input just appears, but it should be a good first step.

More complications: {{ form.as_p }} is dynamic.  That is, every time somebody browses to a web page with that code, those four lines of code above turn into this HTML code (which you should scroll past without reading):
<form action="" method="post">
  <input type='hidden' name='csrfmiddlewaretoken' value='4U1ynhzn9GTRDxoynhYnw2JWfXsCqB6P' />
  <p>
    <label for="id_name">Name:</label> <input id="id_name" maxlength="500" name="name" type="text" />
  </p>
  <p>
    <label for="id_task_order_0">Order:</label>
    <ul id="id_task_order">
      <li>
        <label for="id_task_order_0">
          <input id="id_task_order_0" name="task_order" type="radio" value="-1" />
          top
        </label>
      </li>
      <li>
        <label for="id_task_order_1">
          <input checked="checked" id="id_task_order_1" name="task_order" type="radio" value="11" />
          bottom
        </label>
      </li>
    </ul>
  </p>
  <p>
    <label for="id_estimate">
      Estimate:
    </label>
    <input id="id_estimate" maxlength="50" name="estimate" type="text" />
  </p>
  <p>
    <label for="id_wait_until">
      Wait until:
    </label>
    <input id="id_wait_until" name="wait_until" type="text" />
  </p>
  <p>
    <label for="id_notes">
      Notes:
    </label>
    <textarea cols="40" id="id_notes" name="notes" rows="10">
    </textarea>
  </p>
  <p>
    <label for="id_tags">
      Tags:
    </label>
    <select multiple="multiple" id="id_tags" name="tags">
      <option value="2">
        work
      </option>   
      <option value="3">
        monster
      </option>  
      <option value="5">
        shopping
      </option>
    </select>
    <span class="helptext">
      Hold down "Control", or "Command" on a Mac, to select more than one.
    </span>
  </p>
  <input type="submit" value="Done">
</form>

I can manually write some of that to fit into the bottom row of my table, because the person who wrote the fancy table display program that I use to make my table look not totally ugly made a provision to put stuff into the bottom row of the table.  But I can't just put in the HTML code that I need, because there's a security measure in Django forms to prevent "csrf" attacks, and so Django needs to create a custom code for each form.  So I can't use the Django code that magically solves my problem because it doesn't solve my exact problem (the people who wrote the form program didn't provide for creating the HTML code as a row in a table), and I can't use the next more primitive level of abstraction, the HTML code, because I still need some functions from the Django code.
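
A rough sketch of why the token cannot simply be typed into static HTML, in plain Python.  This is not Django's actual mechanism, which is more involved; the `session` dictionary and the three function names are hypothetical, and only the field name `csrfmiddlewaretoken` is taken from the generated HTML above.

```python
import hmac
import secrets

def issue_token(session):
    """Create a fresh random token and remember it for this session."""
    # `session` is a hypothetical in-memory dict standing in for real session storage.
    session["csrf_token"] = secrets.token_hex(16)
    return session["csrf_token"]

def render_hidden_field(token):
    # The token is baked into the HTML each time the page is built,
    # which is why a hand-typed, static form cannot contain a valid one.
    return f"<input type='hidden' name='csrfmiddlewaretoken' value='{token}' />"

def is_valid_submission(session, posted):
    """Accept the POST only if it carries the token issued to this session."""
    expected = session.get("csrf_token", "")
    supplied = posted.get("csrfmiddlewaretoken", "")
    return bool(expected) and hmac.compare_digest(expected, supplied)

session = {}
html = render_hidden_field(issue_token(session))
print(is_valid_submission(session, {"csrfmiddlewaretoken": session["csrf_token"]}))  # True
print(is_valid_submission(session, {"csrfmiddlewaretoken": "copied-from-elsewhere"}))  # False
```

The point of the sketch is only that the value changes per visitor, so something on the server has to generate the form, or at least that one field, at request time.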

The quick and easy win-win solution to the user's problem just got slow and dirty, but I have to get something working so I hack out the HTML for the more limited form that I'll squeeze into the table row:
<tfoot>
  <form action="create/" method="POST">
    {% csrf_token %}
    <tr>
      <td colspan="2">
        <div class="fieldWrapper">
          <input id="id_name" maxlength="500" name="name" type="text" />
        </div>
      </td>
      <td>
        {{ add_task_form.task_order }}
      </td>
      <td>
        <div class="fieldWrapper">
          <input id="id_estimate" maxlength="50" name="estimate" type="text" />
        </div>
      </td>
      <td colspan="6">
        <input type="submit" value="Add new task"/>
      </td>
    </tr>
  </form>
</tfoot>
But, if I ever reword "estimate" to, let's say, "Estimated Time to Complete", I can make that change in one place, the data model, and it will automatically take effect everywhere that the Estimate field is used except for this hacked code.  By solving the user's problem more expediently, I've damaged the thoroughness of my solution.  And this code doesn't include the dynamic csrf token; I can either disable that security feature (harming the thoroughness) or go back to the drawing board (delaying the solution).

There's another problem.  When I planned out the Create function of Task 129, I had the problem of what to do about the new task order.  This is a pretty typical sequence of events in programming: the user specifies a need, the programmer understands it and starts to program a solution, and runs into a complication, most often forced into view by the already-completed data model, and realizes that the need isn't specified thoroughly enough.  What that looks like here is, the data model requires that every task have a unique, non-null task order.  That is, every single task must always have a place in the stack ranking and no two tasks can ever be tied.  So what's the task order for a brand new task?  The user just asked that it be possible to create a new task.

At this point, a programmer either goes back to the user, who may not understand what's missing: "What's a task order?  I didn't order anything."  Or they may not be able to articulate what they want: "Make it so that the next task is next.  Not at the top or bottom, just next."  Or they may not be able to understand why what they want is impossible: "Just leave it blank for now.  No, I don't want it to be tied with anything; just leave the order blank so it can be sorted later.  No, I don't know what relaxing the field integrity means but it sounds bad so don't do that.  Just figure out a way that it can be blank but still all tasks are in order, no ties."  Or the programmer just guesses what the user would want, or what might make the user happy.

In my case, since I'm the programmer and the user, I know what I want to try for now: all new tasks must go either first or last.  I have a suspicion that most people who maintain task lists are either all one or all the other: either you put your new tasks at the top and you are always fighting fires and never finishing things, or you put them at the bottom and basically it takes forever and a nag for anyone to get anything from you because you are always finishing something someone asked for a long time ago.  Of course it makes sense to be able to put tasks somewhere in the middle, and to be able to move them around, but I'll deal with those cases later.  And, if I've finished tasks ordered 1, 2, and 3, and am working on a task ordered 4, and add a new task at the top, should that task go before 1 or between 3 and 4?  All to be dealt with later; in order to finish Task 129 and be able to create tasks at all, I decided to allow only absolute top or absolute bottom.

And I had to write some code to do that.  What I did was, in the place in the program where Django prepares the form for display (i.e., renders the code form.as_p into dozens of lines of HTML as shown above), I found how the programmers who created Django intend for Django programmers to customize the process.  And I made it so that, at the moment the user clicks "New Task" and Django figures out the web page to show them, Django figures out what the high and low orders are, so that those two buttons are tied to the numbers.  That is, if you have four tasks, ordered 1 through 4, Django will make "top" equal to 0 and "bottom" equal to 5.  Of course, to do even that I had to figure out how to use Django's database code to get this information, something I already know how to do directly in SQL but I can't do that or I'll break Django's database-independence, which means that if I ever publish this code, other people won't be able to use it on the database of their choice.  As with the hacked HTML, I can do it quickly, or I can do it more thoroughly.  I spent the time to do it correctly, and learned some Django query syntax:

lowest_order = Task.objects.filter(user=user).order_by('-task_order')[0].task_order + 1
highest_order = Task.objects.filter(user=user).order_by('task_order')[0].task_order - 1
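
The same calculation can be sketched in plain Python, without a database, to make the arithmetic visible.  The function name `top_and_bottom` and the values returned for an empty list are assumptions; the post does not say what should happen when there are no tasks yet, and indexing `[0]` on an empty Django query result raises an `IndexError`, so that case probably needs handling somewhere.

```python
def top_and_bottom(existing_orders):
    """Return (top, bottom): one below the smallest order, one above the largest."""
    if not existing_orders:
        # Assumed starting values for an empty task list.
        return 0, 1
    return min(existing_orders) - 1, max(existing_orders) + 1

print(top_and_bottom([1, 2, 3, 4]))  # (0, 5), the four-task example described above
```
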
But when I display the hand-coded HTML form in the row of the task table, I skip these calculations.  In fact, I don't include the task order field at all.  So when the user clicks the Add New Task button in the row and the data is passed to the Django function that processes form input according to the data model, it rejects the input because I specified that task order can never be empty.  And instead of seeing the task list with the new task included at the bottom, the user sees the full form page for new tasks, with an error message that doesn't make much sense:


It's also hard to spot since I haven't applied any formatting to the raw HTML output, but that's a different Story.  And it says "this field" when "Order" would be clearer, but that's yet another Story.

So, in case you forgot what problem we were trying to solve and whether or not we've solved it, a recap: I am trying to allow the user to input a new task right in the task table.  To do this I tried to reuse functionality I had already programmed for adding new tasks via a form on its own page, but when I tried to squeeze that form into the bottom row of the table, the functionality I was re-using wasn't suitable and I had to hand-code some stuff.  And, I discovered that the user need itself was inconsistent: the user wants to create a new task without specifying its order, but elsewhere the user has insisted that all tasks be sequentially ordered, and the user hasn't specified how to reconcile this conflict.

So.  I could make a rule that all tasks added in this fashion go to the bottom of the task order.  After all, the row for adding tasks is at the bottom of the table.  But to do so and still reuse the form submission and processing code, which requires a task order to be provided, I would have to dynamically figure out what the bottom task order should be, and figure out how to insert that dynamic information into my static HTML.  And, this isn't even a good idea, because, as you may have noticed, this approach introduces a kind of weakness called a race condition.

If the user clicks "New Task", the system calculates top and bottom task orders and bakes those into the New Task form and displays that on a page.  Suppose the user does this on their computer and then gets distracted by a phone call and wanders away.  At the end of the phone call, they have another new task, so naturally they try to enter it on their phone.  When they browse to the site on the phone and click New Task, the system again generates top and bottom task orders.  They finish creating the task, then go back to their computer, remember the original task they were trying to create, type it in, and submit.  The original web page on the computer and the new web page on the phone each generated the same top and bottom numbers, so if they picked, let's say, "top" for both, when they click to save on the computer it will try to save the task to an order that's just been used, and it will fail with an error.  So that's not a very thorough solution.
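
The computer-and-phone scenario can be replayed in a few lines of plain Python.  Everything here is a made-up stand-in: the `tasks` dictionary plays the database, `DuplicateOrder` plays the uniqueness error, and the task names are invented.  The last step shows one possible fix, computing the order at save time instead of at display time; with truly simultaneous requests a real database would still need a transaction or lock around that step.

```python
class DuplicateOrder(Exception):
    pass

tasks = {1: "write report", 2: "call plumber"}  # task_order -> name (made-up data)

def render_form():
    """Early binding: compute 'top' when the form is displayed."""
    return {"top": min(tasks) - 1}

def save(name, order):
    # Stands in for the data model's rule that task orders are unique.
    if order in tasks:
        raise DuplicateOrder(order)
    tasks[order] = name

def save_at_top(name):
    """Late binding: compute 'top' at the moment of saving."""
    save(name, min(tasks) - 1)

computer_form = render_form()   # New Task opened on the computer, then abandoned
phone_form = render_form()      # New Task opened on the phone: same numbers
save("buy some milk", phone_form["top"])        # phone submission succeeds
try:
    save("book dentist", computer_form["top"])  # stale number: order 0 is taken
except DuplicateOrder as exc:
    print("rejected, order already used:", exc)
save_at_top("book dentist")     # recomputed at save time, so it succeeds
print(sorted(tasks))            # [-1, 0, 1, 2]
```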

So.  My attempt at quick and easy re-use is stymied, and I've discovered a flaw in my existing code: highest and lowest should be calculated at the last possible instant, not the first.  And, it occurs to me that using integers to store task order is a bad idea because each new task added in the middle will require changing the order number of half of the existing tasks.  I thought about using floating point numbers (so that a new task could be 1.5 or 1.25, etc.) but that might open up other worm cans; I should maybe research best practices on this.  In trying to solve a new user problem, I revealed weaknesses in an already "solved" user problem without making much forward progress.
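
A small sketch of the floating point idea and one possible worm can, in plain Python.  The `midpoint` helper is hypothetical.  Inserting between two neighbours needs no renumbering, but repeatedly inserting into the same gap halves it each time, and ordinary floats eventually cannot tell the new value apart from its neighbour.

```python
def midpoint(before, after):
    """Order value for a task inserted between two neighbours."""
    return (before + after) / 2

print(midpoint(1, 2))    # 1.5
print(midpoint(1, 1.5))  # 1.25

# Keep inserting directly after task 1.0 and count how long the gap lasts.
low, high = 1.0, 2.0
insertions = 0
while low < midpoint(low, high) < high:
    high = midpoint(low, high)
    insertions += 1
print(insertions)  # a few dozen, after which the midpoint collides with a neighbour
```

So a float-based scheme would probably still need an occasional renumbering pass, just far less often than the integer scheme.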

There's lots more to say about the interplay between understanding user problems, trying to solve them, and deciding how thoroughly to solve them.  My solution to Task 129, the basic elements of adding a new task, turned out to be so incomplete that I couldn't use them for some things I was already pretty sure I'd be wanting to do.  But on the other hand, I'll still be able to salvage most of that solution, and re-use it for other problems, and it was a morale boost to get something working.  More to write on this in the future.  If you work with software developers, I hope this gave you a little insight into why everything seems like such a big deal.  Blogging this out helped me unstall myself:

I need to go back and re-solve 129 (by switching from early to late binding of task order, and maybe simplifying my solution so that all new tasks are always last, taking away the user choice).  I need to research changing the data model from integer to floating point task orders, by finding 2-3 examples of other systems and seeing what they do.  And then I need to bite the bullet for 155 and figure out how to build the initial form dynamically, which means figuring out how to get Django to do two forms on one page.  I should mention that I did try that already, and ran into the biggest productivity killer of all: I did what seemed like it should work and it just didn't, and now I have to figure out why.  Which is the actual biggest element of programming: debugging.
