Author Archives: Ian Clarke

5-star ratings are dangerous

…or “Why 99designs is sometimes telling people that the worst design is actually the best”

I’m a big fan of the website 99designs. For those unfamiliar with the site, you can pay for designers to compete with each other to design your logo or website. I’ve used it for a number of my websites, and we’re currently using it for a sorely needed redesign of Freenet’s website.

A critical part of 99designs is to provide as much feedback to designers as possible during the selection process. While this can be fun, it can also be quite laborious. Fortunately, 99designs allows you to enlist your friends and co-workers to help, by allowing the creation of polls.

The process is simple: you select the designs you’re most interested in, and create a poll. You’re then given a link which you can share. Anyone that clicks on the link can rate each design between 1 and 5 stars.

99designs then averages up the votes and tells you which design is best. Of course you don’t need to agree with the poll, but I’m sure it has a significant influence for most people.

I’ve never been a big fan of 5-star ratings. I first thought about them in depth when Netflix offered a $1 million prize to see who could predict people’s 5-star ratings most accurately. It was one of the first large-scale machine learning “contests”, and a precursor to hugely popular websites like Kaggle. The prize was won, but tellingly, Netflix never actually used the winning entry.

The main problem with 5-star ratings is how people often use the results. Many people, like 99designs, simply take the average rating, but this is very dangerous because it implies a bunch of assumptions about how people come up with ratings that almost certainly aren’t accurate.

For example, if someone rates two of the designs 2 stars, but doesn’t rate the other two at all, this could change the ordering — even though this person hasn’t conveyed any useful information about their preferences.
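To make this concrete, here is a minimal sketch in Python using made-up votes (the design names and star values are hypothetical, not the real poll data). A third voter gives two designs the same 2-star rating and skips the rest, expressing no preference between any pair, yet the average-based order changes:

```python
from statistics import mean

# Hypothetical poll: each voter maps design name -> stars (1-5).
# Designs a voter skipped are simply absent from their dict.
votes = [
    {"A": 5, "B": 4, "C": 3, "D": 1},
    {"A": 4, "B": 4, "C": 4, "D": 2},
]

def average_order(votes):
    """Rank designs by mean star rating, best first."""
    designs = sorted({d for v in votes for d in v})
    averages = {d: mean(v[d] for v in votes if d in v) for d in designs}
    return sorted(designs, key=lambda d: averages[d], reverse=True)

before = average_order(votes)

# A third voter rates A and B 2 stars each and skips C and D.
after = average_order(votes + [{"A": 2, "B": 2}])

print(before)  # B is ranked above C
print(after)   # C is now ranked above B
```

B drops below C only because the new voter happened to rate B and not C, not because anyone said C was better.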

So when we recently ran a poll to select the final designer for Freenet’s website, I was fairly suspicious of what 99designs was telling me was the winning design, and was curious about whether a different, more sensible approach to interpreting the results might yield a significantly different outcome.

So what is someone saying when they vote that design A is 2-stars, and design B is 3-stars? Really all they’re saying is that B is better than A. Humans are far better at making these relative judgements than absolute judgements.

So, if we take everyone’s votes, and assume that a higher rating for one design than another means only that the voter thinks the first design is better, then we can ask a better question:

What ordering would lead to the fewest contradictions with the preferences people have expressed?
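One way to answer that question for a handful of designs is to score every possible ordering by brute force. The sketch below is not the code linked from this post; it is an illustrative Python version with made-up votes and hypothetical function names, and it is essentially the idea behind Kemeny-Young rank aggregation. Equal ratings are treated as expressing no preference.

```python
from itertools import combinations, permutations

def disagreements(order, votes):
    """Count pairwise preferences that the ordering (best first) contradicts."""
    position = {design: i for i, design in enumerate(order)}
    count = 0
    for vote in votes:
        for a, b in combinations(vote, 2):
            if vote[a] == vote[b]:
                continue  # equal stars express no preference
            preferred, other = (a, b) if vote[a] > vote[b] else (b, a)
            if position[preferred] > position[other]:
                count += 1
    return count

def score_orderings(votes):
    """Return (disagreements, ordering) for every ordering, fewest first."""
    designs = sorted({d for v in votes for d in v})
    return sorted((disagreements(p, votes), p) for p in permutations(designs))

# Hypothetical votes: design name -> stars; unrated designs are absent.
votes = [
    {"A": 3, "B": 2, "C": 1},
    {"A": 5, "B": 4},
    {"A": 3, "B": 2, "C": 5},
    {"B": 4, "C": 2},
]

best_count, best_order = score_orderings(votes)[0]
print(best_order, best_count)
```

Trying every permutation grows factorially, which is fine for four or five designs but would need a smarter search for many more.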

99designs’ average-based approach ordered the designs 120 (best), 10, 41, 114 (worst).

My approach came up with this ordering, where the top-most order (10, 114, 41, 120) is the best, with just 4 disagreements with how people rated.

A few things to note:

  • This new approach suggests that design 10 is the best, followed by design 114 (the worst design per 99designs’ approach)
  • The first ordering where design 120 is considered the best has twice as many “disagreements” as the best order.
  • The specific ordering shown by 99designs has 11 disagreements — one of the three worst orders!

My approach isn’t perfect and can certainly be refined, but it does make me wonder how many people have picked the wrong design due to 99designs’ current approach.

Here is the code, feel free to rip, mix, and burn.

The purpose of software project management

I recently read the article The sad graph of software death by Gregory Brown.

Brown describes a software project wherein tasks are being opened faster than they are being closed in the project’s task tracker.  The author describes this as “broken”, “sad”, and “wasteful.”  The assumption behind the article seems to be that there is something inherently bad about tasks being opened faster than they are being closed.

The author doesn’t explain why this is bad, and to me this article and the confused discussion it prompted on Reddit are symptomatic of the fact that most people don’t have a clear idea of the purpose of software project management.

Another symptom is that so many software projects run into problems, causing tension between engineering, product, and other parts of the company.  It is also the reason there is such a proliferation of tools that purport to help solve the problem of project management, but none of them do because they don’t start from a clear view of what exactly this problem is.

Two complementary goals

In my view, the two core goals of project management are prioritization and predictability.

Prioritization ensures that at any given time, the project’s developers are working on the tasks with the highest ratio of value to effort.

Predictability means accurately estimating what will get done and by when, and communicating that with the rest of the company.

A task tracker maintains a record of who is currently working on specific tasks, which tasks are completed, and the future tasks that could be tackled. As such, the trackers do not address the two core goals of project management directly.

I have actually thought about building a project management tool that addresses these goals, i.e. prioritization and predictability, much more directly than is currently the case with existing systems.  Unfortunately, to date the value to effort ratio hasn’t been high enough relative to other projects 🙂

When a task is created or “opened” in a task tracker, this simply means “here is something we may want to do at some point in the future.”

Opening a task isn’t, or shouldn’t be, an assertion that it must get done, or must get done by a specific time. Although this might imply that some tasks may never be finished, that’s ok. Besides, a row in a modern database table is very cheap indeed.

Therefore, tasks being opened faster than they are closed is not an indication of a project’s impending demise; rather, it merely reflects the normal tendency of people to think of new tasks for the project faster than developers are able to complete those tasks.

Once created, tasks should then go through a prioritization or triage process; however, the output isn’t simply “yes, we’ll do it” or “no, we won’t.”  Rather, the output should be an estimate of the value provided by completing the task, as well as an estimate of the effort or resources required to complete it. Based on these two estimates, we can calculate the value/effort ratio for each task.  It is only then that we can stack-rank the tasks.
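As a minimal sketch of that last step, here is how stack-ranking by value/effort might look in Python. The task names, the units, and the `Task` class are all hypothetical; the only idea taken from the text is sorting by the ratio.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    value: float   # estimated value, in whatever unit the team agrees on
    effort: float  # estimated effort, e.g. developer-days

    @property
    def ratio(self) -> float:
        return self.value / self.effort

def stack_rank(tasks):
    """Return tasks ordered by value/effort ratio, highest first."""
    return sorted(tasks, key=lambda t: t.ratio, reverse=True)

backlog = [
    Task("rewrite importer", value=40, effort=20),  # ratio 2.0
    Task("fix signup bug", value=30, effort=2),     # ratio 15.0
    Task("add CSV export", value=12, effort=3),     # ratio 4.0
]

ranked = [t.name for t in stack_rank(backlog)]
print(ranked)
```

Note that the most valuable task (the importer rewrite) lands at the bottom because it is also by far the most expensive.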

Estimating value and effort

Of course, this makes it sound much simpler than it is.  Accurately estimating the value of a task is a difficult process that may require input from sales, product, marketing, and many other parts of a business.  Similarly, accurately estimating the effort required to complete a task can be challenging for even the most experienced engineer.

There are processes designed to help with these estimates.  Most of these processes, such as planning poker, rely on the wisdom of crowds.  These are steps in the right direction.

I believe the ultimate solution to estimation will exploit the fact that people are much better at making relative, rather than absolute, estimates. For example, it is easier to guess that an elephant is 4 times heavier than a horse, than to estimate that the absolute weight of an elephant is 8000 pounds.

This was recently supported by a simple experiment that I conducted.  First, I asked a group to individually assign a number of these relative or comparative estimates.  Then, I used a constraint solver to turn these into absolute estimates.  The preliminary results are very promising.  This approach would almost certainly be part of any project management tool that I might build.
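The solver used in that experiment isn’t shown here. As a rough illustration of the idea only, the sketch below swaps in a much simpler technique than a constraint solver: it treats each statement “a is about r times b” as an equation in log space and relaxes the estimates repeatedly until they settle, given one task whose absolute size is assumed known. All names and numbers are hypothetical.

```python
import math

def solve_absolute(ratios, anchor, anchor_value, sweeps=200):
    """Turn relative estimates into absolute ones.

    ratios: list of (a, b, r) meaning "a is about r times b".
    anchor: one item whose absolute value is assumed to be known.
    """
    items = sorted({x for a, b, _ in ratios for x in (a, b)})
    log_est = {x: 0.0 for x in items}
    log_est[anchor] = math.log(anchor_value)
    for _ in range(sweeps):
        for x in items:
            if x == anchor:
                continue
            # Each ratio involving x suggests a value for x; take the mean.
            targets = []
            for a, b, r in ratios:
                if a == x:
                    targets.append(log_est[b] + math.log(r))
                elif b == x:
                    targets.append(log_est[a] - math.log(r))
            log_est[x] = sum(targets) / len(targets)
    return {x: math.exp(v) for x, v in log_est.items()}

# Hypothetical comparisons between three tasks; task A is known to take 1 day.
ratios = [("B", "A", 2.0), ("C", "B", 3.0), ("C", "A", 6.0)]
estimates = solve_absolute(ratios, anchor="A", anchor_value=1.0)
print(estimates)
```

When the comparisons contradict each other slightly, as real ones will, this averaging settles on a compromise rather than failing.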

Once we have good estimates for value/effort, we can then prioritize the tasks.  Using our effort estimate, combined with an understanding of the resources available, we can come up with better time estimates.  This will enhance predictability that can be shared with the rest of the company.

Pivotal Tracker

I have had quite a bit of experience with Pivotal Tracker, which I would describe as the “least bad” project management tool. Pivotal Tracker doesn’t solve the prioritization problem, but it does attempt to help with the predictability problem.  Unfortunately, it does so in a way that is so simplistic as to make it almost useless.  Let me explain.

Pivotal Tracker assumes that for each task, you have assigned effort estimates which are in the form of “points” (you are responsible for defining what a point means).   It also assumes that you have correctly prioritized the tasks, which are then placed in the “backlog” in priority order.

Pivotal Tracker then monitors how many points are “delivered” within a given time period.  It then uses these points to project when future tasks will be completed.

The key problem with this tool is that it pretends that the backlog is static, i.e. that new tasks won’t be added to the backlog ahead of tasks that are already prioritized. In reality, tasks are constantly being added to any active project, and these new tasks might go straight to the top of the priority list.

Nevertheless, the good news is that Pivotal Tracker could probably be improved to account for this addition of new tasks without much difficulty.  Perhaps a third party could make these improvements by using the Java library I created for integrating with PT’s API.   🙂
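To illustrate the kind of adjustment I mean, here is a small Python sketch. It is not how Pivotal Tracker works and doesn’t use its API; the function and parameter names are hypothetical, and it assumes new work arrives above a given task at a steady rate.

```python
def weeks_until_reached(points_ahead, velocity, inflow_ahead=0.0):
    """Estimate the weeks until work on a backlog task can begin.

    points_ahead: points of work queued above the task today
    velocity: points delivered per week
    inflow_ahead: points per week of new work inserted above the task
                  (assumed constant, which is a simplification)
    """
    net = velocity - inflow_ahead
    if net <= 0:
        return float("inf")  # at this rate the task is never reached
    return points_ahead / net

static_estimate = weeks_until_reached(60, velocity=10)
adjusted_estimate = weeks_until_reached(60, velocity=10, inflow_ahead=4)
print(static_estimate, adjusted_estimate)
```

With 60 points ahead and a velocity of 10, the static projection says 6 weeks; if 4 points a week keep jumping the queue, it is really 10.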

Breaking down tasks

Most tasks start out quite large, and need to be broken down into smaller tasks, both to make it easier to divide tasks among developers and to improve the accuracy of estimates.

However, there isn’t much point in breaking down tasks when nobody is going to start work on them for weeks or months.  For this reason, I advise setting time-horizon limits for task sizes.  For example, you might say that a task that is estimated to be started within three months can’t be larger than 2 man-weeks, and a task to be started within 1 month cannot be larger than 4 man-days.
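The example limits above can be written down as a small rule table. This Python sketch is only an illustration: it assumes one month is 30 days, three months is 90 days, and 2 man-weeks is 10 working days, and the names are hypothetical.

```python
# (task starts within this many days, maximum size in man-days),
# listed from the nearest horizon to the farthest.
HORIZONS = [(30, 4), (90, 10)]

def needs_breakdown(days_until_start, size_days):
    """Return True if the task is too large for the horizon it falls in."""
    for horizon, max_size in HORIZONS:
        if days_until_start <= horizon:
            return size_days > max_size
    return False  # beyond the last horizon there is no size limit yet

print(needs_breakdown(20, 5))    # starts within a month, over 4 man-days
print(needs_breakdown(60, 8))    # starts within three months, under 2 man-weeks
print(needs_breakdown(200, 50))  # too far out to bother breaking down
```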

As a task crosses each successive time-horizon, it may need to be broken into smaller tasks (each of which will, presumably, be small enough until they hit the next time horizon).  In practice this can be accomplished with a weekly meeting that can be cancelled if there are no tasks to be broken down.  We would assign one developer to break down each oversized task and then the meeting would break up so that they could go and do that.  Typically each large task would be broken down into 3-5 smaller tasks.

This approach has the additional advantage that it spreads out the process of breaking down tasks over time and among developers.

Resource allocation

So how do you decide who works on what?  This is fairly simple under this approach.  Developers simply pick the highest priority task that they can work on (depending on skill set or interdependencies).

At OneSpot, when we broke down tasks, we left the subtasks in the same position in the priority stack as the larger task they replaced.  Since developers pull new tasks off the top of the priority list, this has the tendency to encourage as many people as possible to be working on related tasks at any given time, which minimizes the number of projects (large tasks) in-flight at any given time.


To conclude, without a clear view of the purpose of successful project management, it is not surprising that so many projects flounder and so many project management tools fail to hit the mark. I hope I was able to provide the beginnings of a framework to think about project management in a goal-driven way.

Model Bureaucrats

Back when I was 15ish, my school sent me to a “Model United Nations” in Dublin. This was the beginning of my decades-long distrust for bureaucrats.

I don’t think they sent me because I exhibited any particular interest in politics. I think it was because I wouldn’t shut up in class, so naturally I might be well-suited to such an environment. I was Mexico’s ambassador, a country I knew next to nothing about at the time. That year it was Mexico’s turn to be on the Security Council.  I noticed that most of the other students seemed to view the UN as some kind of all-powerful genie: all they had to do was vote for something and, lo-and-behold, it would happen.

So I took particular pleasure in disagreeing with whichever idea seemed to be popular with the rest of the group.

There was one guy from Chicago who seemed to be a very experienced model-UN participant. I imagined him flying around the world by private jet, from model-UN to model-UN, part of some kind of American model-UN “dream team”.

During one debate he gave an eloquent and widely-applauded speech comparing India and Pakistan to two fighting children, and proposing that we resolve their decades-long conflict by taking away their “toys”, their nuclear arsenals.

I pointed out that comparing two nuclear-armed nations to squabbling children might not be an entirely apt analogy, that it might even be a little patronising, and that there may be some practical difficulties in depriving them of these “toys”. I was unanimously outvoted: the United Nations would disarm India and Pakistan.

Needless to say, I didn’t make many friends. There was a daily newsletter, mostly contributed to by other student attendees. One of them took the time to anonymously write an entire poem about me, “Ode to Mexico on the Security Council”, comparing me (unflatteringly) to Mr. Bean. Unfortunately the poem itself is lost to history.

Arguments you can make as a startup CTO against filing for software patents

Most software engineers believe that software patents are bad for innovation, and shouldn’t exist. Unfortunately, this isn’t a good argument against filing for software patents, since it’s essentially a complaint about the rules of the game, but as a startup CTO you still have to play the game, regardless of what you think of the rules.

Many CTOs of seed or A-round companies find themselves under pressure from non-technical co-founders and investors to spend time and money on software patents. I don’t always win the debate, and have been involved in filing a number of patents; however, I can say with confidence that the return on investment on all of this time and effort was precisely $0 in every case.

Hopefully others can also share their advice or experiences.

Most people who advocate for patents within a company haven’t really thought it through in depth; they’re just advocating it because they think it’s what you need to do. Often they’ll ask a patent lawyer for advice, and a patent lawyer is no more likely to argue against filing for patents than a turkey is to argue in favor of Thanksgiving dinner.

A few arguments against startups filing for patents:

  • They cost $10-$15k a pop, and can soak up a significant amount of your CTO’s time, often at a critical early phase of the company.
  • You don’t even get them for several years, by which time your startup has probably either succeeded or failed, and these patents will have played no part in it either way (except perhaps for the previous point where they probably hurt you by sapping money and time).
  • US jurisprudence has grown more restrictive about what can be patented over the past few years. What was clearly patentable 3 years ago is clearly unpatentable today. There is a pretty good chance that your patents will be rejected. In particular, business method patents (which patent lawyers used to love) are no longer being granted.
  • “Defensive” patents don’t protect you from patent trolls (aka non-practicing entities), since they are immune to a counter-suit because they don’t actually do anything other than litigate or threaten to litigate.
  • Even if you don’t have an ethical problem with patents covering a real innovation, these are not the kind of patents most patent lawyers will want you to file for. Firstly because they will require a lot more work to create (and you’re only paying them $10-15k). And secondly, because they tend to be specific enough that they’re less useful as an offensive weapon.
  • Even if a patent does cover a real innovation, once it’s granted you’ve disclosed something valuable to your competitors. They may not be able to copy it directly, but as you know there are typically many ways to solve any given problem in computer science, and now they know how you solved it.
  • Even if they’re granted, their only real value will be as part of a firesale in the event of your company’s failure, where they’ll probably be snapped up by a patent troll for less than you paid to file for the patent in the first place.

When people ask about my “IP defensibility” strategy, I generally argue for trade secrets. They’re free, require no effort, you’re not disclosing potentially important information to competitors, and they basically lack any of the other shortcomings I describe above.

Add a command to list git branches in order of last commit

This is based on this SO answer.

Wouldn’t it be useful if you could order git branches in order of the most recently used, so the ones you are likely to be most interested in are at the top? Here is how, just type:

$ git config --global alias.branches "for-each-ref --sort=-committerdate refs/heads/ --format='%(committerdate:short) %09%(authorname) %09%(refname:short)'"
Now, just type:
$ git branches

And you’ll get something like:

2013-09-21 Ian Clarke gh-pages
2013-09-15 Ian Clarke master
2013-07-14 Ravi Tejasvi contactBook
2013-06-15 Ravi Tejasvi android
2013-06-08 Ian Clarke web-look-and-feel
2013-03-23 Ian Clarke cleanup-topology-maint
2012-06-12 Kieran Donegan topologyMaintenance
2012-05-28 Ian Clarke vaadin
2012-04-27 Ian Clarke refactor-of-peer-info
2011-07-07 Ian Clarke tr-remote-address-everything-needed

Note the date of the last commit, the author of that commit, and the branch name.

Tungle: A wasted opportunity

Apparently Tungle has shut down development, although they still allow people to sign up. It turns out their acquisition by RIM last year must have been an acqui-hire (technically an acquisition, but really an admission of defeat).

Tungle had an incredibly viral business model, perhaps the most viral I’ve seen since Plaxo, solving a problem I and many others encounter on a near-daily basis: helping people schedule meetings and calls with each other.

So what went wrong? Their usability SUCKED. I desperately wanted Tungle to work, but almost every time I tried using it to schedule a meeting with someone, something would screw up and we’d have to resort to manually scheduling via email.  This was embarrassing when it happened, but even so I tried over and over again.  Every one of those attempts could have been an opportunity for Tungle to sign up a new user, if their usability hadn’t been so bad.

So if there is anyone out there looking for an idea that could be the next LinkedIn-scale viral phenomenon, all you have to do is reimplement Tungle, but this time get the usability right.  If I weren’t already rather busy I’d be doing this myself.

Microsoft probably just killed “Do Not Track”

Update (27th Oct 2012): I told you so!  Yahoo will ignore DNT from IE10 for exactly the reason I cite below.

Microsoft just announced that the “do not track” opt-out would be on by default in Internet Explorer 10.  This is a boneheaded move.

“Do not track” is a standard through which a web browser can inform a web page that the user does not wish to be tracked by third-party websites for the purpose of advertising.  So far as I can tell, respecting this is entirely voluntary on the part of the advertisers.

Advertisers often use browser cookies to track users. This allows them to target advertising specifically to people who’ve visited their website, for example.  Google and Microsoft both do it; it’s fairly standard practice these days.  Typically the advertiser isn’t tracking you as an individual; all they know is that you may have previously visited a particular website.

To explain why Microsoft’s move is boneheaded, I’ll relate a story from the early days of Revver, the online video sharing website that I co-founded back in 2004.

We had decided to let video uploaders tell us whether the video contained any content that is not appropriate for children as part of the upload process.  The vast majority of our users did exactly this and all was well, until at some point we realized that people were uploading some pretty serious pornography that we weren’t comfortable with even if it was marked as “adult” by the uploader.

Our panicked solution was to simply remove all videos marked as “adult” from the site, and prevent any further uploads of videos marked that way.

Of course you can predict the result: people immediately stopped marking videos as “adult”, making our task vastly more difficult.

The moral?  Don’t expect people to do something voluntarily if you are then going to use it against them.

I think Microsoft has just made exactly the same mistake.  Previously I think there was a reasonable chance that advertisers would choose to respect this, since only a minority of users are likely to enable it, and those are the people that really care about not being tracked.

But if it is enabled by default in Internet Explorer 10, advertisers now have no idea whether the user really cares about being tracked, and as a result they are far less likely to respect it.

Looking at it a different way, Microsoft just gave advertisers the perfect excuse to ignore DNT, because they can correctly claim that in most instances the user will have made no conscious decision to enable it.

Object Relational Mappers (ORMs) are a terrible idea

ORMs (like Hibernate in Java) are a flawed solution to an imagined problem that should be consigned to the wastebasket of history.

They are predicated on the idea that relational databases are old fashioned and should be avoided at all costs, but if you can’t avoid them, use some kind of convoluted wrapper that tries (and generally fails) to pretend that the relational database is something it isn’t – an object oriented datastore.

Often what people are really looking for when they use an ORM is some kind of database abstraction layer.  It is reasonable to abstract the database somehow, but I recommend not using an abstraction layer that seeks to pretend the database is something that it isn’t.

In Java, the best such database abstraction layer I’ve found is jOOQ, which I highly recommend.

LastCalc: A powerful calculator meets Quora meets Siri

For the past month or so my main spare-time project has been a crazy idea called LastCalc (link at bottom).

I’ve been having trouble figuring out how to describe it, but here goes: Imagine a powerful web-based calculator that can answer your questions, a little like Google Calculator, Siri, or Wolfram Alpha, but where anyone can teach it how to calculate new things.

Additionally, rather than asking one question, you can ask a series of questions, each potentially referring to previous answers (programmers know this as a Read-Eval-Print Loop or REPL).

Just like the others it supports basic math and unit conversions, like this (note: the highlighting is automatic and happens as you type – you type the bit before the big silver = and hit return, the answer appears after it):

But it goes a lot further. You can assign the result of a calculation to a variable, and then use it in subsequent calculations:

Internally LastCalc treats all numbers as rationals (x/y where x and y are integers) if possible, even if they are displayed as floating point numbers.  This means that it will not lose precision regardless of how many calculations you do (this can be a problem if using normal floating point numbers which are imprecise).
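The difference is easy to demonstrate outside LastCalc. The sketch below uses Python’s standard `fractions` module purely as a stand-in to show the effect; it says nothing about how LastCalc itself is implemented.

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the sum is very slightly off.
float_equal = (0.1 + 0.2 == 0.3)

# Rationals store an integer numerator and denominator, so the sum is exact.
exact_equal = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

print(float_equal)  # False
print(exact_equal)  # True
```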

It’s not just simple numbers, LastCalc understands lists and associative arrays too, using a syntax very similar to JSON:

LastCalc is extensible, so if you find yourself repeating the same calculation over and over again, you can teach LastCalc how to do it (note: parameters are denoted by capitalization, like Prolog):

And it goes further, supporting pattern matching and recursion using these data structures, as in languages like ML and Haskell:

Then use it with:

You can also pattern-match on maps.  Here I define a function that takes a map and returns a list of its keys:

Currently I’m working on a tutorial and help system so I don’t need to explain all of this before sending people to the site 🙂

Right now you can only use functions that you define yourself, but in due course people will be able to share functions, much like they can share answers to questions with Quora.

So far it has only been tested in Chrome and Safari, and it definitely doesn’t work yet in Internet Explorer.  I’m waiting for the JavaScript to stabilize before climbing that particular mountain.

Check it out at

It’s obviously a work in progress. If you’d like to follow discussion and provide me with feedback, please join the LastCalc Google Group, or follow @LastCalc on Twitter.