Category Archives: Programming

The game theory of sealed-bid auctions

According to Wikipedia, a “frenemy” is someone who is simultaneously a partner and a competitor.

The term is typically used in social relationships, but frenemies can also exist in economic relationships, and I was fortunate enough to encounter such a situation in a project I’m working on, and I decided to explore it. The conclusions are directly applicable to at least one specific situation where real money is at stake (I describe this at the end). More interestingly, this may suggest a selfish rationale for unselfish behavior in a wider class of situations.

First, a little background:

Sealed Bid Auction

Consider a simplified version of eBay where everyone bids once on an item, nobody sees each-other’s bid, and the highest bid wins. This is called a “First-price sealed-bid auction”.

One day you find a trustworthy guy called Bill that promises that they can pay you $200 for a Nexus One phone. You discover that Nexus One phones frequently sell for less than $200 on eBay, and several of these phones are auctioned off every day. Bill hates to use eBay and won’t ever use it no-matter what, and he doesn’t really care what you bid for the phones, so long as he gets them. You realize that there is an opportunity to make some money here.

So what do you bid when you see one of these phones? Its a compromise, since if you bid $200 you’ll definitely win the phone, but you’ll make no money. If you bid less, then you’ll make more money if you win, but you’ll win less frequently. If you bid too little then you’ll never win and you’ll make no money.

This is not a hard question to answer if you are reasonably smart and have a decent amount of information about past winning bids, since you’ll be able to discover, given a bid b1, a function f1(b1) that will tell you the probability of winning the auction, given whatever you bid is. Let’s assume that you are smart enough and you do have enough information. Let’s say your cut of the profit is c1, then, applying some high school math, your expected profit is $200*c1*f1($200*(1-c1)).

From this equation you are able to decide the optimal bid (and therefore how big your cut is) to maximize your own profit. You can either do this through some fancy math, or just experimentally plug in different values for c1 until you find what works best, the latter being my preferred approach (because I’m lazy and computers aren’t).

Bill gets clever

Then Bill comes to you and tells you that he may not always be able to pay you $200 for the phone, sometimes he’ll pay more, sometimes he’ll pay less, but he will tell you what he will pay before you must place your bid. Fair enough you think, you’ve got your function f1(b1), you can determine the optimal bid depending on whatever Bill is willing to pay.

Weeks go by and you are making good money off this relationship. One night you go out for a beer with Bill, and he drops a bombshell. He tells you that actually, he is playing the same game you are. He knows a guy, Jim, who is buying the phones from him. He won’t tell you who Jim is (Bill isn’t an idiot), but it turns out that Jim is paying even more for these Nexus Ones than Bill is! Worse still, it turns out that not only is Bill no idiot, he is at least as smart as you are. He has determined the probability of you winning your bid, and is choosing his cut to optimize his total profit, in the exact same way that you are (although his function, f2(b2), won’t be the same as your’s because you are reducing his bid by your cut).

You go home drunk and don’t think about it much, but the next day you have a real headache, and its not just all the beer you had with Bill last night. What exactly is the relationship between you and Bill? In one sense, you are on the same side. If your cut combined with Bill’s cut is too big, then both of you will make less money. But in another sense, you are on opposite sides, the smaller his cut, the bigger your cut can be, and vice versa.

Given this relationship, and assuming that you can’t deal directly with Jim, Bill can’t deal directly with eBay, and Bill is at least as smart as you are, how do you maximize your profit?

My Simulation

A mathematician or game theorist at this point would probably go into a dark room for a week, month, or year, and come out with the mathematically optimal answer. Since I don’t have the time or patience required for a rigorous mathematical treatment, I decided to do a few experiments instead. You can find the code here.

The Auction

In the results described here, Bill’s buyer pays $1 (ok, we’re somewhat abandoning the Nexus One analogy here) to Bill for a phone. I wanted to create a reasonably realistic auction, which I do in the bidWinProb() method on line 30. Basically I simulate 100,000 auctions, each consisting of 10 bidders whose bids are spread according to a Gaussian distribution with mean of $0.50, and standard distribution of 0.1. I then record all of the winning bids, which allows me to quickly estimate a probability of winning for any given bid.

Yes, I realize that I could do this much more efficiently mathematically, but speed doesn’t really matter here, and this solution is simple enough that bugs are unlikely.

It has the added benefit that you can just provide it with actual data from a real auction, rather than simulated data.

Optimize Cut

The first step is that we want to be able to determine the optimal cut for ourselves, given whatever cut Bill has, or the optimal cut for Bill given whatever our cut is. This is done in the optimizeCut() method on line 78. The approach is basic trial and error. While a more sophisticated, accurate, and efficient approach is certainly possible, the current implementation is more than adequate for our needs (not to mention being easier to be confident that it does what its supposed to).

Experiment #1: Best combined cut

So what happens if Bill and I were entirely transparent with each-other, and agreed that we’d find the best cut for both of us, and then split it down the middle?

In this situation it turns out that Bill and I should collectively take $0.28 of the dollar Bill is paid if the auction is won (which when we consider the possibility of losing the auction, is an expected revenue of $0.24). Split down the middle this is an expected revenue of $0.12 each.

Experiment #2: Iterated selfish cut

But what if Bill and I can’t agree on an even split, and we are basically each left to our own devices to make as much as we each can?

One approach to do this would be for me to optimize my cut based on whatever Bill is paying me (which will depend on his cut), and for Bill to optimize his cut based on his perspective of his probability given various payments to me (which will depend on my cut).

When I tried this the results were interesting. I started from a point where Bill isn’t taking any cut at all, then each of us take turns to optimize our cuts based on the other one:

% Cut          $ Cut
Bill    Me      Bill    Me      Ttl $   1/2
0.000   0.280   0.000   0.244   0.244   0.122
0.100   0.220   0.080   0.159   0.239   0.120
0.140   0.190   0.109   0.128   0.237   0.118
0.160   0.180   0.119   0.112   0.231   0.115
0.160   0.180   0.119   0.112   0.231   0.115

This is interesting because they converge to where Bill takes a 16% cut, and I take an 18% cut. Further, when you look at the expected revenue, it is close to evenly split between us, but not exactly. Note that after this iterative process Bill and I collectively are making only $0.231, whereas we would be making $0.244 if we had worked together and split the difference.

Bill makes 3% more money if we work together and split the profit, and I make 9% more!

Experiment #3: Machiavellianism

My last two experiments I call “Evil Me” and “Evil Bill”. In “Evil Me” I exploit the fact that Bill will optimize his cut based on my cut, so rather than selecting my cut based on Bill’s cut, I select my cut to maximize my profit knowing that Bill will optimize for whatever is in his best interests given my cut. Evil Bill is the same but the other way around.

% Cut          $ Cut
Bill    Me      Bill    Me      Ttl $   1/2
0.100   0.270   0.055   0.134   0.189   0.094  <-- I'm evil
0.240   0.120   0.151   0.057   0.208   0.104  <-- Bill is evil

Wow, so in this situation being Machiavellian allows me to make $0.134 while poor Bill only makes $0.055! Bill reverses this if he is Machiavellian and I am not.

What is fascinating is that, while I increase my profit by 20%, Bill's drops to less than half of what it was before! One of us being Machiavellian helps that person a bit, but it hurts the other a lot.

Conclusions

So what are our conclusions based on this one experiment?

  1. If our sole motivation is to maximize our own profit, we should adopt the Machiavellian approach, even though this will totally screw Bill.
  2. If we at least have some semblance of compassion for Bill, the next best approach is to pre-agree a split with Bill, and then optimize together, as in Experiment #1.
  3. The pre-agreed split should probably be 50:50, because if Bill and I can't agree on a split and were to optimize against each other as in Experiment #2, then we wind up making approximately the same amount anyway, but we both make less than if the split is pre-agreed.

An important caveat is that these conclusions are based on a single experiment that makes a variety of assumptions. These assumptions seem realistic enough that hopefully these results do generalize, but we can't say that for sure.

Unanswered questions

Some may feel dissatisfied by the fact that I've answered these questions experimentally, rather than explore the math behind why we get these results. I would strongly encourage such people to feel free to explore themselves, especially if they enjoy algebra and statistics. I'll gladly link to any serious attempts here.

Further, I've explored a situation where one participant is manipulative, while the other tries to make the best of whatever situation they are in. But what if both participants are manipulative? My suspicion is that in that case it degenerates into a Game of Chicken with an ultimate outcome that is far worse for both participants, but further investigation is definitely warranted.

Why is this problem relevant?

Realtime bidding is a recent innovation in advertising where advertisers bid against each-other for the right to show you an advertisement. The analogy with the situation described above is that the sealed-bid eBay is an ad exchange, I'm an ad network, and Bill is the advertiser. Bill is selling something where he has a (hopefully) known profit margin, and he must decide how much of this profit margin he will spend to sell his product. Similarly, I as the ad network must decide on my cut.

Note 26th July 2010 - This is a republish of an article from 24th July with significant additions and modifications.

A great tip for diagnosing Java memory issues

My good friend (and creator of Apache Wicket) Jonathan Locke just gave me a great tip.  I was aware of the jmap utility, but didn’t know it could do this:

Just type:

jmap -histo:live <pid>

Where <pid> is the process id of a Java process, and it will dump out all objects, starting with whichever is taking up the most memory.

The handy thing is that the Java process doesn’t need to have been started with any kind of special command line options, it will work with any running Java process.

I wish I knew this a few weeks ago while debugging a memory leak on a remote server…

The Guardian writes about Freenet

A few weeks ago I spoke to Andy Beckett of The Guardian in the UK, he was writing an article about “the darknet”, and wanted to talk to me about Freenet.  I was quite pleased about this because when I’m in the UK the Guardian is by far my favorite newspaper.

We had a good conversation, I talked about the original motivation behind Freenet (read about it here), our challenges, like balancing the very theoretical issues we face with the need to write software that non-techies can use, and other things.

At one point he mentioned the dangers of “bad” uses of Freenet, but he did so in a very coy almost apologetic way.  I immediately assumed he was thinking of child pornography, and made it pretty clear that I was more than happy to discuss that, indeed of the hundreds of conversations I’ve had with journalists about Freenet over the past 10 years, I can’t think of one where I didn’t discuss the CP issue!

I told him what I tell everyone, which is that like most people I wish CP didn’t exist, but there are many ways to get it other than Freenet, and I don’t think people should be denied the freedom to communicate just because a small minority might use it for something we don’t agree with.

My impression at the time was that he really wanted to get off the subject, which I thought was a bit surprising since journalists love controversy, and there are few things as controversial as child porn.

Anyway, Google Alerts just told me that the article has appeared on the Guardian’s website (skip to the bottom for the link).

Overall I think it was quite a good article, I was particularly excited to read:

Installing the software takes barely a couple of minutes and requires minimal computer skills

I think Andy may be the first mainstream journalist in the history of Freenet to actually install and use the software before writing about it!

I was a little surprised that it did focus almost exclusively on the negatives implications of an absolutist “free communication” philosophy.  For example the subtitle is:

In the ‘deep web’, Freenet software allows users complete anonymity as they share viruses, criminal contacts and child pornography

I was neither surprised nor annoyed that a journalist would want to talk about computer viruses, criminals, and child porn, after all – controversy attracts clicks and lets face it, our newspapers need all the help they can get attracting revenue these days.  Further, it is perfectly legitimate to talk about the fact that freedom of communication implies freedom for people to communicate data and ideas we don’t like.

I was a little surprised, however, that Andy didn’t seem terribly interested in discussing these issues when we spoke by phone – even though I made it very clear that I was happy to and I even tried to steer the conversation in that direction.

Regardless of this slight surprise, I haven’t yet noticed any major errors in it, at least no errors on Andy’s part (although the [sic] after the American spelling of “pedophile” is not exactly in the spirit of cross-Atlantic harmony).  I am curious about how he knew I was a pasty teenager (although he was quite right)!  Not so much now that I’ve been living in Texas for a few years.

I did find this interesting though:

According to the police, for criminal users of services such as Freenet, the end is coming anyway. The PCeU spokesman says, “The anonymity things, there are ways to get round them, and we do get round them.

If, by “the anonymity things” they are referring to stuff like simple referrers, or even just using your browser in “incognito” mode, then they are correct.  But if they are referring to technologies like Freenet and Tor – then either they are mistaken, lying, or they know something about Freenet and Tor that neither I, nor anyone I’ve ever met or heard from, has discovered.  This is extremely unlikely, I know many of the best security people in the world, and none of them work for the British police.

This isn’t to say that identifying a user of software like Freenet or Tor is impossible, but it would require either an impractical expenditure of resources (bugging computers, etc), or for the user to do something like accidentally disclosing their identity on Freenet.  We can guard against many things, but we can’t guard against stupidity :-)

Anyway, as I said its a pretty good article, and despite its focus on the negative – it will hopefully bring some new users to Freenet.  Read it here: The dark side of the internet

Polynomial regression on a large dataset in Java

For an internal project at my day job I needed to find the n degree polynomial that best fits a (potentially very large) dataset.

I asked a question on Reddit about this, and Paul Lutus was kind enough to respond with a link to some code that did something close to what I needed, however Paul’s code was not well suited to very large amounts of data.

So with his help, I modified his code to decouple it from the original application it was a part of, make it follow good Java coding practices a bit more closely, and in doing so make it more flexible and well suited to handling very large datasets.

The code is under the GNU Public License, which is unfortunate since it’s a library and the GPL is viral and Paul is unwilling to relicense under the LGPL, meaning that I or someone else will need to re-implement under a different license if the library is to be used within non-GPL’d code :-(

Here is the source, collaborative free software at work, and please comment if you find it useful!: PolynomialFitter.java

Simple bash script to name a tab in iTerm

I like to open a lot of tabs in iTerm, the open source replacement for Apple’s Terminal.app on the Mac. Unfortunately, its easy to forget which is which, as typically they don’t get very descriptive names.

In particular, I have a few scripts I run which will automatically ssh into a remote server, and run screen. When I wanted to find one of these, I’d often have to click several tabs to see which one I wanted (since they would all, rather unhelpfully, be called “ssh”).

So I wrote a simple script, which I call “nametab”, which allows you to name the tab you are in from the command line. You just type something like:

$ nametab New tab name

If you’d like to use this yourself, here is the code:

Just put it in a file in your PATH, and ensure that it is executable (ie. do a chmod u+x nametab).

A good pretty-printer for Google Gson

I decided to do something about the one thing I wasn’t happy about in Google Gson, the lack of good pretty-printing.

I created a class called GsonPrettyPrinter which outputs nice readable JSON in a reasonably intelligent way, you can grab the source code here.

Note, in particular:

  • Requires at least Gson 1.4 (the current version at the time of writing)
  • It will try to represent JSON objects on arrays on a single line if it can
  • It sorts JSON objects by their keys
  • The code is released under the Lesser Gnu Public License.

Which is the best Java JSON library?

I may not have tried them all, but I’ve tried quite a few including:

  • The default libraries from json.org
  • Jackson
  • XStream
  • Google Gson
  • JsonMarshaller
  • JSON.simple

I ended up going with Google Gson, the library is simple, efficient, intuitive and flexible.  It allows you to convert JSON objects directly into equivalent Java POJOs[1] and back again, which makes for clean and simple code.  

One annoyance with Gson is that its JSON pretty-printer isn’t verbose enough for my tastes, but at the time of writing (Oct 09) this is on their todo list. Edit (Sep 2011): I created my own JSON pretty-printer that works with Gson – see here for more information.

[1] “Plain Old Java Object”

Giving talk about Swarm in Seattle next month!

I’ve been asked to give a keynote at IEEE P2P’09 in Seattle in September, and I’ve decided to talk about Swarm. Swarm is my proposal for a new way to build applications that allow you to scale them across multiple servers in a way that is almost transparent to the programmer.

In preparation I’m currently working hard to develop a prototype that makes use of the upcoming delimited continuations support in Scala 2.8. My talk will be at 9:30am on Wednesday 9th September.

You can read more about Swarm here.