
Transrational game theory

“Any sufficiently advanced technology is indistinguishable from magic.” – Arthur C. Clarke

Fun Theory Optimal

During my freshman year of college I was introduced to a classmate named John, who turned out to be one of the most brilliant mathematical minds on the planet. He now runs one of the top-performing hedge funds of all time.

When I met John, he was studying to break the world record for calculating square roots of large numbers in his head. He told me it wasn’t that hard, once you memorized all the logarithm tables. Then you can just “look up” the answer in your mind’s eye. So it was with some trepidation that I invited him to our dorm room poker game.

Why did the tree start dancing? It found its log-a-rhythm!

John used his brilliance to calculate mathematically optimal moves in his head for each hand he played. This style of play is called game theory optimal, and it’s unbeatable if played by a computer, with no emotions or other tells. A “tell” in poker is when you unconsciously give your opponents information about the strength of your hand and/or what actions you will take in the future.

While John was doing his calculations, the rest of us were watching his eyes, facial expressions and body language. Most importantly, we noticed how long each calculation was taking before he made his move.

Although we couldn’t do the calculations ourselves, we knew that if he took a long time it meant that it was a close decision. So if he made a bet, we realized he was not that confident and we could easily bluff him off the hand. Conversely, if he acted quickly, we knew he knew where he stood: if he checked, we would bet; if he bet, we would fold.

That same poker game, with many of the same players from college, still runs today, 35 years later. John didn’t last through that first year though. Had he stuck around a few more years he would have met another math whiz, Phil Gordon, who discovered a hack for odds calculations that allows anyone to perform seemingly magical feats. And unlike John’s overwhelming log tables, Phil Gordon’s Rule of 4 & 2 requires no memorization, only simple math we all learned in 3rd grade.
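For readers who have not met it, the Rule of 4 & 2 is usually stated like this: count your outs (the unseen cards that would complete your hand), then multiply by 4 on the flop, with two cards still to come, or by 2 on the turn, with one card to come. The result approximates your percentage chance of hitting. The Python sketch below compares that shortcut with the exact odds. The function names and the nine-out flush draw are illustrative assumptions, not details from the dorm game.

```python
from math import comb

def rule_of_4_and_2(outs, street):
    """Shortcut estimate, in percent, of completing a draw."""
    multiplier = {"flop": 4, "turn": 2}[street]
    return outs * multiplier

def exact_percent(outs, street):
    """Exact chance, in percent, of hitting at least one out.

    Assumes a 52-card deck where we know only our own two hole cards
    and the board: 47 unseen cards after the flop, 46 after the turn.
    """
    unseen, cards_to_come = {"flop": (47, 2), "turn": (46, 1)}[street]
    # Probability that every remaining card misses all of our outs.
    all_miss = comb(unseen - outs, cards_to_come) / comb(unseen, cards_to_come)
    return 100 * (1 - all_miss)

if __name__ == "__main__":
    # Hypothetical example: a flush draw has 9 outs.
    for street in ("flop", "turn"):
        estimate = rule_of_4_and_2(9, street)
        exact = round(exact_percent(9, street), 1)
        print(street, estimate, exact)
```

For the nine-out example the shortcut says 36% on the flop and 18% on the turn, against exact figures of roughly 35% and 19.6%. That is close enough to act on at the table, which is the whole appeal.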

Golden balls!

Imagine you find yourself on a game show where you and your fellow contestant have an opportunity to win a combined pot of $13,600. The game works like this: each contestant writes down secretly whether they want to split the pot, or steal the whole pot for themselves. You will then reveal what you wrote down for everyone in the audience to see. The host explains:

“If you both choose to Split, then indeed you will each walk away with $6,800.

If you both choose to Steal, then you each walk away with nothing.

But if one of you Splits and the other Steals, the thief wins the whole pot, and the sucker goes home with nothing.

I want you to consider what you will choose for a moment before we go on.”

Being an Ivy League graduate in Game Theory, you quickly calculate that against an opponent who is equally likely to Split or Steal, the most you can expect from this game is $6,800. With no information about them, you figure you could literally flip a coin and call that game theory optimal (GTO) strategy.

Being really sharp, you notice the host has a British accent, which increases the likelihood that he’s an evil genius. As your Western Civ professor had explained, a lah-BOHR-atory is a laboratory used for evil purposes. Sure enough, your read of the host is spot on:

“Before you make your final selection, I’ll give you a few moments to discuss the matter. Maybe you’ll come to some kind of agreement.”

FUCK!!! That ruins everything, you realize instantly. 

The coin flip GTO strategy works only when you have zero knowledge of your opponent. 

Once you have a sense of their bias — whether they are more likely to Split or to Steal — then you have to factor that into your analysis. 

If they are more likely to Split, then your best move is to Steal and win the whole $13,600. That’s DOUBLE your expected value in the “zero knowledge” scenario.

If you suspect they are more likely to Steal, then you realize you can’t improve your expected value by Stealing, since you both get nothing in that scenario. So you calculate your best choice is to Split and hope to get lucky by being wrong. That really messes with your head because you hate being wrong, and you hate depending on luck.
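The bias reasoning above can be checked with a few lines of Python. This is a minimal sketch that encodes only the host's payout rules. The name `p_split` (the probability you assign to your opponent choosing Split) is my own label for the "bias", not something from the show.

```python
POT = 13_600

# (my_choice, their_choice) -> my payout, taken from the host's rules.
PAYOUT = {
    ("split", "split"): POT / 2,
    ("split", "steal"): 0,
    ("steal", "split"): POT,
    ("steal", "steal"): 0,
}

def expected_payout(my_choice, p_split):
    """My expected winnings if the opponent Splits with probability p_split."""
    return (p_split * PAYOUT[(my_choice, "split")]
            + (1 - p_split) * PAYOUT[(my_choice, "steal")])

if __name__ == "__main__":
    for p_split in (0.0, 0.5, 1.0):
        print(p_split,
              expected_payout("split", p_split),
              expected_payout("steal", p_split))
```

With `p_split = 0.5` the sketch gives 6,800 for Steal and 3,400 for Split. With `p_split = 0` both choices pay nothing, which is the "you both get nothing" case, and with `p_split = 1` Steal collects the full 13,600.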

The other player shakes you out of your GTO reverie by making the first volley.

“I’m a really nice person,” she says, “I always cooperate, and I want to split the pot with you. I’m going to choose Split.”

Your head starts spinning as you try to figure out if she’s sincere, or whether she’s just bluffing you so she can get you to Split and then choose Steal and win the whole pot. Maybe she went to a better Ivy League school and majored in Behavioral Psychology or Management Sciences?

Just as you are descending into a pit of despair, ruing the day that you chose to go on this stupid game show, you remember that this is a thought experiment I created and you are not on the game show!

Then I reveal to you that the game show really does exist, that it has a British host, and that the $13,600 up for grabs is really £13,600. This six-minute clip of what actually happened is the most phenomenal triumph of humanity over rationality I’ve ever seen:

At first blush it appears that Nick played the ultimate rationalist strategy by biasing Ibrahim away from Stealing. As my analysis above shows though, the ultimate rational move for Nick would have been to follow through with his ultimatum to Steal, but then to renege on his promise to split the money after the show. That would have gotten him twice as much money as he ended up with.

Who could have faulted him for such gamesmanship? After all it’s a game! A game based on lying and betrayal. The contestants know going in what the rules are. They are invited on the show for free.

In fact, this show is not about rationality or game theory at all. It’s about trust, and it’s about care for complete strangers.

Ibrahim had to trust Nick that he would split the money after the show, otherwise the most rational move for Ibrahim was to choose Steal and hope that Nick was lying and would choose Split, thus netting Ibrahim the full pot.

Nick had to trust Ibrahim to choose Split — a man he just met, who was in the way of Nick winning a lot of money. Nick had played Ibrahim perfectly by convincing him to Split; now all Nick had to do to assure himself of at least half the pot was to follow through and choose Steal himself.

Once Nick had secured the whole pot with his clever gambit, he could have total control of his destiny — even if Nick’s true intention was to follow through with his promise to split the money after the show.

Instead, Nick surrendered control and put his fate in Ibrahim’s hands with the surprising twist at the end.

The Rationalist’s Dilemma

In 2006 I found myself at a final table of the World Series of Poker event. There were four of us remaining and here was the prize pool up for grabs:

2006 WSOP

Pot-Limit Holdem Payouts

1st Place: $346,000

2nd Place: $181,000

3rd Place: $105,000

4th Place: $75,000

Each of us was guaranteed to win at least $75,000. If one of us were to be eliminated, the remaining three would net an additional $30,000 guaranteed (i.e. the difference between 3rd and 4th place).

It’s at this point in a tournament that you have to be really careful about which hands you choose to get involved in. If you see that an opponent has a chance to be eliminated by another opponent, the GTO move is almost always to stay out of the hand. Why risk being eliminated yourself when you can pick up an extra $30,000 risk-free?

However, if you can trust your opponent to not betray you, there’s a better strategy than GTO. It’s called “implicit collusion.” Implicit collusion is where two or more players go into a pot against an all-in player, dramatically increasing the odds of eliminating the all-in player.

I found myself in this situation, where I had to choose whether to risk getting involved in a pot to eliminate the shortest stack (George Bronstein), or sit safely on the sidelines while Eric Lynch did the dirty work.

The key to making implicit collusion work is a tacit agreement between rivals (me and Eric in this case) to play passively so as not to pressure one another to fold. If either of us beats George, we both gain an additional $30,000 in guaranteed prize money.

But how can we trust each other to not defect from our tacit agreement? After all, Eric and I both have strong incentive to eliminate one another and guarantee ourselves a lot more than $30,000.

Unlike Golden Balls participants, professional poker players are ruthless competitors — under no illusion that they are meant to cooperate with one another.

To make matters worse, it’s against the rules to explicitly cooperate or even discuss how we are going to play our respective hands while the game is in progress.

To make implicit collusion even more treacherous, if either of us decides to defect from the cooperative strategy, the other is worse off than if they had defected first. So there is a strong incentive for each of us to betray the other at every point in the hand.

For the analysis that follows, I recommend watching the clip above and pausing at each decision point.

As we pick up the action, George is first to act and decides to raise with his A2, a very good hand with only four players remaining. Since this is Pot-Limit Holdem, he cannot raise all-in yet. However, everyone knows George is pot-committed and will be going all-in in the next betting round.

Eric immediately deviates from GTO strategy by flat calling George’s raise, rather than folding or raising George’s bet. A purely rational player would be worried that the player in the big blind (me) would try to isolate heads up against George by raising and forcing Eric to fold. Thus, Eric is creating a very clear signal that he wants to team up with me to eliminate George.

Now I look down at my cards to see 97, which under normal circumstances is a clear fold. But since I noticed Eric’s signal that he wants to collude with me to eliminate George, I figure it’s in my best interest to play along.

But what if Eric is planning on double-crossing me? Eric has triple the amount of chips that I do, and once I make the call, I would be foolish to fold thereafter. In other words, I’m committing myself to risk everything on this hand, but desperately hoping to not have to.

I need to have faith that Eric is going to continue to play cooperatively with me, even if he has a very strong hand. If I’ve misplaced my trust though, not only am I risking the risk-free $30,000 but I’m also risking the $346,000 and championship title that comes with first place.

So how does one get to a place of trust like this in the heat of battle in a cutthroat game like poker?

Before the final table began, I reached out to my friends to do research and give me a dossier on all of my opponents. What came back on Eric is that he’s a top online tournament pro. This made it clear to me that he not only knows GTO strategy, but must also understand the power of implicit collusion to increase his tournament equity. I assessed that he wouldn’t fall prey to fear or greed easily.

I also found out that he was a family man from middle America, who was likely also a devout Christian. His online handle was Risen.

Watching Eric play for the last eight hours or so confirmed my read on his playing tendencies, as well as his character and conviction. I’d seen him look heavenward after he’d risen from the dead several times.

Because I assessed Eric as being a man of faith, I trusted him to have faith in me to not defect from the implicit collusion. There were other players at the final table that I didn’t trust to know the strategy, nor did I trust they would not betray me.

Given my read on Eric as a “transrational” player, I made the flat call. This signaled back to him that I was also down for the conspiracy. 

Mathematically, you can see the power of implicit collusion as you watch the percentages change in the video coverage above.

When George’s A2 is compared solely to Eric’s QJ, George has a 57% chance of winning the hand. Once I make the call with my 97, George’s chances of winning drop to 34%. In other words, my chances (and Eric’s chances) of picking up the $30,000 premium go from 43% to 66% once Eric and I tacitly agree to cooperate.
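To put those percentages in dollar terms, here is a rough Python sketch of the expected value of the pay jump. It makes loud simplifying assumptions: it treats George’s elimination chance as simply 100 minus his win percentage (ignoring split pots), and it ignores the chips Eric and I put at risk as well as the later pay jumps. The function names are mine, and the percentages are the ones quoted from the broadcast.

```python
# Prize money by finishing place, from the payout table above.
PAYOUTS = {1: 346_000, 2: 181_000, 3: 105_000, 4: 75_000}

def elimination_chance(short_stack_win_pct):
    """Chance (in percent) the all-in player busts, ignoring split pots."""
    return 100 - short_stack_win_pct

def expected_premium(short_stack_win_pct):
    """Expected share of the 4th-to-3rd pay jump for each surviving player."""
    pay_jump = PAYOUTS[3] - PAYOUTS[4]
    return pay_jump * elimination_chance(short_stack_win_pct) / 100

if __name__ == "__main__":
    print("heads-up:", expected_premium(57))   # George vs. Eric only
    print("three-way:", expected_premium(34))  # George vs. Eric and me
```

Under those assumptions, the guaranteed-money component is worth about $12,900 to each of us when George faces Eric alone, and about $19,800 once I join the pot.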

The flop comes 3-8-7, Eric checks, I check, and George goes all-in as expected. Eric calls and it’s my turn to act.

Now, you may have noticed that once the flop came and paired my 7 and gave me a flush draw, my chances of winning the hand outright skyrocketed from 29% to 73%. I didn’t know what Eric had, but I knew I had a fantastic hand.

In a normal situation my best rational play would be to raise and build a side pot with Eric. After all, the more chips I win now, the more likely I am to move from third place into first.

But in the current context, if I were to bet any amount, Eric would know that I had hit the flop hard and thus he would only call if he had an even stronger hand.

It’s an ironclad principle in poker: never bet a hand that can only be called if it’s beat. So my best chance of winning more from Eric at this point is if he bets into me on the turn or river (and I surmise I still have the best hand).

The turn card comes the 8 of spades, and Eric and I both check. George is all-in so he can’t bet anymore; he just has to sweat it out until the showdown.

The river comes the King of spades.

At this point the announcers and audience know (but we players don’t) that I have the best hand, and the only way I can lose is if Eric bluffs me out of the hand. A bluff certainly would have worked at this point, since my hand, which was very strong on the flop, was now vulnerable to Eric having paired a King in his hand or hit a spade flush.

Eric is faced here with a dilemma within the larger dilemma. Putting myself in Eric’s shoes at this juncture, I would be thinking that I might have George beat, because George could literally be playing any two cards in this spot. Eric’s QJ, while not very strong, might be strong enough to beat the all-in player, but not strong enough to beat my hand as well. So there’s a good argument to be made that the rational strategy for Eric here is to make a small bet and hope that I don’t have a strong enough hand to call or raise him with.

Eric had faith though.

He made his best transrational assessment of how to play this hand from the beginning based on the math of implicit collusion. He didn’t waver from that conviction, even when there were temptations and reasons to doubt.

The Prisoner’s Dilemma

In my freshman Western Civ class I was introduced to a concept called the Prisoner’s Dilemma, which is “any situation where self-interest compels multiple parties to act against their collective interest, leading to detrimental outcomes or even destruction”.

From a game-theoretic standpoint, both Golden Balls and Implicit Collusion are examples of a Prisoner’s Dilemma (PD). You can read about the original thought experiment and mathematical model here.

It turns out that PD situations arise everywhere, all the time, not only in human civilization but also in plant and animal ecosystems and in biology.

For a deep dive on the ubiquity and complexity of PDs, I recommend reading Meditations on Moloch or watching Multipolar Traps: When Evil Thinks It's Good or The Superorganism and Multipolar Traps. Then check out poker champion Liv Boeree’s YouTube channel, Win-Win.

Or if you only have four minutes you can tune into this delightfully narrated “children’s story”, What if Everybody Did That? 

Here are three very real (and existential) PDs the entire world is engaged in at the moment:

  1. Nuclear Proliferation: Every nation would be safest if no nation had nuclear weapons, but once a single other country acquires one, it’s in every other country’s best interest to acquire two to protect themselves, thus creating an arms race and increasing the risk we will all die.

  2. Environmental Degradation: Everyone is worse off if someone else takes a selfish approach (individuals buying plastic-wrapped products, commercial overfishing, non-renewable national energy policies, etc.), but economic self-interest trumps altruistic motives at every level, leading to mass extinctions and an increasingly uninhabitable planet.

  3. Technological Dilemmas: We all want access to, and the freedom to use, social media, smartphones and artificial intelligence tools; but the net effect for most of us is clearly negative.

From a rational, game-theoretic perspective, there is no truly stable solution. The incentives for each party to betray the trust of others are too strong to overcome.

The most stable solution to the Prisoner’s Dilemma that math and computer simulations have found is the strategy called “tit-for-tat,” which is exactly what it sounds like, and you probably learned it on the schoolyard or from your siblings. I’ll play nice as long as you play nice, but if you screw me over, I’ll screw you back.

Tit-for-tat is a way of building trust amongst people who don't know one another, by creating a history of cooperation with one another.

The reason tit-for-tat is not a stable solution is twofold:

  1. It requires there to be future interactions; if this is a one shot deal, then tit-for-tat doesn’t apply and the rational strategy is to simply screw your neighbor first.

  2. Unless those future interactions are infinite (or at least unknown in number), there is a strong incentive to play nice mostly… but cheat as often as you can get away with.
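Both tit-for-tat and the first weakness listed above can be sketched in a short Python simulation. The payoff numbers (3 each for mutual cooperation, 5 and 0 when one side betrays, 1 each for mutual betrayal) are the conventional textbook values and are an assumption on my part, as are the function names.

```python
# (move_a, move_b) -> (score_a, score_b); "C" = cooperate, "D" = defect.
# Assumed textbook payoffs, not figures from this essay.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(my_history, their_history):
    """Play nice first, then copy whatever the other player did last."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    """Screw your neighbor every time."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Run a repeated game and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        gain_a, gain_b = PAYOFFS[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 10))    # a history of cooperation
    print(play(tit_for_tat, always_defect, 10))  # betrayed once, then retaliates
    print(play(tit_for_tat, always_defect, 1))   # the one-shot case
```

Over ten rounds, two tit-for-tat players earn 30 points each, while a tit-for-tat player facing a constant defector ends up with 9 against 14. In the single-round game the defector simply wins 5 to 0, because tit-for-tat never gets a chance to retaliate.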

It was the Cold War arms race that motivated my Western Civ professor to spend time on the Prisoner’s Dilemma in class. Notably, tit-for-tat loses to the country that fires a preemptive nuclear strike.

So the solution that the United States and Soviet Union came up with is the aptly named policy called MAD: Mutually Assured Destruction.

As the name implies, both nations unilaterally declare the ultimatum: if you launch just one missile, we will unload our entire arsenal to blow your country to smithereens to take you out of the game. Think of it as tit-for-you’re-dead.

So far MAD seems to have worked for preventing World War III. But as this analysis shows, maybe not for much longer.

While it is possible to do better than tit-for-tat and Mutually Assured Destruction, the solution does not (and cannot) arise from GTO strategy, scientific analysis or rational thinking alone.

In Math We Trust

Way back when, if you had extra eggs and wanted to trade them for milk to be delivered later, you had to really be able to trust who you were dealing with. In a local community where people had history together and reputations, trust was easily achievable.

As societies grew beyond local or tribal units, money was needed as a proxy for trust. You could trade your eggs now for money, knowing that money was inherently valuable to your future trading partners.

The original money was portable commodity items that had intrinsic value to all humans: seeds and grains.

Later we graduated to metal coins. Coins still had inherent value (think about all the things you could make with metals like copper, silver and gold if you melted them down) but also had the advantage that they were not perishable.

With coinage, new trust issues arose. How do you know that the merchant didn’t dilute the purity of the gold with less valuable metals? Or imperceptibly shave off some of the metal so there is less of it than you expect? Both of these practices became commonplace, eroding trust in the monetary systems of the time.

For these and other reasons (like the desire of those in power to levy taxes) trusted intermediaries arose who minted the coins and vouched for their authenticity. One technique for engendering trust was standardization, such as standard weights and sizes and hard-to-forge insignias. The real trust though came from the fear of imprisonment or violence if someone was caught forging or otherwise cheating. The trusted intermediaries were authorities with police and armies at their command, namely kings or governments.

Later, metal coins evolved into demand notes, which the holder of the note could redeem for physical gold at any time.

Demand notes created a lot of trust. We trusted that gold would be considered valuable to everyone we would want to trade with, as it has been for thousands of years. We trusted the government to make good on its promise to redeem our demand notes for gold.

Then, over the course of the 20th century, we went off the gold standard and replaced it with “the full faith and credit” in the government to manage the supply and issuance of dollars through a complex and interlocking system of public and private institutions. In the U.S. this includes the Treasury, the Federal Reserve and commercial banks. We all learned how misplaced that faith was in 2008 during the global financial crisis.

It was in this context that the Bitcoin protocol was invented. The creator figured out that we could replace trusted intermediaries with objectively verifiable mathematics.

The Bitcoin solution to money is an inverted Prisoner’s Dilemma: the more everyone acts out of greed and fear, the more trustworthy the money is. In other words, Bitcoin is the game theory optimal form of money.

The profound implications of Bitcoin’s rational solution to the Prisoner’s Dilemma of money cannot be overstated. Those who understand the game theoretics know that Bitcoin will inevitably replace fiat currencies (like the U.S. Dollar, Euro and Yuan) as the de facto form of money for individuals and small businesses in the years to come. This is why most governments are scared shitless and are trying everything they can to shut it down and to keep normal citizens from using it. But behind closed doors, they admit that Bitcoin is unstoppable, in the same way that the internet is unstoppable.

My belief though, and my hope, is that once the inevitable happens and individuals are free from the tyranny of government- and corporate-issued money, we will quickly realize that we have the power to program new forms of money, based not on fear and greed, but rather on cooperation and shared values.

What we need is transrational money.

