Saturday, September 14, 2013

Poker: running cards multiple times

Consider the following a quick exercise in combinatorics. We are investigating the effects of running cards twice. You can see a real life example here. It is known that the EV doesn't change when you run multiple times (but you lower your variance). Let us check this claim.

Let's take the case of KK vs AA all-in after a blank flop. After the flop, there are 45 cards left. If we run it once:

EV for KK = Pr(K on turn and blank on river) + Pr(blank on turn and K on river) + Pr(K's on turn and river) = 2/45 * 41/44 + 41/45 * 2/44 + 2/45 * 1/44 = 8.383838...% (a "blank" here is any card that is neither an A nor a K)

Notice that the first two terms are the same because turn/river is interchangeable. Double checking this on pokerstove and using a flop with 0 chances of runner runner flush/straights, we get 8.384%. Nice. Exact.

Let's say we run it a second time. There are several possibilities for the first run:

  • one A came out (2/45 * 41/44 * 2 = 8.2828%)
    • then EV for second run is 2/43 * 40/42 * 2 + 2/43 * 1/42 = 8.9701%
  • two A's came out (2/45 * 1/44 = 0.1010%)
    • then EV for second run is 2/43 * 2 - 2/43 * 1/42 == 2/43 * 41/42 * 2 + 2/43*1/42 == 9.1915%
  • one K and one A came out (2/45 * 2/44 *2 = 0.4040%)
    • then EV for second run is 1/43 * 41/42 * 2 = 4.5404%
  • one K came out and no A's came out (2/45 * 41/44 *2 = 8.2828%)
    • then EV for second run is 1/43 * 40/42 * 2 = 4.4297%
  • two K's came out (2/45 * 1/44 = 0.1010%)
    • then EV for second run is 0%
  • no A/K came out (41/45 * 40/44 = 82.8283%)
    • then EV for second run is 2/43 * 39/42 * 2 + 2/43 * 1/42 = 8.7486%
The above EVs were also double checked with pokerstove using dead cards (A, K and blanks) and should be the exact probabilities. Adding these all up, the EV over all cases for the second run is 8.3838%- same as the first run.
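
The case-by-case sum above can be double checked with exact fractions. This is a minimal sketch under the same assumptions as the post (2 kings, 2 aces and 41 blanks left, no straights or flushes possible); the function names kk_equity and second_run_equity are my own and not from any poker library.

```python
from fractions import Fraction
from math import comb

def kk_equity(kings, aces, blanks):
    """Exact chance that KK beats AA over the next two cards.

    KK wins only with a king and no ace: one king plus one blank
    (in either order), or both kings.
    """
    n = kings + aces + blanks
    king_and_blank = 2 * Fraction(kings, n) * Fraction(blanks, n - 1)
    two_kings = Fraction(kings, n) * Fraction(kings - 1, n - 1)
    return king_and_blank + two_kings

def second_run_equity(kings=2, aces=2, blanks=41):
    """Average the second-run equity over every possible first run."""
    total = Fraction(0)
    ways = comb(kings + aces + blanks, 2)
    for k in range(3):              # kings used up by the first run
        for a in range(3 - k):      # aces used up by the first run
            b = 2 - k - a           # blanks used up by the first run
            p_first = Fraction(comb(kings, k) * comb(aces, a) * comb(blanks, b), ways)
            total += p_first * kk_equity(kings - k, aces - a, blanks - b)
    return total

first = kk_equity(2, 2, 41)
second = second_run_equity()
print(first, float(first))
print(second, float(second))
```

Both lines should print the same fraction, which is the point of the exercise: the cards dealt on the second run are just as random as the cards dealt on the first.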

Friday, August 30, 2013

Poker: Pot Odds 2

Some follow up thoughts on the first post about odds. To get the most precise numbers for your hand's equity/odds, you should compare pot odds with your hand's odds of hitting its outs.

So using the same scenario from the first post, the third/most precise method is as follows:

Method 3 == Pot Odds vs Odds of Hitting Outs
Pot odds == 1:5
Your hand has 8 outs. With the turn card to come, there are 52 - 5 = 47 unseen cards. Your odds of hitting your outs are 8 : (47-8) == 8:39 == 1:4.875
Since your odds of hitting (1:4.875) are better than the pot odds (1:5), you should call.

So note that this is much more precise, and only deals with the turn card (ie. if you had read the second post about bet sizing, this assumes that you will bet optimally on the turn again and that effective stack size is large enough for optimal bet)

However, the disadvantage of pot odds is that when the probability is small, the odds change very rapidly with any small change in probability. ie. 1:49 vs 1:99 is actually only 1% apart (2% vs 1%). So whereas for equity calculation, you are "safe" if you have say a 5% buffer between required equity vs hand equity, here there is no similar rule and you might be easily misled as to how much edge you have.

Let's look at what a 5% equity edge means for different pot odds.

| Bet Size | Pot Odds | Required # of Turn Outs Without Buffer | Equity + 5% Buffer | Required Hand Odds With Buffer | Required # of Turn Outs With Buffer |
|---|---|---|---|---|---|
| 1/4x pot | 1:5 | 7.8 | 21.7% | 1:3.6 | 10.2 |
| 1/2x pot | 1:3 | 11.8 | 30.0% | 1:2.3 | 14.2 |
| 1x pot | 1:2 | 15.7 | 38.3% | 1:1.6 | 18.1 |
| 2x pot | 1:1.5 | 18.8 | 45.0% | 1:1.2 | 21.2 |

Without the 5% buffer, the Required Hand Odds should equal pot odds. Instead, as you can see, 1:5 -> 1:3.6, while 1:1.5 -> 1:1.2. Hence there is no easy way to build in some buffer with pot odds. However, you may have noticed that the required outs actually increases by a constant (2.4 outs) when you build in a 5% buffer. This makes perfect sense: 5% of the 47 unseen cards is about 2.4 outs, so 2.4 outs --> an extra 5% chance of drawing out.

Thus the optimal way to give yourself a buffer is to calculate the odds as is, but then require a couple extra outs to be conservative.
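
The numbers in the table can be regenerated with a few lines. This is a rough sketch that assumes 47 unseen cards and a single card to come; the helper names (required_equity, hand_odds, required_outs) are mine.

```python
UNSEEN_CARDS = 47  # 52 minus two hole cards and three flop cards

def required_equity(bet_fraction):
    """Equity needed to call a bet of bet_fraction times the pot."""
    # The caller risks b to win the pot plus the bet, 1 + b.
    return bet_fraction / (1 + 2 * bet_fraction)

def hand_odds(equity):
    """Return x such that the equity corresponds to odds of 1:x."""
    return (1 - equity) / equity

def required_outs(equity, unseen=UNSEEN_CARDS):
    """Number of outs that gives this chance of hitting on one card."""
    return equity * unseen

for label, fraction in [("1/4x", 0.25), ("1/2x", 0.5), ("1x", 1.0), ("2x", 2.0)]:
    equity = required_equity(fraction)
    buffered = equity + 0.05
    print(f"{label:>4} pot | pot odds 1:{hand_odds(equity):.1f}"
          f" | outs {required_outs(equity):.1f}"
          f" | equity + 5% {buffered:.1%}"
          f" | hand odds 1:{hand_odds(buffered):.1f}"
          f" | outs {required_outs(buffered):.1f}")
```

Computing the buffered outs directly from the equity gives figures within about 0.1 of adding a flat 2.4 outs, since 5% of 47 is 2.35.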

Interestingly, if you look at the # of outs required for 1x pot and 2x pot bets on the flop, they are actually "ahead" (> 50% chance of drawing out by the river). So it seems as though any drawing hand that could call a > pot sized bet could also just raise or push all-in. More on this next.

Wednesday, August 28, 2013

Poker: Bet Sizing

Following the last introductory post to poker, here is an example on how to determine bet sizing based on the texture of the flop.

Let's say effective stack size is 200 BB, we are in button/cutoff, it was limp/folded to us, and we raised to 4-6 BB and one player called. So the pot is ~10 BB (8.5-13.5 BB depending on dead money/size of raise) and effective stack size is 195 BB.

Let's say the flop comes with two flush cards (two cards of the same suit) and you have top pair top kicker and you are committed to calling the all-in even if another flush card comes because you have a weird hunch/you are tilting etc. Let's look at what you need to bet, taking into account the implied pot, to push a flush draw opponent out of the pot.

The flush draw opponent has 36% equity -> you want to lay 1.8:1 odds for optimal play -> with an implied pot of 10 BB + 195BB, that is a bet of 205/1.8 = 114 BB. This sounds ridiculously huge and incorrect. There are two reasons for this. 

  1. Being willing to commit 200 BB on a flush board with top pair is an incorrect play. 
  2. The existence of a turn bet means that the true "correct bet" amt is much lower. 
We will look at these two lines of thought below and examine what the "correct bet" should be if we take such factors into account.

How much you would typically/normally commit with top pair top kicker
Generally, with top pair or worse, you want to control the pot to be medium sized. Let's look at what that means for the flop/turn/river betting rounds:

  1. if betting sizes were pot sized, I would say two bets would already be really pushing it. (ie. you would control the size by either bet/check/bet or bet/bet/check etc). In this case the pot would be 90 BB by showdown and you and your opponent would have each put in another 40 BB.
  2. if betting sizes were 1/2 pot, it is probably feasible to bet on all three rounds. In this case the pot would just be 80 BB (35BB each since flop). But betting like this might be stupid with the flush draw on the board.
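
The pot sizes in the two lines above are easy to reproduce. A small sketch, assuming a 10 BB starting pot, heads-up play and every bet getting called; pot_after_bets is a name I made up for illustration.

```python
def pot_after_bets(pot, bet_fractions):
    """Return (final pot, chips each player added) after called bets.

    Each entry in bet_fractions is a bet size as a fraction of the
    current pot; a 0 stands for a street that is checked through.
    """
    invested = 0.0
    for fraction in bet_fractions:
        bet = fraction * pot
        pot += 2 * bet      # the bet plus the call
        invested += bet
    return pot, invested

print(pot_after_bets(10, [1, 1, 0]))        # pot-sized bets on two streets
print(pot_after_bets(10, [0.5, 0.5, 0.5]))  # half-pot bets on all three streets
```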

From these rough estimates, it seems that we should be willing to commit another 35-40 BB to this 10 BB pot. Let's say that the implied pot size is 10 + 35 BB. Then the optimal bet is 25BB, or 2.5x the pre-money pot. This certainly seems more reasonable but from what we know empirically about poker betting, it still seems to be on the high end. Let's move on to consider the fact that there can still be betting on the turn/river.

Betting on the turn

From the conclusion of last post, remember that by betting optimally against a dog, each time you bet, you are effectively causing them to lose money equivalent to them folding to the bet in the long run. I would thus make the following statement:

By betting optimally on the turn, it is as though we have cut off the opponent's chances to draw on the river. ie. it is as though the opponent loses the pot right here on the turn.

Now, we can see that perhaps our opponent doesn't have 36% equity, since after they see one card on the turn, we can bet again, which is equivalent to cutting them out right there. So they have 9 outs out of 47 possible turn cards. So we could lay 38:9 ~= 4.2:1 odds to price them out. With a 45BB implied pot, this means betting 45/4.2 ~= 10.7 BB ~= 1.1x the pre-money pot. This is more in line with what we know as "normal" betting.

What is interesting is that if you follow this line of reasoning/betting on the flop, you are committed to betting the turn 100% of the time if no flush cards come out- otherwise you are actually giving your opponent 36% equity and good odds to call on the flop.

So it turns out that maybe our flop bet sizing could vary depending on our game plan on the turn.

As a matter of exercise, let's look at what our proper bet on the turn is. Let's say we just bet 1x pot on the flop, so the turn pot is 30BB (and we are ready to commit another 25-30BB). What is the optimal bet on the turn if no flush card comes? They have 9 outs out of 46 river cards, so we lay 37:9 ~= 4.1:1 odds. With an implied pot of 60BB, this means betting 60/4.1 ~= 15BB, or 1/2 the pre-money pot.

So either you bet [1.1x pot on flop AND 0.5x pot on turn], or maybe you need to overbet on the flop by betting (at most?) 2.5x pot if you intend to (sometimes?) check the turn.
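
Here is the bet sizing arithmetic from this post in one place. It is only a sketch of the rule of thumb used above (bet = implied pot divided by the odds you want to lay), with helper names of my own choosing.

```python
def odds_against(outs, unseen):
    """Odds against hitting on the next card, as x in x:1."""
    return (unseen - outs) / outs

def pricing_bet(implied_pot, odds):
    """Bet that lays `odds`:1 against the given implied pot."""
    return implied_pot / odds

# Committing the full stack against 36% equity (about 1.8:1).
full_commit = pricing_bet(10 + 195, 1.8)
# Flop bet when only the turn card counts: 9 outs, 47 unseen cards.
flop_bet = pricing_bet(10 + 35, odds_against(9, 47))
# Turn bet after a pot-sized flop bet: 9 outs, 46 unseen cards.
turn_bet = pricing_bet(30 + 30, odds_against(9, 46))

print(round(full_commit), round(flop_bet, 1), round(turn_bet, 1))
```

The turn figure comes out a little under 15 BB because the post rounds the odds to 4.1:1 and then rounds the bet up.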

Friday, August 23, 2013

Poker: Intro to Pot Odds

Here's some useful background for anyone interested in poker. Frequently, when someone bets into you, and you are chasing a draw, you need to decide if you want to call or not. Below, we look at a couple ways to think about this.

Before we start, one important definition to get out of the way. The term equity/pot equity may be confusingly used to refer to a % of the pot or an actual $ amount. I will use %equity and $equity to differentiate between these. Both %equity and $equity look at what would happen on average if your hand was taken to show down without any further betting.

Take an example on the turn where the pot is $100. Then the opponent bets $25 into the pot. We need to decide if we call or not.

Simple Pot Odds
We are doing the analysis below with the assumption that opponent is all-in (or that there is no more betting on the river). Intuitively, we might already think that it is cheap to call (we can call because the bet was small)- let's see what the math says:

Method 1 == pot odds required probability vs %equity

Pot Odds
  • the opponent just bet 1/4 of the pot.
  • the pot odds that you are getting is 25:125 == 1:5
  • ie. pot odds are post money (inclusive of bet)
  • from the pot odds, the required equity (probability of winning) in order to call is 1/(5+1) = 1/6 = 17%
%Equity (== Your Hand's Actual Probability to Win)
  • looking at your hand vs your opponent's range, what is your probability of winning?
  • let's say you put your opponent on a pair or better and you just have a straight draw.
  • you only win if you hit your draw (8 outs, which I round to an 18% chance of hitting; 8/46 is about 17.4%)
Now compare pot odds probability (17%) to your hand's probability to win (18%) and since hand probability > the probability required from pot odds, this is callable. (maybe in practice you might demand say a 5% buffer before saying it's callable?)

Method 2 == EV calculation
folding = $0 
calling = 18% * 125 + 82% * (-25) = $2
So it's +EV to call, so call.
Notice that here, you already take into account the pot size (125) and the bet size (25).
Compare this to method 1 (the pot odds vs probability method): the probability calculation doesn't take into account the bet/pot ratio, and that's why you need to compare pot odds to it in method 1.

I think in practice, method 1 is actually easier to work out over the board.
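
For anyone who prefers to see both methods side by side, here is a minimal sketch using the same $100 pot, $25 bet and 18% hand equity. The function names are mine.

```python
def required_equity(pot_before_bet, bet):
    """Method 1: break-even win probability when calling `bet`."""
    # You risk `bet` to win the pot plus the opponent's bet.
    return bet / (pot_before_bet + 2 * bet)

def call_ev(win_prob, pot_before_bet, bet):
    """Method 2: expected value of calling, relative to folding at $0."""
    return win_prob * (pot_before_bet + bet) - (1 - win_prob) * bet

needed = required_equity(100, 25)
ev = call_ev(0.18, 100, 25)
print(f"required equity {needed:.1%}, EV of calling ${ev:.2f}")
```

The two methods always agree on the decision: the EV of calling is positive exactly when the hand equity is above the required equity.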

Implied Pot Odds

Now let's say we were not all-in (there is more betting on the river). Let's say there's another $50 behind in effective stack size after the call. It's actually very easy to do implied pot odds:
25:(125+50) == 1:7
That's it. The required equity is 1/(7+1) = 12.5%.
There is no change in hand equity, so 12.5% vs 18% means a much bigger reason to call (you have much more juice).

Looking at the EV method, this is 18% * 175 + 82% * (-25) = $11
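
The implied version only changes the amount you stand to win. A short sketch, assuming (as the post does) that the whole $50 behind goes in when you hit and nothing more goes in when you miss; the names are hypothetical.

```python
def implied_required_equity(bet, pot_after_bet, behind):
    """Break-even win probability when `behind` is won on top of the pot."""
    winnings = pot_after_bet + behind
    return bet / (bet + winnings)

def implied_call_ev(win_prob, bet, pot_after_bet, behind):
    """EV of calling when a win also collects the money behind."""
    return win_prob * (pot_after_bet + behind) - (1 - win_prob) * bet

print(implied_required_equity(25, 125, 50))
print(implied_call_ev(0.18, 25, 125, 50))
```

In practice the opponent will not always pay off the full stack, so treat these as upper bounds on how good the call is.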

Benefits of Offering the Correct Odds

On a side note, I think it is interesting to look at what the EV # means. One way to think about the expected value of your profit at each situation is your $equity - money put in. At each point of decision when you have to decide between raise/bet/call/check/fold, you are seeking to maximize your incremental profit.

Let's go through the scenario above where you and your opponent each put in $50 before the turn, and have $25 each left. At the beginning of the turn, your $equity is $18, so your accumulated profit since the start of the hand is $18 - $50 = -$32.
  1. if you could check it down (opponent hadn't bet) == you would get avg $18 from the pot of $100. Your incremental profit in this scenario is $0. Your accumulated profit since the start of the hand is still -$32.
  2. Opponent bets and you fold. Your $equity dropped to $0 here from $18, and you also didn't put any more money in. So your incremental profit for choosing this option is -$18. Your accumulated profit is now -$32 - $18 = -$50.
  3. Opponent bets and you call $25 allin. You would get avg 18%*150 = $27 from the $150 pot. Your $equity increased from $18 to $27, but you also spent $25 calling. Your incremental profit = +$9 -$25 = -$16. Add this to your pre-calling accum profit of -$32 before to see that your post-calling accum profit is -$48.
Note that if you could, you would still much rather get option #1 than having to choose between #2 and #3. In option #3, you are choosing an action that has -EV (you lose another $16). However, choosing option #2 would have even worse consequences (-$18). All this is because your opponent had bet out at you when you had <50% in %equity. You either put in more chips being the underdog, or you fold, effectively losing your pot equity (the chance to draw out on the winner).
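
The three options can be tabulated with a few lines of arithmetic. This sketch assumes the same numbers as above (18% equity, $100 pot, $50 already invested, $25 all-in bet); decision_profits is a made-up name.

```python
def decision_profits(equity, pot, invested, bet):
    """Map each option to (incremental profit, accumulated profit)."""
    start = equity * pot - invested          # $equity minus money put in
    fold = -equity * pot                     # you give up your $equity
    # Calling grows the pot by two bets, but you pay one of them.
    call = equity * (pot + 2 * bet) - equity * pot - bet
    options = [("check", 0.0), ("fold", fold), ("call", call)]
    return {name: (incr, start + incr) for name, incr in options}

for name, (incr, accum) in decision_profits(0.18, 100, 50, 25).items():
    print(f"{name:>5}: incremental {incr:+.2f}, accumulated {accum:+.2f}")
```

The $2 gap between calling and folding is the same +$2 that the EV method gave earlier.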

This has very interesting implications for when you are playing/betting optimally with the winning hand. 

By making an optimal bet, you win exactly your opponent's $equity since they should be indifferent between calling and folding.

Quick example to show this again: let's say your opponent still has a 20% chance (one in five) to draw out on you in a $100 implied pre-money pot. Right now, your accum profit is 80 - 50 = $30. Optimally, you would lay 1:4 odds post-money, or 1/3 of the pre-money pot == $33. If we show that when you make this $33 bet, you are increasing your accum profit from $30 to $50, then we have shown a working example of the statement made above. It is obvious if opponent folds. If opponent calls, then your expected payoff is 80% * 166 = 133 and your cost is 50 + 33 = 83. Accum profit = 133 - 83 = 50.
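
The same example in code, as a sanity check. This is a sketch that assumes a heads-up pot and that "optimal" means the bet that makes the opponent's call break even; the function names are mine.

```python
def indifference_bet(pot, villain_equity):
    """Bet size that makes calling exactly break even for the opponent."""
    # Solve villain_equity * (pot + 2 * bet) = bet for bet.
    return villain_equity * pot / (1 - 2 * villain_equity)

def hero_profit(pot, invested, villain_equity, bet, villain_calls):
    """Hero's expected accumulated profit after betting."""
    if not villain_calls:
        return pot - invested
    hero_equity = 1 - villain_equity
    return hero_equity * (pot + 2 * bet) - (invested + bet)

bet = indifference_bet(100, 0.20)
print(round(bet, 2))
print(round(hero_profit(100, 50, 0.20, bet, villain_calls=True), 2))
print(round(hero_profit(100, 50, 0.20, bet, villain_calls=False), 2))
```

Both branches come out to the same profit, which is the whole pot minus what you put in: you have captured the opponent's $20 of $equity either way.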

Note that this optimal bet sizing is most important to get right when it comes to closer draws (ie. it matters less when you are a 90/10 favorite already). I claim this because with a close draw (say 60/40), the opponent still has 40% equity in the pot, so betting correctly to win that 40% equity is likely to be hugely lucrative, whereas winning the 10% equity is less so. This may be a reason for why we have more freedom to slow play with trips etc when we are already 90%+ favorites.

Thursday, June 13, 2013


Some background info: In Sept 2011, Citigroup + Bloomberg hired a partnership of two companies, Alta and PBSC/Bixi, to create a bike share program for NYC to be delivered in July 2012. PBSC/Bixi is a private non-profit company started by the city of Montreal to create their bike share program back in 2007. After Montreal's success, PBSC/Bixi started expanding internationally, with the aid of Alta, a US company. In the US, Boston and DC are two examples of successful Alta-PBSC/Bixi systems that went off without a hitch. In all those cases (Montreal, Boston, DC), PBSC/Bixi hired a company called 8D Technologies to develop the hardware (terminals, docks, etc) and most importantly, software to manage the thousands of bikes in a typical large city bike share system.

However, in a surprise move, PBSC/Bixi fired 8D Technologies in 2012 and decided to create its own software from scratch. This is the main reason why the NYC program was delayed for a year, despite what they may claim about the effects of Hurricane Sandy. 

It seems that Mayor Bloomberg knew it was the software that caused the delay. In July 2012, Bloomberg said of Citibike: "its software isn't working yet. And just rest assured we're not going to put out any program here that doesn't work." Well, with an estimated 10% of Citibike docks failing every day (according to WNYC), it looks like they did.

Although NYC officials knew that software was the cause of the delay, they didn't know that the true reason was a complete bait and switch of the promised product: "We thought that there would be a substantial transfer of the Washington/Boston software capabilities, not a total rewrite, which is why we thought a July [2012] launch was feasible. But it turned out it's not just a software upgrade." - Gotham Gazette

Not only was NYC left out of the loop, but it seems that their partner, Alta, was unaware as well. "In January [2013] , records show, Alta filed a lawsuit in an Oregon circuit court against Public Bike System Company [PBSC], saying it delivered 'nonconforming goods and faulty goods' to New York's bike-share program. Alta said this week that the suit was never served and that the groups remained partners." - NYT

The main question is, why was 8D Technologies fired? As in all things business, the answer comes down to money. 8D has sued PBSC/Bixi for $26 million and in response, PBSC/Bixi has countersued for $2.5 million, claiming that 8D overbilled for their products. However, if 8D truly overbilled, why was PBSC/Bixi okay with using 8D for Montreal, Boston and DC? You would think that buying another license for software you've already used several times would be much cheaper than creating your own code from scratch.

It gets even more interesting when one looks closer at PBSC/Bixi. Although it's a private company, it is implicitly backed by the City of Montreal. In 2012, Bixi "was on the brink of collapse and the City of Montreal provided a $37 million loan and guaranteed $71 million in credit. At the same time, the city auditor told BIXI to sell off its international programs since a Quebec municipality cannot participate in commercial activities." Apparently, PBSC/Bixi has had financial issues for years. thetransitwire

I think I know what's going on.

PBSC/Bixi is bleeding money and thought it could save costs by cutting out 8D. They've been working together for so long, PBSC/Bixi figured it could reverse-engineer 8D's software from their previous projects together. This raises major issues concerning intellectual property, as 8D stated in their lawsuit.

In order to meet previously promised deadlines, the new software was delivered before it was a fully functioning product, even with its additional one year of development. Amazingly, it has only been previously launched in one other city, Chattanooga, Tennessee (which also suffered delays), which is hardly comparable in scope to NYC's program, which is the largest in the US. Chattanooga seems to have suffered the bait and switch as well: "In Tennessee, the break between 8D and PBSC caught officials off-guard. 'That was not clear to us when the initial contract was awarded'".

Next month, Chicago will get a very similar version of our bike share program. Unsurprisingly, it has been delayed by two weeks. San Francisco has signed on with Alta-PBSC/Bixi as well.

TL;DR: an essentially bankrupt Canadian semi-public company promised NYC one product, but instead delivered an inferior beta version of an untested clone. Furthermore, it's unclear whether they will commit to fixing anything, since PBSC/Bixi's international operations are being sold off and Alta is being sued for unpaid wages by its employees. The only upside is that, unique among implementations of bike sharing programs, the costs of ours have been completely privately funded by Citigroup.

Saturday, May 18, 2013

Random Thoughts: Desktop Efficiency for Linux

Also see my post on windows desktop efficiency.

I just spent some time making my laptop dual boot ubuntu and windows. While I have played with ubuntu before, I never took the time to customize it for efficiency. So here are my thoughts.

  1. windows shortcuts still work
    • win + 1,2,3 etc still calls your applications on the task bar
    • I added win + d to show desktop
  2. new shortcuts/desktop management tools
    • a bunch of ctrl+alt shortcuts for manipulating workspace
    • this takes the place of stuff like win + arrow key: presumably you rarely need to tile windows because you can put stuff into different workspaces. There are also windows tiling managers but I haven't looked into that yet. (CompizConfig has a plugin called grid that does this)
  3. for autohotkey equivalents
    • xmodmap for the basic mappings. I'm thinking that this is actually a lot better than my current windows solution- more below.
    • autokey is pretty well developed- you can apparently call python scripts from it
My new ubuntu key mappings map mode_switch to tab (ie. altgr without the alt). This is a whole new key modifier (ie. imagine, in addition to ctrl, alt, win, you also have mode_switch). When I was in windows, I had some problems with overloading the functions of the ctrl key. (For example, I wanted ctrl+l to be right arrow, but then I also wanted ctrl+right to move one word right.) My solution then was to differentiate between right ctrl and left ctrl: Rctrl + j -> down, Lctrl + j -> Ctrl + down.

Now, I have my own modifier separated from normal Ctrl functions. This allows you to keep stuff like ctrl + h in your browser to be history (that was a problem before).

Had some installation difficulties along the way, but all-in-all seems quite manageable.

Friday, April 26, 2013

Computer Security

Some thoughts about potential options to secure your computer

  1. set up 2 step authorization (eg: google can send a pin to your phone, or banks have a pin generator for a second password)
  2. isolating threats. ie: when paying with a credit card, you could
    1. use a VM
    2. use Tor
    3. dual boot to a separate OS
    4. set up a computer and remote access into it
  3. software: firewall, regular virus scans, regular spyware scans
  4. monitors: monitor network traffic here
  5. reformat your actual computer periodically

Monday, April 22, 2013

Deflation and Debt

Prem Watsa has the reputation of being Canada's Warren Buffett. Unlike Buffett, however, he has made a pretty significant macro call on deflation:

Despite the fact that central banks from all around the world are explicitly trying to create inflation, Prem believes that the forces of de-leveraging are too strong for the money printers to counteract. (Read anything written by Van Hoisington if you want more detail on the deflation trade). As such, Fairfax has entered into large, long-term, CPI-linked derivatives that benefit from deflation. In addition, given that de-leveraging and deflation would likely have a strongly negative effect on the value of financial assets such as stocks, Fairfax has hedged 100% of its equity portfolio.

This is an incredibly pessimistic view of the Fed's monetary powers. He cites Japan's multi-decade deflation as an example of what he envisions for the future:

- In Japan, after the bubble crashed, it took 5 years to actually see deflation
- They then saw cumulative deflation for the next 17 years
- It takes time for people to understand that they actually have to de-lever and that there is no other option
- Prem's view is that there is a possibility of deflation in the US
- Since 2008 we have had a ton of stimulus and Fed monetary
- In spite of that the economy is weak and there is no inflation in sight

I'm not sure if he's right. Although the Federal Reserve may not be able to control long term real variables such as growth and unemployment (as they are currently attempting to do with QE3), I do believe that they have absolute power to manipulate nominal variables. Put another way, Bernanke can't force you to spend your dollar, but he does have control over how many dollars are out there and thus the relative value of your dollar.

The idea that too much debt leads to deflation is creatively called debt deflation. The process is simple: the higher your debt, the more likely you are to use your money to pay it off (rather than spend or save your money). You may even begin to sell off your assets to pay off this debt if your income is insufficient (a process known as deleveraging). Systematic selling leads to a decline in prices, as simple supply and demand analysis would have it (more sellers than buyers means lower prices). However, as prices go lower, you receive less for your asset sales, making it harder to pay off your debt, precipitating even more asset sales.

This is analogous to the paradox of thrift you learn in Econ 101 (where if everyone simultaneously tries to save more money, the aggregate level of savings will decline): if everyone tries to pay down debt, the aggregate debt level actually increases. This is why economics is split into micro and macro universes: individual behavior looks very different when aggregated.

Are we seeing this feedback loop between deleveraging and deflation right now in the US?

Inflation is currently running around 2%, right around the Fed's target. What about deleveraging? McKinsey Global Institute (the research arm of the management consultancy) had an excellent report last year which showed that the US private sector is ahead of all other developed countries in its progress of paying down debt. In fact, there are some signs of releveraging (taking on more debt, or the opposite of deleveraging) in the US, which is a sign of increased confidence, as corporations have started issuing debt to take advantage of record low interest rates. Home prices are starting to increase, which makes households richer, and more confident about taking on debt (such as mortgages, which is money borrowed against the value of your home).

The Fed has effectively broken the link between deleveraging and deflation. If anything, we may have to worry about over-inflation (I hesitate to say hyper-inflation), as massive amounts of reserves enter the system (but that's a topic for another blog post).

On a different note, once value guys (like Prem Watsa) start making macro calls, maybe that's when macro guys should start making bottom-up security recommendations. In that spirit, here is a list of stocks I like.
  • Consumer Discretionary (2 securities)
  • Consumer Staples (3 securities)
  • Energy (2 securities)
  • Financials (1 security)
  • Health Care (1 security)
  • Industrials (4 securities)
    • L-3 COMM HLDGS
  • Materials (3 securities)
Disclaimer: this is not to be taken as any form of investment recommendation.

Andy Zhang

Thursday, March 28, 2013

Government for Sale

1,820 years ago, on this day, March 28th, there was a change in political leadership in the Roman Empire. Normally, this wouldn't be worth commenting on, but what was special about this political transition was how the new emperor was chosen: by auction.

After the murder of Pertinax (28 March 193), the Praetorian assassins announced that the throne was to be sold to the man who would pay the highest price. Titus Flavius Claudius Sulpicianus, prefect of the city, father-in-law of the murdered emperor, being at that moment in the camp to which he had been sent to calm the troops, began making offers, when Julianus, having been roused from a banquet by his wife and daughter, arrived in all haste, and being unable to gain admission, stood before the gate, and with a loud voice competed for the prize.
As the bidding went on, the soldiers reported to each of the two competitors, the one within the fortifications, the other outside the rampart, the sum offered by his rival. Eventually Sulpicianus promised 20,000 sesterces to every soldier; Julianus, fearing that Sulpicianus would gain the throne, then offered 25,000. The guards immediately closed with the offer of Julianus, threw open the gates, saluted him by the name of Caesar, and proclaimed him emperor.

Interesting system. Instead of millions of dollars being wasted on ad campaigns, give it to the people directly in exchange for political power. Let's LBO the government! (I wonder how the valuation models would work?)

Monday, February 18, 2013

Fun ways to turn down recruiters

A coworker of mine got an email with a perl script the other day from a recruiter.  The script didn't work, but the code implied that the recruiter wanted to tell him that his salary would jump 100k with his help.  My coworker sent him the following in his response:
perl -e "print (join '', (map chr, (78, 79, 33))) while 1;"
Another valid response that I composed was this:
printf '\x4e\x4f' | xargs yes 
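
In case the joke isn't obvious, both replies just spell out a short word from character codes and then repeat it forever. Here is a small Python sketch that decodes the two payloads once, without the infinite loop:

```python
# The perl reply builds its string from decimal character codes.
perl_reply = "".join(chr(code) for code in (78, 79, 33))

# The printf reply uses the hex escapes \x4e and \x4f.
shell_reply = bytes.fromhex("4e4f").decode("ascii")

print(perl_reply)
print(shell_reply)
```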
How do you turn down recruiters that you don't like?

Monday, February 4, 2013

Debt is Good

Let us construct a very simple economy with only two individuals. Let's call this world Simpletonia. These individuals, or Simpletons, will be known as Gordon and Peter.

In this world, there is only one type of resource (imaginatively named Resource) which is non-perishable and has zero storage costs. It is necessary to consume 1 unit of Resource a day in order to survive. Resource is located in one of two types of Deposits on this world - one low-yield and one high-yield - from which the Resource must be gathered before it can be consumed.

The first type of Deposit yields 2 Resources at the end of a day of continuous gathering. The second Deposit yields 1000 Resources as a lump sum at the end of a year of continuous gathering. Needless to say, each Simpleton can only gather from one deposit at a time.

It is clear that the second Deposit offers a higher return than the first (1000 Resource / 365 days = 2.7 Resource/day > 2 Resource/day). Thus, each Resource-maximizing Simpleton should gather from the high-yield Deposit as soon as possible. In a pre-debt economy, this cannot happen until each Simpleton has stockpiled enough Resource by gathering from the low-yield Deposit for a whole year. In a debt economy, Peter can immediately start gathering from the high-yield Deposit while Gordon gathers from the low-yield Deposit. This happens because instead of saving his extra unconsumed Resource in storage (where it will be a zero-yield asset), Gordon can invest in Peter by lending him 1 Resource a day.
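
To make the comparison concrete, here is a toy bookkeeping sketch of the first year in Simpletonia. It assumes a zero-interest loan of 1 Resource a day and a 365-day year; the function and field names are my own.

```python
DAYS = 365
LOW_YIELD_PER_DAY = 2
HIGH_YIELD_PER_YEAR = 1000
CONSUMPTION_PER_DAY = 1

def year_one(with_debt):
    """Return each Simpleton's stockpile and net lending after one year."""
    eaten = CONSUMPTION_PER_DAY * DAYS
    low_output = LOW_YIELD_PER_DAY * DAYS
    if with_debt:
        loan = eaten  # Gordon lends Peter exactly what Peter eats
        return {
            "Gordon": {"stock": low_output - eaten - loan, "lent": loan},
            "Peter": {"stock": HIGH_YIELD_PER_YEAR + loan - eaten, "lent": -loan},
        }
    # Without debt, both must gather from the low-yield Deposit.
    return {
        "Gordon": {"stock": low_output - eaten, "lent": 0},
        "Peter": {"stock": low_output - eaten, "lent": 0},
    }

for label, with_debt in [("pre-debt", False), ("debt", True)]:
    result = year_one(with_debt)
    total_stock = sum(person["stock"] for person in result.values())
    net_debt = sum(person["lent"] for person in result.values())
    print(label, result, "total stock:", total_stock, "net debt:", net_debt)
```

The total stockpile is larger in the debt economy, and the "lent" column always sums to zero because one Simpleton's debt is the other's asset.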

Would you rather be indebted or debt-free? The debtor (in this example, Peter) isn't always worse off than the creditor (Gordon). Peter can probably default with little or no immediate consequence since he has no tangible assets for Gordon to seize. However, this will be sub-optimal for Peter in the long term, as Gordon will be very unlikely to lend to Peter again, and thus they will revert to the pre-debt economy in which access to the high-yield Deposit is delayed by a year.

In fact, this is very similar to the famous Iterated Prisoner's Dilemma from game theory. Just as in that game, tit-for-tat is the close-to-optimal strategy (if you cooperate -> I will cooperate in the next round, if you don't cooperate -> I will not cooperate in the next round). Thus, reputation is very important in a debt economy, which is why individuals have credit scores, corporations have credit ratings, and Argentina hasn't had access to international credit markets since they defaulted in 2002.

This is a highly simplified model, but we can see that the debt economy will do vastly better than the pre-debt economy: they have access to the high-yield Deposit, and thus more Resource, an entire year earlier. A few other things are immediately evident: debt as a whole nets to zero. How much everyone owes cannot be more than how much everyone is owed. One Simpleton's debt is another's asset. Gordon's surplus is Peter's deficit.

Even though debt always nets to zero, economic wealth isn't a zero-sum game. It is clear that Gordon and Peter will both be richer from this arrangement. Debt, in and of itself, isn't bad, because there necessarily always is someone in debt.

Finally, why is this relevant? If we rename Gordon "Government" and Peter "Private Sector" and if Gordon's surplus is Peter's deficit (as shown above), then Government surplus = Private Sector deficit. Similarly, Government deficit = Private Sector surplus. Thus, if the private sector as a whole wants to pay down its debts through generating surpluses (as it has been trying to since the 2008 financial crisis), the only way for this to happen is for the government to incur deficits. In fact, some simpletons even believe that this is justification for the government to actively pursue a policy of large deficits in order to aid in private sector deleveraging.

Saturday, February 2, 2013

Passwords and bcrypt

With the recent news of Twitter losing up to a quarter of a million passwords and The New York Times' computer systems being compromised, now sounds like a good time to discuss some computer security and why you might not have to panic about all those accounts for which you've used the same password (thanks to bcrypt!).  This post focuses on what happens once someone steals your password file and not on how to protect that file, which is a whole different story.

Firstly, what does it mean when someone breaks into a computer system and steals your password?  In the Twitter case, it means that someone was somehow able to grab a salted and hashed password.  I'll explain what that means in a bit, but what the attacker gets is a list of passwords that look roughly like this:


Although you would never know it from looking at it, this is what your password actually looks like in a website's databases.  Now, that looks nothing like your password that you type into login pages, and that is because they are salted and hashed.  Hashing is a fancy way of saying that a password has been transformed in such a way that it is basically impossible to know the original text from the ciphertext (result of the transformation). For instance, using the md5 hashing algorithm, we can turn the input 'coolcat' into "f7f70fefd4bed17191df6ec8bc24c63d" (try it out with this online md5 generator).  Notice that the hash for 'coolcat2' is "0054cc598c64e0780ddcc5ed7798c1c6", which is radically different, even though they're off by just one character.  This is what hash algorithms are like: give me something and I'll give you random gibberish.  Note: I use hashing function and hashing algorithm interchangeably.
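To see this "random gibberish" behavior for yourself without an online generator, here is a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` is just an illustrative choice, not something from the post.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


first = md5_hex("coolcat")
second = md5_hex("coolcat2")

print(first)
print(second)

# Count how many of the 32 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(first, second))
print(f"{differing} of {len(first)} hex characters differ")
```

Hashing the same input twice always gives the same digest, while adding a single character changes most of the output.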

Cryptographic hashing functions, which are just hashing functions that are used to deal with stuff that needs to stay secret, need to satisfy some basic rules (ripped from Wikipedia):

  1. it is easy to compute the hash value for any given message
  2. it is infeasible to generate a message that has a given hash
  3. it is infeasible to modify a message without changing the hash
  4. it is infeasible to find two different messages with the same hash
When taken in the context of passwords, these four rules make a lot of sense.  The first means that it should be easy to take a password like 'coolcat' and get the hashed result, because that's how your password is checked.  When you type 'coolcat' into that password box to log in to Twitter, a hashing algorithm takes it and makes it into the hashed result, and if that result matches what they've stored when you signed up, then they know the password you typed is correct, because of rules #3 and #4. These two rules basically mean that you can't come up with a different password that somehow also hashes to the same thing as your password, because that would be what's called a collision (if this did not hold, it could mean that 'coolcat' and 'dumbcat' could both be used to grant access to your account, allowing random racist tweets!).  Number 2 is a little more subtle, but it means that given a hash you can't go from that hash to the original password (it is nearly impossible to find the inverse of the hash function).
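The login check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `stored_hashes` dictionary stands in for a website's database, the user name `alice` is made up, and md5 is used only because it is the running example here, not because any real site should use it.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Hash a password with md5 (the post's running example, not a recommendation)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical "database": only the hash is stored at sign-up, never the password.
stored_hashes = {"alice": hash_password("coolcat")}


def check_login(username: str, attempt: str) -> bool:
    """Hash the attempt and compare it with the stored hash for that user."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares in constant time to avoid leaking timing information.
    return hmac.compare_digest(stored, hash_password(attempt))


print(check_login("alice", "coolcat"))
print(check_login("alice", "dumbcat"))
```

The site never needs to reverse the hash. It only ever repeats the forward computation and compares results.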

The tl;dr is that when you hash a password, it becomes impossible to know what the original password was...unless you guess a password, hash it, and get the same hash.  Going back to our Twitter example, this means that the hashed passwords the attackers got are useless until they crack them.

To figure out the original passwords, what the attackers now need to do is to try different combinations of letters, numbers and symbols until they find a hash that matches one they stole - when that happens, they know your password.  Do this enough and eventually you'll have all 250k passwords, but how long does that take?  Given md5, assuming that all passwords are alphanumeric and 8 characters in length (2,821,109,907,456 - I had to write it out, for effect) then it takes...5.64 seconds.  This is based on slightly out of date estimates by Coda Hale on his blog (whose post was an inspiration for this one) that for $300/hr you can rent a CPU to do 500 billion passwords per second.
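The arithmetic above, plus a toy version of the guessing attack, can be sketched as follows. The 36-character alphabet (26 letters plus 10 digits, ignoring case) is an assumption inferred from the 2,821,109,907,456 figure, and the three-letter password `cat` is a made-up target kept short so the loop finishes instantly.

```python
import hashlib
import itertools
import string

# Keyspace for 8-character alphanumeric passwords (36 symbols, case ignored).
keyspace = 36 ** 8
guesses_per_second = 500e9  # the 500 billion/second estimate quoted above
print(keyspace, "candidates,", round(keyspace / guesses_per_second, 2), "seconds")


def crack(target_hash: str, alphabet: str, max_len: int):
    """Try every string up to max_len characters until one hashes to target_hash."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            guess = "".join(combo)
            if hashlib.md5(guess.encode("utf-8")).hexdigest() == target_hash:
                return guess
    return None


# Pretend this md5 hash was stolen; the attacker does not know it came from "cat".
stolen = hashlib.md5(b"cat").hexdigest()
print(crack(stolen, string.ascii_lowercase, 3))
```

The attacker never inverts the hash. The attack is nothing more than running the forward computation a very large number of times.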

So what's the point of hashing (and this article) if it only takes 5.64 seconds to crack 2.8 trillion potential passwords?  I know I've been rambling a bit, but stick with me because shit is going to get real.

Passwords were originally hashed using algorithms like md5, or SHA-256, and that was fine, because computers were really slow, so it would take a thousand years to crack any meaningful number of passwords.  As we've been able to cram more power onto tiny silicon chips, we've also made our hashing algorithms obsolete.  This is where bcrypt comes in.

Whereas md5 or other cryptographic hash functions are designed to consume as much data as quickly as possible (md5 is often used to quickly verify the integrity of large files, such as videos), bcrypt was designed to do this very slowly!  It was designed to be intentionally slow and intentionally difficult to parallelize to compensate for ever increasing hardware speeds, and was designed to protect your password even when the hashed password was stolen.

To learn more about this, let's look at the hashed example I provided earlier, which is actually what a bcrypt hashed password looks like:

$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa

Here's an explanation of the password from Stack Overflow:
  • 2a identifies the bcrypt algorithm version that was used.
  • 10 is the cost factor; 2^10 iterations of the key derivation function are used (which is not enough, by the way. I'd recommend a cost of 12 or more.)
  • vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa is the salt and the cipher text, concatenated and encoded in a modified Base-64. The first 22 characters decode to a 16-byte value for the salt. The remaining characters are cipher text to be compared for authentication.
  • $ are used as delimiters for the header section of the hash.
Let's forget about the version and the salt for now, though I've been putting off the salt for a while.  The real thing we need to focus on here is the cost factor.  bcrypt takes two inputs to generate a hashed password: the password itself, and a cost factor.  The cost factor, as stated above, determines how expensive it is to compute the bcrypt hash.  By adding one to the cost factor, you increase the cost of bcrypt by a factor of two.  This is huge, because by adding something like 4, you've now made it cost 16 times more to calculate a password.  Of course, if you're looking to log in to a website where you know a password, waiting 1/2 a second for your password to be hashed is ok, but if you're computing trillions of passwords, getting all trillion at 2 passwords per second is...impossible.
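As a rough sketch of the breakdown and the cost arithmetic above, the snippet below rebuilds the example hash from the pieces listed in the explanation (version `2a`, cost `10`, and the 53-character salt-plus-cipher-text string) and splits it back apart. The function name `parse_bcrypt` is hypothetical, and the snippet only does string handling. It does not compute a real bcrypt hash.

```python
# Pieces taken from the Stack Overflow explanation quoted above.
VERSION = "2a"
COST = "10"
PAYLOAD = "vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa"

example_hash = "$" + "$".join([VERSION, COST, PAYLOAD])


def parse_bcrypt(hash_string: str) -> dict:
    """Split a bcrypt-style hash string into version, cost, salt and cipher text."""
    _, version, cost, payload = hash_string.split("$")
    return {
        "version": version,
        "cost": int(cost),
        "salt": payload[:22],         # first 22 characters encode the 16-byte salt
        "cipher_text": payload[22:],  # the rest is compared during authentication
    }


fields = parse_bcrypt(example_hash)
print(fields)

# Each +1 on the cost factor doubles the work: cost 14 is 2**4 = 16x cost 10.
print(2 ** fields["cost"], "iterations at cost", fields["cost"])
print(2 ** 14 // 2 ** 10, "times more work at cost 14")

# At 2 guesses per second, exhausting 36**8 candidates takes tens of thousands of years.
years = 36 ** 8 / 2 / (365 * 24 * 3600)
print(round(years), "years")
```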

So bcrypt is intentionally slow, but there's other really cool stuff about it.  Even though it's slow, it only uses operations (xor and other bit operations) that computers are already really good at.  This makes it really difficult to design a specific circuit that is even better at cracking bcrypt passwords than modern CPUs (this attack has been done for other hashing algorithms, notably DES which used bitslicing, a relatively expensive operation for CPUs).  Another really interesting thing about bcrypt is that it's very hard to parallelize.  Each step of the hashing mixes its inputs with the output of the step before it, so the input to every step depends on the previous step.  In simpler terms, it's impossible to parallelize something where the input to A depends on the result of B, which in turn depends on the result of C (and so forth, 2^10 times).  This is important because much of the speedup in processing power and computation in the past few years has been because of advances in hardware and algorithms that allow us to better parallelize computation.
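The dependency chain can be illustrated with a toy loop. To be clear, this is not bcrypt: it swaps in repeated SHA-256 from Python's standard library purely to show the shape of the computation, where round N cannot start until round N-1 has finished. The function name `chained_hash` and the sample salt are made up for the illustration.

```python
import hashlib


def chained_hash(password: bytes, salt: bytes, cost: int) -> bytes:
    """Toy stand-in for an iterated hash: 2**cost rounds, each fed by the last."""
    state = hashlib.sha256(salt + password).digest()
    for _ in range(2 ** cost):
        # The next state needs the previous state, so the rounds for a single
        # password guess cannot be handed out to separate workers.
        state = hashlib.sha256(state + password).digest()
    return state


digest_10 = chained_hash(b"coolcat", b"example-salt", 10)
digest_11 = chained_hash(b"coolcat", b"example-salt", 11)
print(digest_10.hex())
print(digest_11.hex())
```

Raising `cost` by one doubles the number of rounds in the loop, which mirrors the doubling described for bcrypt's cost factor.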

So, now we know that bcrypt is really cool, and hopefully if Twitter used bcrypt with a high enough cost factor, your password may be safe (if it was a good password with enough letters, numbers, symbols, blah blah).  The original paper is really cool, and if you're technically inclined, you should really check it out here.

One last thing (extra credit I suppose) is the salt.  I introduced salts in the beginning and they came up in the explanation of bcrypt.  A salt addresses two problems: people don't pick unique passwords, and, given enough time, an attacker can pre-compute the hashes of a vast number of passwords.  Think about it this way: 'coolcat' will always hash to the same value when given the same inputs, so once 'coolcat' has been cracked as a password, you know that password pretty much forever.  However, if we add some random element to your password 'coolcat', such as the number 1203 and then store that in plain text (not hashed), then we've added another level of difficulty to cracking your password (note: the random salt is added in by the program, not by the user).  The attacker now needs to compute your password with the addition of '1203', but where it really hurts is when every user has a different random number attached to their password.  Cracking 'coolcat1203' is completely separate from cracking 'coolcat285'.  Now we can no longer pre-compute passwords (given a large enough random salt) because there are simply too many possible salts to use with your password.  When someone gets your hashed, salted password, they need to crack that 'coolcat' password just for you.  bcrypt uses a 128 bit salt, which means that the salt can be any number between 0 and 2^128 - 1.
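Here is a minimal sketch of per-user salting. It uses PBKDF2 from Python's standard library as a stand-in for bcrypt (bcrypt itself is not in the standard library), with a 16-byte (128-bit) random salt to match the size mentioned above. The iteration count and the function names are assumptions for illustration only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation from the post


def hash_with_salt(password: str, salt: bytes = None):
    """Return (salt, digest); a fresh random 128-bit salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare with the stored digest."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users who both picked 'coolcat' end up with different stored hashes.
salt_a, hash_a = hash_with_salt("coolcat")
salt_b, hash_b = hash_with_salt("coolcat")
print(hash_a.hex())
print(hash_b.hex())
print(verify("coolcat", salt_a, hash_a), verify("dumbcat", salt_a, hash_a))
```

The salt is stored in the clear next to the digest. Its job is not to be secret but to make every user's hash a separate cracking problem.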

Thursday, January 24, 2013

Macro Themes for 2013 (part I)

  1. Man vs. Machine
    • Productivity gains will outstrip the pace of consumption – technology such as automation, robotics and 3D printing will destroy jobs faster than it creates jobs
    • Up to 50 million current jobs could be automated in the future (and thus destroyed)
      • How many jobs will be created through this automation? The answer is however many AI programmers you need (probably far less than 50 million)
      • Firms' labor demand will be for a few skilled workers rather than many unskilled workers
    • Hiring - firms are spending on capex instead of hiring; 75% of US manufacturing firms already employ <20 workers
  2. US Manufacturing Renaissance
    • Localization, or Anti-Globalization
      • Production is being re-shored to be closer to the huge US consumer market and take advantage of local logistics
    • EMs (emerging markets) are becoming less competitive
      • EM currencies are appreciating such as CNY (Chinese Yuan)
      • China could enter the "middle income trap"
    • Overseas transportation costs are too high due to energy prices, incentivizing firms to repatriate
    • Underinvestment and low capex spending in US means pent-up demand
    • Theme 1 - automation and robotics override low labor costs
  3. US Energy Boom
    • EIA forecasts
      • US will become energy independent by 2020
      • Largest natural gas producer by 2015, surpassing Russia
      • Oil output poised to surpass Saudi Arabia’s by 2019
      • Consumption - 87% will be from domestic sources of energy by 2020, up from 79% today 
      • Imports - 13% of consumption by 2020, will be primarily supplied by Canada & Mexico, increasing from 36% of imports today to 62% by 2020.
    • Competitiveness
      • EU suffers from expensive gas contracts with Russia
      • Latam has moved plants to US due to low natural gas and electricity prices
      • Electricity - prices are 50% cheaper in the US than in Europe
      • Roughly 30% of US electricity is generated by burning cheap domestic natural gas
  4. DM (Developed Markets) Aging Demographics
    • Population - baby boomers outnumber millennials due to decreasing fertility rates
    • Labor - baby boomers are retiring later due to recession, crowding out young from workforce
    • Gov't debt - millennials inherit high gov't debt caused by spending on entitlements towards baby boomers
    • Other - high student debt, tight credit, skills mismatch, high job turnover
    • "Peter Pan" generation - millennials reliant on parents, delay adulthood, live at home
  5. DM Big Gov't Socialism
    • Political sentiment will lean towards fairness and equality
    • Theme 1 - high unemployment and inequality will be balanced by redistribution through increased taxes and spending
    • Theme 4 - baby boomers dominate gov't and are biased towards increasing gov't healthcare, pensions, social security, etc.
  6. DM Central Bank Printing
    • Currencies - race to the bottom means depreciation
    • Inflation - will stay low due to tight lending and low velocity
    • Financial repression - captive investors ensure low rates
  7. "Peak Car"
    • Urbanization, high fuel prices, increasing youth insurance premiums
    • Car-sharing schemes - 1 rental equals 15 owned cars; e.g. 700k Zipcar members share only 9k cars
    • Theme 4 - tight credit depresses auto-ownership

Wednesday, January 23, 2013

Random Thoughts: Desktop Efficiency for Windows

Customize your desktop with these free tools to become more efficient

  1. autohotkey
  2. launchy
  3. divvy / win-split-revolution (divvy is not free)
  4. .bat files (for windows)
These programs allow you to avoid having to use the mouse/move your hand position and to be able to bring up files/folders or run commands in fewer keystrokes.

Autohotkey: map keystrokes to different keystrokes or call different programs (eg: Ctrl+Shift+N to call new word doc). One huge functionality is mapping keystrokes to navigation commands so that you don't need to move your hand away from the keyboard. eg: vi (an editor) maps H -> left, J -> down, K -> up, and L -> right when it is in navigation mode.

Now you can have that across your whole desktop.

Launchy: start programs/open folders by typing in their names from launchy (brought up by alt-space) instead of searching for them in the start menu.
Macs already have a similar program included.

Divvy / Win-split: customized tiling of windows. For example, when I am at my office, my middle screen usually looks like this: web browser taking up ~2/3 width and some space at the bottom (because of the bb launchpad widget).
Divvy / Win-split lets me automatically size the browser this way. I wish there was a way to size multiple windows at once though (eg: put 4 windows into the 4 quarters of the screen), but that's going to be really complicated (because the computer doesn't know which windows to choose after the active one).

.bat files: requires some minimal understanding of the command line. Basically, you write small scripts to do whatever you want. You can then use launchy/autohotkey to call these scripts.

Saturday, January 19, 2013

Random Thoughts: New Trends

Want to think about new forms of media made possible by technology, and whether I am fully experiencing/taking advantage of these new improvements

Partially inspired by this ted talk about new forms for comics (eg: infinite canvas).


  1. Much easier referencing + dynamic hide/show functionality --> overview type documents are much more effective and can be aesthetically pleasing
    • Imagine a spark notes study guide. Previously publisher needed to cram everything you need to know into the book, which inevitably leads to waste of words/space. eg: There is some stuff you already know so you skip over it, and the whole thing gets bulky. Now, publishers can just list concepts that you need to know. if someone doesn't know it/want to review, they can just click into it. Alternatively, you could select/hover over the text to get details,  or click a button to show/hide details. 
    • Imagine this post, where you first see the first sentence of each bullet pt (a summary of the whole paragraph) instead of the whole paragraph and can choose to see more if interested. Better than just seeing this huge unappetizing blob of text, no? This is maybe something that blogging services can improve on. 
    • Alternatively, one way this has been taken advantage of is linking. What we are doing here- introducing concepts and linking to its descriptions/examples for someone who wants to dig deeper. One potential problem however, is that if you link to someone else's stuff, they might take it down, and others won't be able to access it anymore. There should be software that allows someone to say preserve your own post and also any pages that it directly links to.
    • A similar idea is that contextual information should become more accessible. For example, if I went to a museum, maybe I want to know that there was a period when Picasso painted in blue. Then, when I see this blue painting of his when I am there, I will know, "Aha, this was during his blue period", and not this. Wouldn't it be cool if the museum could just publish some contextual information for each exhibit so the non-art-history majors can read the context beforehand? (or even on their smart phones as they are on site)
  2. Message frequency. Crowd sourcing based news sources have been around. eg: wiki and yelp. However, the ability to email/msg/tweet..etc at the current frequency is actually pretty new, which means:
    • Nowadays, most current news is often found on twitter. I want to start to build out a news network using tweetdeck (has popup notifications for customizable tweets. eg: by author, keyword etc) to supplement conventional news sources.
    • Live blogging. should make an automatic live blog as own diary
  3. Cloud computation: can do super computationally intensive things from your phone (because your phone isn't actually doing it). This is similar to having a secretary you can call who can then lookup xyz for you (in the good old days). Instead, you use anything with an internet connection to access your server, and the server does the computationally intensive things there and then sends the results back to you.

So this brings me to my final thought. This part is inspired by Cassandra.
(I want to make) one app to rule them all- aggregating news sources and combining functionality.
eg: if you read from 100 different blogs, you want an rss reader that aggregates that for you. But that is just the first level. If you read these blogs, these emails, watch these new youtube videos and browse bloomberg at this time, then you could  have something that scans for particular emails, accesses rss feed, and pull articles from bloomberg and delivers to you all at once for your morning reading. Similarly, customize your evening readings, monthly readings etc. And this is just aggregating news sources. You also want to combine different functions eg:
  1. Databasing/record keeping 
    • Note taking/comments, hash tags for easy retrieval later, also see live blogging
    • If pt (1) becomes more popular/developed, then I think there are much better options for databasing the information.
  2. Learning (from now on if x happens I want to do y as the correct response. can the computer remind me or do this for me?)
  3. Automation 
    • autofilter news sources, emails. On a sidenote, autofiltering/personalization without you knowing it is really scary.
    • autoprocessing, eg: supplying useful statistics (this is your 5th email correspondence with this contact. last time you didn't respond to his question even though you flagged it as something to followup on)
    • supply default responses- wouldn't you want to be able to be able to select between "haha", "yeah", "lolol", etc for your text responses instead of having to type that out?
However, one interesting observation is that while creating/using this one app that aggregates and combines functionality may be an "optimal" choice, it is not going to be widely adopted in real life. Something complicated/personalized is great if you can understand it, and it may get a cult following, but it will never get a widespread following. Kinda a reversed tragedy of the commons.

Wednesday, January 16, 2013


I once won an award for a website that I created with a team that focused on unusual Olympic sports (woah!).  Here are some more unusual sports:

  • Kyz Kuu - "Kyz kuu or kyz kuumai, literally "girl chasing", is a traditional sport among the Azerbaijanis, Kazakhs and Kyrgyz."
  • Buzkashi - "Buzkashi is often compared to polo. Both games are played between people on horseback, both involve propelling an object toward a goal, and both get fairly rough. However, polo is played with a ball, and buzkashi is played with a headless goat carcass."

Monday, January 14, 2013

Betting/trading strategies- Sizing

One of the mysteries facing finance professionals is how to reconcile the quantitative with the qualitative/discretionary. I actually think most gambling professionals do this quite well (eg. bet sizing on poker)- this may be because the risk/reward is much more well-defined (vs investing). I would like to propose a system to conduct position sizing that integrates the qualitative with the quantitative.

this is dependent on 3 things. 

(1) how risky the thing is on a daily basis (ie. can go up/down by $1 vs $100)
(2) how much conviction you have (ie. i would do something like, 3 conviction levels, lowest = believe can get 5% Return on Risk, mid = 10%, high = 20%)
(3) what is your intended time horizon (or alternative way to say this is take profit/stop loss level)

Then (4) plug into kelly's criterion and take 1/2 kelly as position size

So taking aapl as an example:
(1) daily range is say $15
(2) say you have high conviction (ie. you believe you can make $0.2 vs every $1 you risk, as an average of many bets with this level of conviction. this could mean you make $1.2 about 55% of the time vs lose $1 about 45% of the time (0.55*1.2 - 0.45*1 ~= 0.2), or that 60% of the time you make $1, and 40% of time you lose $1). I think these conviction levels make sense. 5% = any lower and you should definitely just put it in cash/ST bonds. 20% = anything higher and this is a once in a lifetime/decade type trade, where you really just plunge as much as possible (and sizing is to make sure you can maintain exposure in face of MTM losses)
(3) let's say your intended time horizon is 1yr. then yearly vol is $15 *sqrt(252) ~= 240. this sounds about right (eg; this yr aapl range was from 380-700) 
(4) so for every share of aapl (550), you may make +290 (240*1.2) vs lose -240. at high conviction your EV is 20% of the 240 at risk, ~= 50. kelly's = EV/win = 50/290 = 0.17, which means that you should risk 17% of your portfolio.
taking half kelly to be conservative, that is 8.5% of portfolio. which means amt of AAPL shares to buy = your total portfolio value * 8.5% / 240

so eg: on 1mil portfolio, you should buy 355 shares of aapl (195k) if you intended to hold it for 1yr+ and have high conviction on it. this is about 20% of your portfolio, which is very aggressive sizing already. for long/short equity, anything 10%+ would be considered concentrated. the reason why it is high here is because you have super high conviction assumptions.
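the steps above can be sketched in a few lines of python. this is only an illustration of the arithmetic, not a trading tool. the win probability is not stated in step (4), so the sketch backs it out from the 20% return on risk and the 1.2 payoff (an assumption), and it uses the rounded $240 yearly move from the post. because it skips the post's intermediate rounding (0.17, 8.5%), it lands at roughly 350 shares rather than 355.

```python
import math


def kelly_fraction(p_win: float, payoff_ratio: float) -> float:
    """Kelly fraction for a bet that wins payoff_ratio per 1 risked with probability p_win."""
    return p_win - (1.0 - p_win) / payoff_ratio


# (1) daily range and (3) one-year horizon
daily_range = 15.0
print(round(daily_range * math.sqrt(252)))  # about 238; the post rounds this to 240
risk_per_share = 240.0

# (2) high conviction: 20% expected return on each unit of risk
edge = 0.20
payoff_ratio = 1.2  # make 1.2 for every 1 risked, as in step (4)

# Assumption: solve p * payoff - (1 - p) = edge for the implied win probability.
p_win = (edge + 1.0) / (payoff_ratio + 1.0)

# (4) full kelly, then half kelly to be conservative
full_kelly = kelly_fraction(p_win, payoff_ratio)  # same as edge / payoff_ratio
half_kelly = full_kelly / 2.0

portfolio = 1_000_000
shares = portfolio * half_kelly / risk_per_share
print(round(p_win, 3), round(full_kelly, 3), round(half_kelly, 3), round(shares))
```

changing `edge` to 0.05 or 0.10 gives the sizing for the low and mid conviction levels under the same assumptions.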

note that
(1) we haven't looked at portfolio correlation yet, which would involve dialing down sizing if you have similar exposures.
(2) this # that we got is the MAX risk you should ever take. ie. anything more is theoretically bad for you (ie. your LT returns will be lower than if you just took less risk). so depending on how risk averse you are, you should be sizing significantly less than kelly. (eg: you could always size 1/4 kelly)
(3) can play around with the skew/kurtosis of returns to get a different sizing. In fact, all the steps above are actually asking you to describe a probability distribution of your return for this trade. (1) is asking for stdev, (2) is asking for mean, (3) is looking at how returns scale with time (is there autocorrelation?) which is also going to be related to kurtosis in this case (+ve autocorrelation = higher kurtosis compared to standard assumptions when scaled up with time) (4) is asking about the skew (are you ~55% to win $1.2 and ~45% to lose $1, or are you 60% to make $1 and 40% to lose $1)