Optimal stopping for positive expectation

Started by TwoUp, Sep 17, 2024, 03:25 PM

TwoUp

When playing an even chance game we face a predicament: should we stop, taking our wins or accepting our losses, or should we keep playing?

This is known as the "optimal stopping problem", more commonly encountered as the dating problem (the secretary problem) or the 37% rule.

The optimal stopping problem has many variations and is an ongoing area of research, but it provides precise insight into gaining an advantage in an even chance game by deciding when it is optimal to stop.


This video introduces the concept of optimal stopping in an entertaining way:

[embedded video]

Now there is a special variation of the stopping problem that is solved for a fixed number of events and partly solved for an indefinite number of events. These problems are known as the Robbins problem and the related Chow-Robbins game, in which one tosses a coin and, after each toss, decides whether to take the fraction of heads so far as the payoff or to continue.

It provides an answer to betting optimally on even chance outcomes and yields a positive expectation.
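To get a feel for where that positive expectation comes from, here is a minimal Monte Carlo sketch of the game (my own illustrative code, not taken from the literature). It uses a deliberately crude stopping rule, stop the first time heads outnumber tails, rather than the optimal rule, yet even that pushes the expected fraction of heads well above 0.5 (typically around 0.78 in a run like this).

[code]
import random

def play_once(max_tosses=10_000):
    # One Chow-Robbins game with a crude illustrative rule: stop the first
    # time heads outnumber tails, otherwise stop when the cap is reached.
    # Returns the fraction of heads at the moment we stop.
    heads = 0
    for n in range(1, max_tosses + 1):
        heads += random.randint(0, 1)      # 1 = heads, 0 = tails
        if heads > n - heads:              # first time we are ahead
            return heads / n
    return heads / max_tosses              # forced stop at the cap

def estimate_value(trials=100_000):
    # Monte Carlo estimate of the expected payoff under the crude rule.
    return sum(play_once() for _ in range(trials)) / trials

print(estimate_value())   # well above 0.5, typically around 0.78
[/code]

The optimal rule, which the table below spells out for the opening spins, does a little better still.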

A simple rule of thumb is: if you are up 8 vs 5 or better, stop; otherwise continue until you break even. But we can do much better using the decision table I provide below for the first 1000 spins.

The math shows that, when considering horizons of up to 268 million spins, the optimal stopping value for a perfect even chance game is bounded above by 0.79295350640, or 79.2953%, which is pretty spectacular for a 50/50 outcome that in simple terms has no long-term advantage.

Now practically we cannot play that many outcomes, but it is clear that deciding when to stop based on the math gives the player an advantage which is far better than 50/50.

The table below covers the opening decisions of the Chow-Robbins game, showing when to stop and when to continue. It was obtained through computational analysis of the theory and lists every decision point a player faces in the first 1000 spins, computed using a 10-million-spin horizon.
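For anyone curious about the mechanics behind such a table, here is a minimal sketch of the finite-horizon backward induction it rests on. This is my own illustrative code, not the program used for the published analysis, and it runs at a far smaller horizon than 10 million spins; since truncating at a finite horizon underestimates the value of continuing, its "continue" verdicts are trustworthy while its "stop" verdicts near close calls may flip at larger horizons.

[code]
def chow_robbins_decisions(horizon, keep_up_to=1000):
    # Backward induction for the Chow-Robbins game truncated at `horizon` tosses.
    # Returns a dict mapping (heads, tails) -> "stop" or "continue" for all states
    # within the first `keep_up_to` tosses (keeping every state would use too much
    # memory at large horizons).
    # At the horizon we are forced to stop; the payoff is the fraction of heads.
    values = [h / horizon for h in range(horizon + 1)]
    decisions = {}
    # Work backwards from the last toss to the first.
    for n in range(horizon - 1, 0, -1):
        new_values = []
        for h in range(n + 1):
            stop_value = h / n
            # The next toss is heads or tails with probability 1/2 each.
            continue_value = 0.5 * (values[h + 1] + values[h])
            new_values.append(max(stop_value, continue_value))
            if n <= keep_up_to:
                decisions[(h, n - h)] = "stop" if stop_value > continue_value else "continue"
        values = new_values
    return decisions

decisions = chow_robbins_decisions(2_000)
print(decisions[(19, 14)])   # the worked example discussed below
[/code]

The cost grows quadratically with the horizon, so reaching a 10-million-spin horizon takes a much more carefully engineered program than this sketch.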

So, for starters, let's consider a case where we have won 19 on red against 14 losses on black (19-14), the difference therefore being 5.

According to the table, with a difference of 5 stopping is best for everything up to 23 reds vs 18 blacks (23-18), so we must stop.

As can be seen in the table, the mathematical theory is complete up to a difference of 11. For a difference of 12, the status of the outcome 116 reds vs 104 blacks (116-104) is currently unknown to science, but a 12-unit win is still a win, so stopping is a prudent option whether or not it is known to be optimal.

For the outcome of 16 reds vs 12 blacks (16-12), the decision is extremely close: with a 1-million-spin horizon the math fails to determine the optimal decision, whilst a 10-million-spin horizon shows that stopping is optimal.
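That horizon sensitivity is exactly what the truncated backward induction sketched above runs into: truncation only underestimates the value of continuing, so a "continue" verdict is already conclusive, while a "stop" verdict on a knife-edge state like 16-12 can flip once the horizon grows. A quick (hypothetical) way to watch the effect, reusing chow_robbins_decisions from the sketch above:

[code]
# Reuses chow_robbins_decisions() from the sketch above. A "continue" verdict is
# conclusive even at a small horizon; a "stop" verdict on a knife-edge state like
# (16, 12) may change as the horizon increases. (Runtime grows quadratically with
# the horizon, so the millions of spins needed to settle 16-12 are out of reach
# for this plain-Python sketch.)
for horizon in (1_000, 2_000, 4_000):
    print(horizon, chow_robbins_decisions(horizon)[(16, 12)])
[/code]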


The following table was computed using a horizon of 10 million spins.

The 3 columns are: difference, stop-with (the last outcome at which stopping is optimal), and continue-with (the first outcome at which continuing is optimal).

difference  stop-with  continue-with

1 1–0 2–1
2 5–3 6–4
3 9–6 10–7
4 16–12 17–13
5 23–18 24–19
6 32–26 33–27
7 42–35 43–36
8 54–46 55–47
9 67–58 68–59
10 82–72 83–73
11 98–87 99–88
12 115–103 117–105
13 134–121 135–122
14 155–141 156–142
15 176–161 177–162
16 199–183 201–185
17 224–207 225–208
18 250–232 251–233
19 277–258 279–260
20 306–286 307–287
21 336–315 338–317
22 368–346 369–347
23 401–378 402–379
24 435–411 437–413
25 471–446 473–448
26 508–482 510–484
≥ 27 stop

The above table and analysis are from the paper "Rigorous computer analysis of the Chow-Robbins game".
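To make the table easier to use, here is a small lookup sketch that packages it into a function. The encoding and names are mine: it returns "unknown" for the gap states the table leaves open (such as 116-104), "stop" for any difference of 27 or more, and, following the rule of thumb earlier in the post, "continue" whenever you are not ahead. It is only meaningful within the range the table covers, roughly the first 1000 spins.

[code]
# Last reds count at which stopping is optimal, and first reds count at which
# continuing is optimal, for each difference reds - blacks, transcribed from the
# table above. Only meaningful within the table's range (roughly 1000 spins).
STOP_WITH = {1: 1, 2: 5, 3: 9, 4: 16, 5: 23, 6: 32, 7: 42, 8: 54, 9: 67,
             10: 82, 11: 98, 12: 115, 13: 134, 14: 155, 15: 176, 16: 199,
             17: 224, 18: 250, 19: 277, 20: 306, 21: 336, 22: 368, 23: 401,
             24: 435, 25: 471, 26: 508}
CONTINUE_WITH = {1: 2, 2: 6, 3: 10, 4: 17, 5: 24, 6: 33, 7: 43, 8: 55, 9: 68,
                 10: 83, 11: 99, 12: 117, 13: 135, 14: 156, 15: 177, 16: 201,
                 17: 225, 18: 251, 19: 279, 20: 307, 21: 338, 22: 369, 23: 402,
                 24: 437, 25: 473, 26: 510}

def decision(reds, blacks):
    # Look up the table's verdict for `reds` wins against `blacks` losses.
    diff = reds - blacks
    if diff <= 0:
        return "continue"   # not ahead yet: keep playing until at least break-even
    if diff >= 27:
        return "stop"       # the table says stop for any lead of 27 or more
    if reds <= STOP_WITH[diff]:
        return "stop"
    if reds >= CONTINUE_WITH[diff]:
        return "continue"
    return "unknown"        # a gap the table leaves open, e.g. 116-104

print(decision(19, 14))     # the worked example: difference 5, 19 <= 23 -> stop
print(decision(116, 104))   # -> unknown
[/code]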