Zach Laster University of Helsinki
Probability – The likelihood of something
happening
Statistics – Models for predicting events
based on previous occurrences
Using these, we can estimate what will
happen or how things will progress in a game
We can also use them to balance
probabilistic events, such as hit frequencies and gambling mini-games
Probabilities are not guesses
Rolling a d6 results in a 16.7%
chance of getting a 1. This is a fact.
○ Unless of course the die isn’t fair, but you get
the point.
Throwing a fair coin has a 50% chance to
land on either face.
Probabilities are facts. Things we know
to be true.
We just use them to make guesses
Independent & Related Events
An independent event happens the same
way every time regardless of how previous results went
Flip a coin again. Did the first one affect the
second?
How about a die? ;)
A related event affects the probability of later
events
Drawing a card from a deck obviously reduces
the chances of drawing that card from the deck
Standard example: I have a bag of red and blue
marbles…
Conditional Probability
Probabilities can be multiplied together
to find the chance of the events happening together
Getting two heads in a row = ½ * ½
This allows us to chain events
We can also do this for related events
Chance of drawing two Queens from a deck
(without putting the first one back)
○ = 4/52 * 3/51
This multiplicative effect on decimal (or
rational) numbers obviously results in smaller and smaller chances
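The chaining above is easy to check with exact fractions. This is only a minimal sketch: the helper name `chance_of_run` and its parameters are made up for illustration.

```python
from fractions import Fraction

def chance_of_run(favorable, total, draws, replace=False):
    """Chance that every one of `draws` picks is a favorable item."""
    chance = Fraction(1)
    for _ in range(draws):
        chance *= Fraction(favorable, total)
        if not replace:
            # A related event: each draw removes one favorable item.
            favorable -= 1
            total -= 1
    return chance

two_heads = chance_of_run(1, 2, 2, replace=True)   # independent coin flips
two_queens = chance_of_run(4, 52, 2)               # cards are not put back

print(two_heads)           # 1/4
print(two_queens)          # 1/221
print(float(two_queens))   # roughly 0.0045
```

Using `Fraction` instead of floats keeps the result exact, which makes it easier to compare against a hand calculation.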
We can improve our odds slightly by
covering a larger range
Chance of drawing a heart from a deck = 13/52
= 1/4
Chance of drawing 4 hearts in a row
○ 13/52 * 12/51 * 11/50 * 10/49 = ~0.26%
Chance of rolling something higher than a 2 on
a d6 = 4/6
In Reverse
Sometimes, calculating the chances of
something happening is tricky
In these cases we can calculate the chance it
won’t happen, and subtract that from 100%
This gives us the probability it WILL happen. Magic!
As a trivial example, what are the odds you will
roll something other than a 6? Clearly this is the same as not rolling a 6, so we can just take the odds of rolling a 6 (1/6) and subtract that from 1
1 - 1/6 = 5/6
So what are the odds of throwing a 6 in
six throws of a die?
Obviously, not 100%
This is kind of unintuitive, but search your
feelings, you know it to be true
Actually, the easiest way to figure this
out is to solve the reverse
What are the odds we won’t throw a 6 in six
throws?
The odds of not throwing a 6: 5/6
The odds of not throwing a 6 six times in
a row: 5/6 * 5/6 * 5/6 * 5/6 * 5/6 * 5/6 = 33%
The odds of throwing a six, then, are
100% - 33% = 67%
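The "work out the reverse" trick can be written as a tiny function. This is a sketch, and `at_least_once` is a name invented here, not something from the lecture.

```python
from fractions import Fraction

def at_least_once(p_single, attempts):
    """Chance an event with per-attempt chance p_single happens at least once."""
    p_never = (1 - p_single) ** attempts   # it misses every single time
    return 1 - p_never

p = at_least_once(Fraction(1, 6), 6)
print(p)          # 31031/46656
print(float(p))   # roughly 0.665
```

The same function answers questions like "what are the odds of at least one crit in ten attacks?" by changing the two arguments.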
Statistics
Statistics is a mathematical science pertaining to the
collection, analysis, interpretation, and presentation of
data. It is applicable to a wide variety of academic
disciplines, from the physical and social sciences to the humanities; it is also used for making informed decisions in all areas of business and government. – Wikipedia.org
Statistics is a mathematical science that deals with
collecting and analyzing data in order to determine past trends, forecast future results, and gain a level of confidence about stuff that we want to know more about. – Tyler Sigman
Statistics can help you shine a flashlight upon your broken
mechanics and shattered design dreams. It does this by giving you actual hard, scientific data to support meaningful design decisions. – Tyler Sigman
Statistics is a weird math science thing
that can really get confusing
However, it is more useful than it has any
right to be!
Statistics is probably something you are
more familiar with than you realize
A Population is the entire collection of
everything we want to know something about
All the people online, all the people who play a
kind of game, all the people in Finland/Helsinki
A Sample is a subset (AH! Math!) of the
population. We use this to gather data and
then make conclusions about the population at large.
We don’t perform the test on the entire
population because, seriously, you want to ask EVERYONE on the internet/in Helsinki a dozen questions?
Ideally, our sample size will be large. The
closer it is to the population size the better.
If you have a population of 10,000 and you ask
two people something, how well do you think that covered the entire population?
Of course, time and money simply don’t
allow us to poll every person ever, so we use samples
In digital games, we can actually embed the
polling into the game, so it automatically collects the data from every player! That’s actually a really amazing thing!
Distributions
Statistics has this nice tendency of
producing similar distributions
This feels like there’s a joke in here
somewhere
A distribution is basically a pattern which
statistical data follows
For instance, we tend to have a central
value which is common, and as we deviate from this value the probability of the new value drops.
The Normal Distribution
Also called the “Bell Curve” and the “Gaussian
Distribution”
Here the population is closely centered around the
mean or average value.
In addition to being focused on a mean, the
standard deviation and variance of a distribution are also worth noting
Standard deviation is basically how far off
the norm values are on average
Some things will be further out, others will be
closer
An average of 3 minutes in a level with a
standard deviation (σ) of 30 seconds is pretty good
○ If the times are roughly normal, about 68% of runs will take from 2.5 minutes to 3.5
minutes to complete the level, with a tendency towards 3 minutes.
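To make mean and standard deviation concrete, here is a small sketch using Python's `statistics` module. The completion times are made-up numbers, not data from any real game.

```python
import statistics

# Hypothetical completion times in seconds for one level (made-up data).
times = [150, 165, 170, 175, 180, 180, 185, 190, 195, 210]

mean = statistics.mean(times)
sigma = statistics.pstdev(times)   # population standard deviation

# Count the runs that land within one standard deviation of the mean.
within_one_sigma = [t for t in times if abs(t - mean) <= sigma]

print(f"mean = {mean:.0f}s, sigma = {sigma:.1f}s")
print(f"{len(within_one_sigma)} of {len(times)} runs are within one sigma")
```

`pstdev` treats the list as the whole population; `statistics.stdev` would be the usual choice if the list were only a sample of players.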
Margin of Error
If our population size is bigger than our sample
size, then we have some margin of error
How far off we might be, given we didn’t include
every element in the population
One way to express this is a confidence interval,
such as 95% certainty that something will hold true
Generally, “we can guarantee with A% confidence
that B% of the data will be between values C and D.” (Sigman Part 2)
In statistics, more data is king. Always and
forever, more data is better.
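One common way to put a number on the margin of error, assuming we are estimating a percentage from a random sample, is the normal-approximation formula below. The poll numbers are hypothetical, and the formula ignores the correction you would apply when the sample is a large share of the population.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 60% of sampled players finished the tutorial.
for n in (100, 1_000, 10_000):
    moe = margin_of_error(0.60, n)
    print(f"n = {n:>6}: 60% +/- {moe * 100:.1f} points")
```

Notice that 100 times more data only shrinks the margin by a factor of 10: more data is always better, but it gets expensive.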
No such thing as certain
I tried to explain this to a lawyer once. It
didn’t go well.
Basically, you can’t reach a point where
you’ve tested every possible thing
This is why there will always be bugs in your
code
This is why you can’t actually rule out your
neighbor being an alien
But you can be reasonably certain (like
>90%) and that’s usually good enough
Go on, live a little. Who needs to be sure?
“Stop stealing my good rolls!”
Known as “The Gambler’s Fallacy”
Humans are terrible (and I mean terrible) at
probability.
Really, we’re crap at it.
This leads to common misconceptions like “I
just rolled three 1s! Clearly the next roll won’t be a 1.”
Or “I’ve not rolled a 20 in a while. I’m due.”
No matter what has happened in the past, the
probability of rolling a 20 has not changed.
I don’t care if you haven’t rolled one tonight. Or this
week. Or even this month. It’s still 1/20.
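If you don't believe it, simulate it. The sketch below rolls a d20 many times and only looks at rolls made after a long dry spell; the 20-roll threshold and the seed are arbitrary choices for the demo.

```python
import random

rng = random.Random(2024)   # fixed seed so the run is repeatable

DRY_SPELL = 20        # "I haven't rolled a 20 in a while"
rolls_since_20 = 0
overdue_rolls = 0     # rolls made while supposedly "due" for a 20
overdue_hits = 0      # how many of those actually came up 20

for _ in range(500_000):
    roll = rng.randint(1, 20)
    if rolls_since_20 >= DRY_SPELL:
        overdue_rolls += 1
        if roll == 20:
            overdue_hits += 1
    rolls_since_20 = 0 if roll == 20 else rolls_since_20 + 1

print(f"chance of a 20 when 'due': {overdue_hits / overdue_rolls:.3f}")
print(f"chance of a 20 on any roll: {1 / 20:.3f}")
```

Both numbers should come out very close to 0.05: the die has no memory of the dry spell.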
Most gamers actually KNOW that
probability doesn’t work that way
Smarter than your average gambler, we
Despite this, they still commit it
frequently
“Dude! My dice are hot tonight!” “Man, you stole one of my 20s!”
We’re that bad at probability.
Double Rares
Related to this is shock (and consternation) when
someone gets two rares in a row (particularly, someone else)
For instance, I draw two treasure items, both of them are
rare items
Lots of players will be really happy at their new
windfall
The players around them may not be so happy, and
feel the system is broken
“It just handed out two rares at once! That’s like two 1%
chances in a row!”
Actually, if we think about this, obviously this
SHOULD happen
1% * 1% = 0.01%, which isn’t likely, but it IS possible.
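A 0.01% chance sounds like "never", but it adds up fast across a whole player base. The sketch below assumes the two drops are independent; the player counts are hypothetical.

```python
def chance_someone_sees_it(p_event, opportunities):
    """Chance a rare event shows up at least once across many tries."""
    return 1 - (1 - p_event) ** opportunities

p_double_rare = 0.01 * 0.01   # two independent 1% drops in a row

# Hypothetical numbers of drop pairs opened across all players.
for pairs_opened in (1, 1_000, 10_000, 50_000):
    p = chance_someone_sees_it(p_double_rare, pairs_opened)
    print(f"{pairs_opened:>6} pairs: {p:.2%} chance of at least one double rare")
```

With enough players, somebody getting a double rare is close to guaranteed, which is exactly why it gets noticed.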
The Anti-law of Averages
The next standard error is the belief that
the results of a series of rolls will average out.
If we think about this one, it’s silly, but it still
comes up
If we flip a coin 10 times and get an
uneven split (which is actually kind of likely), it’s not reasonable to believe that throwing the coin 10 more times will make the numbers balance out.
This is because probability is a percentage
Say we got an 8:2 split
If we throw it 10 more times, we could
actually get the same split!
Statistically speaking, if we throw the coin a
million times, we’d expect the split to be about 50%
However, the actual number of heads vs tails
could be off by a huge amount.
In the long run, the difference in heads and tails
flips will probably actually grow, not shrink
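Here is a quick simulation of that idea, as a sketch with an arbitrary seed. The percentage of heads drifts toward 50%, while the raw gap between heads and tails is free to keep growing (its typical size scales with the square root of the number of flips).

```python
import random

rng = random.Random(7)   # fixed seed so the run is repeatable

heads = 0
flips = 0
for checkpoint in (10, 1_000, 100_000, 1_000_000):
    # Keep flipping the same coin until we reach the next checkpoint.
    while flips < checkpoint:
        heads += rng.random() < 0.5   # True counts as 1
        flips += 1
    tails = flips - heads
    print(f"{flips:>9} flips: {heads / flips:.2%} heads, "
          f"|heads - tails| = {abs(heads - tails)}")
```

In any single run the gap can shrink between checkpoints, but nothing pulls it back to zero.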
Selection Bias
This one actually comes from the fact
that humans aren’t really good at recall, either
CogSci will actually tell us that we forget
things because we couldn’t function
otherwise
○ There are people who don’t, and they can’t
We actually forget bad things by design
○ So we don’t live in perpetual fear of door jambs
So what does this mean for our perspective on
probability?
Well, clearly good things happen more often
○ At least, so our memory would tell us
So do rare events, oddly enough
That’s a fun one
We know the event was rare, so it sticks in our mind
when it happens
The fact that it sticks so well (makes such an
impression) actually makes us feel like it happens more often
This is why people are afraid of flying and not
driving, even though you are much more likely to die in a car accident
This also means we think we are cooler/better/more
skilled than we actually are
While, yes, some of us have a hard time letting go of that
one time in 5th grade where we did that stupid thing in
front of everyone, overall we tend to remember the good things better
This means we remember winning, particularly winning
epically, more than we do losing
Players therefore think they are better than they are
This can result in them overestimating their abilities and
getting into situations (higher tier games) they can’t win
We can fix this for example by auto-matchmaking or
dynamic difficulty
Self-Serving Bias
Players sometimes feel that they aren’t
winning as often as they think they should, given their odds.
They will also feel an unlikely event is
much more likely than it is overall (which ties back to our bad memories)
Thus they will complain about losing 25% of the
time when they have a 75% chance to win
But they will be fine with winning 25% when they
only have a 25% chance to win.
Attribution Bias
This is in some ways humans being kind of
immature, because we are
If the game rewards the player randomly,
then it was something they did
“Man, I made the right decision going this way.”
If it penalizes them randomly, it’s the
game’s fault
“This game is so not fair!” Or worse, another player’s fault!
Anchoring
Basically, getting stuck on the first number
Big numbers mean big chances, right?
Slot machines with big jackpots
Chances presented in larger scales (2:1 vs 20:10)
Basically, people get caught up in the
number sounding or feeling big, and don’t realize it’s actually fairly small or the same
A player with a small base number and lots
of bonuses may also misjudge the full
import of those bonuses, thinking they are weaker than they are.
The Hot-Hand Fallacy
This is the idea that someone on a streak
is more likely to score/strike/win again
It largely originates from basketball
Players would do well, and we’d term them as
being “on fire”
NBA Jam actually made this a mechanic, giving
a player on fire bonuses to speed and such
○ I loved this game, just for this mechanic. It made
you feel pretty awesome. It’s completely wrong, though
Probability theorists laughed it off and
figured this was silliness
Basketball fans pointed to some things like
morale and flow states
Probability theorists sat down with some
statistics to figure out if it held
And everyone was wrong
Turns out, a player on a streak is more likely to
fail next time than succeed
This increased fail chance gets bigger the
longer the streak
We’re not actually sure what produces
this
Maybe getting cocky
Maybe getting tired
Maybe getting distracted
But it seems to be mostly psychological
It also seems to affect most forms of
streak, at least when the player is aware
of it
Problems?
The players will also blame you for some of their
fallacies
When the actual tendency for things to occur happens,
they will feel the game is off, and think it’s doing it wrong.
Sid Meier’s 2010 Keynote made this rather clear
When told they have 2:1 odds, a player will actually
expect to win slightly more often than that
When told they have 20:10 odds, they will expect to win
EVEN MORE often
○ It’s the same odds!
○ Once more, we’re terrible at expectations for odds
Players will actually expect the odds to play more in their
favor than what they are told.
Managing This
Skew the odds
Adjust the odds internally so they ARE actually higher
than the presented odds
Limit random impact
Reduce the power of random events
Don’t let a single roll screw up a player (too badly)
Show what caused the problem to the player and what
could be done to prevent that
Downplay streaks
We can combat the Hot-Hand fallacy by downplaying the
value of streaks
Give actual mechanics bonuses to streaks
Or we can make streaks actually give a positive feedback
loop
Ethical?
Lots of people will (reasonably) have
issues with the ethics of the first one, there
It’s probably wrong to endorse people’s
misconceptions
There are some things we can bias without
being unethical though
ex: We can bias random drops so that if a player
hasn’t seen a rare lately, they get one
○ If we’re up front about this, no one will complain
(the players will love us, actually) and we’re in the clear for ethics concerns!
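One way to bias drops like this is a "pity" counter that forces a rare after a long dry spell. This is only a sketch: the class name, the 1% rate and the 150-drop limit are all invented for the example.

```python
import random

class RareDropTable:
    """Random drops with a guaranteed rare after a long dry spell."""

    def __init__(self, rare_chance=0.01, pity_limit=150, rng=None):
        self.rare_chance = rare_chance
        self.pity_limit = pity_limit      # the Nth drop without a rare is forced
        self.rng = rng or random.Random()
        self.drops_since_rare = 0

    def roll(self):
        self.drops_since_rare += 1
        forced = self.drops_since_rare >= self.pity_limit
        if forced or self.rng.random() < self.rare_chance:
            self.drops_since_rare = 0
            return "rare"
        return "common"

table = RareDropTable(rng=random.Random(1))
drops = [table.roll() for _ in range(10_000)]
print(f"rares: {drops.count('rare')} of {len(drops)}")
```

Keep in mind the forced drops push the real rare rate above the nominal 1%, so the number you show players should account for that if you want to stay up front about it.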
A better solution
Instead of biasing things or downplaying
random elements, we can provide the player with all the information about all the random rolls thus far
A table of all the outcomes/rolls will help a lot
Actual calculated percentages will really set the
player at ease
○ When they can look and see that the system IS
being balanced and fair, they will typically accept that and feel much better about the game
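Such a table is cheap to build. The sketch below fakes a roll log with a seeded generator as a stand-in for whatever rolls the real game has recorded, then prints actual versus expected percentages.

```python
import random
from collections import Counter

rng = random.Random(42)   # stand-in for the game's real roll history
roll_log = [rng.randint(1, 6) for _ in range(600)]

counts = Counter(roll_log)
expected = 1 / 6          # a fair d6

print("face  count  actual  expected")
for face in range(1, 7):
    share = counts[face] / len(roll_log)
    print(f"{face:>4}  {counts[face]:>5}  {share:>6.1%}  {expected:>8.1%}")
```

In a real game the log would be appended to every time the game rolls, and the table would be shown on some stats screen.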
Monty Hall and You
To be fair, probability is kind of weird.
There was (is?) a game show called “Let’s Make a
Deal” which included the so-called “Monty Hall Problem”
Which is actually named after the host of the show at the
time…
There are three doors. Behind two of the doors there
is a goat, and the third has a shiny new car (or other desirable object. High-end computer?)
The player chooses a door.
The host then opens one of the doors not chosen by
the player to reveal a goat.
The player is then given the chance to change doors.
Should they?
Most people go, “well, it should be
independent, right? Why would I change? I now have a 50% chance of being right.”
Which is false
Actually, you should change. Why?
Because there’s a 2/3 (about 67%) chance it’s the
other door.
Because probability is weird.
This isn’t intuitive. It’s not obvious. It fooled
Paul Erdős…
This only becomes apparent when you look
at a computer simulation of it, really.
In order to make this make sense, keep in
mind we are dealing with the probability from the start
The probability doesn’t actually change at any
point. If you consider it from this point NOW you
are actually doing it wrong.
Think of it from the start and if the host
didn’t actually OPEN the door.
You actually have the option of staying with the
door you chose or taking BOTH of the other two doors.
You know one of those has a goat anyway, but
there’s a 2/3rds chance the car’s in one of them
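Since a simulation is what usually convinces people, here is a minimal one. It assumes the standard rules: the host always opens a goat door that the player did not pick. The seed and number of rounds are arbitrary.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

rng = random.Random(3)   # fixed seed so the run is repeatable
rounds = 100_000
stay = sum(play(False, rng) for _ in range(rounds)) / rounds
swap = sum(play(True, rng) for _ in range(rounds)) / rounds
print(f"stay wins {stay:.1%}, switch wins {swap:.1%}")
```

Staying should win about a third of the time and switching about two thirds, matching the "take BOTH other doors" way of looking at it.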
Statistics Gone Wrong
It’s easy to simply throw out statements
and make invalid conclusions from them
“Team A hasn’t won a game against Team B
since 1982”
○ So what’s the probability of Team A beating
Team B? Probably not a lot to do with that statement.
○ So while this trivia is interesting from a
sportscasting perspective, it’s not statistically relevant.
Why the margin of error matters
“50% of people have been attacked by a shark in the
past week!”
○ If the MoE is around 50%, then clearly this isn’t a
worthwhile statement
○ If it’s closer to 2%, we’re in trouble, and I’m breaking
out the Jaws theme
Discarding valid results because you didn’t like
the results
Making models off of flawed (biased/fake) data
If your data doesn’t hold water, then neither will the
results.
Asking a loaded question obviously biases data!
It’s also easy to infer relationships that aren’t
there
If you find two trends, you can easily graph them and
say, “Hey, look, the number of pizzas ordered per day coincides with the number of personal computers bought since 2000”
○ Clearly, the pizzas aren’t buying computers nor the
other way around
○ “Correlation does not imply Causality!”
○ Caution: there may not be such a correlation, I made it
up (It just seemed reasonable)
However, there may be a relationship between the
two.
○ Chances are, the two things ARE linked, but only by the
average wealth of people or something like that.
Become a Pirate, Save the Environment!
Luck vs Skill
Some games are 100% skill based
Chess, Go
Most deterministic games
The opposite is possible
Chutes and ladders or Candyland
These don’t actually correlate as simply as
we might think with casual games or fun
Tic-Tac-Toe is deterministic, but not big on skill
Poker is highly random, but we think of it as
requiring a lot of skill
Poker vs Blackjack
Blackjack is also a card game with
similar levels of randomness to Poker
But it’s definitely not the same level of skill
game (unless you are counting cards)
Poker gives you a lot more moves and
options than blackjack does, and allows
you to adapt to changes in the environment
You have far fewer options in Blackjack
Professional Sports
Sports games aren’t really random things. You
win or lose based on how good you are.
Thus we pay players a fixed amount based on their
skill
And yet we talk about how likely they are to
win and chances to score
Gamblers bet on these games as if they were
games of chance. Why’s this?
From their frame of reference, it is.
They cannot influence the odds or the outcome of
the game, but they can estimate it based on past events
Action Games
Hidden miss chances are bad
If it looks like you will hit, you should
From this, we can actually extrapolate
that hidden negative chances will leave players unhappy
However, chances for random positive
events are better
Critical hits
Headshots (ok, not strictly random, but they
happen randomly and work similarly)
Balanced Odds
We discussed loot drops as positive
rewards yesterday
We’ve also mentioned how we can skew their
frequency
We’ve also commented on how chances
for negative events tend to end up
One thing taken from D&D: Randomness
favors the underdog
The big dog could win without it, the underdog
has a larger chance of winning simply because they might get lucky
○ Two-face is wrong!
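To see how randomness favors the underdog, we can count outcomes exactly. In this sketch each side adds one die roll to a fixed skill value and the higher total wins; the skill values (5 versus 8) and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import product

def underdog_win_chance(underdog_skill, favorite_skill, die_sides):
    """Chance the underdog's skill + roll beats the favorite's skill + roll."""
    faces = range(1, die_sides + 1)
    wins = sum(
        1
        for mine, theirs in product(faces, repeat=2)
        if underdog_skill + mine > favorite_skill + theirs
    )
    return Fraction(wins, die_sides ** 2)

# A one-sided "die" means no randomness at all.
for sides in (1, 4, 6, 20):
    chance = underdog_win_chance(5, 8, sides)
    print(f"d{sides:<2}: underdog wins {float(chance):.1%}")
```

With no die, or a small one, the underdog cannot win at all. With a d6 it is 1 in 12, and with a d20 it climbs to 34%: the bigger the random swing compared to the skill gap, the better the underdog's chances.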
Value of something with a random chance
We also need to balance our items and
mechanics with non-deterministic elements so that they are fair
What are appropriate cost/benefit values for
something with a random chance?
We’ll actually cover this at the beginning
of tomorrow.
Computers don’t really do random
Unless you have something really fancy built into it,
of course, like something radioactive that decays at
random intervals…
Instead, computers use formulas to produce
numbers in a sequence
Wait! How is that at all random?
It isn’t
The catch is that the sequence is built from a
starting number, which is kind of randomly selected
Common methods are to ask the user or use the
current time
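You can see the "sequence from a starting number" behavior directly with Python's `random` module. The seed value 12345 is arbitrary.

```python
import random
import time

# Two generators built from the same seed produce the same "random" rolls.
first = random.Random(12345)
second = random.Random(12345)
rolls_a = [first.randint(1, 20) for _ in range(5)]
rolls_b = [second.randint(1, 20) for _ in range(5)]
print(rolls_a == rolls_b)   # True

# Seeding from the clock gives a different sequence on each run.
clock_rng = random.Random(time.time_ns())
print([clock_rng.randint(1, 20) for _ in range(5)])
```

The repeatability is actually handy for debugging and replays: store the seed and you can reproduce a whole session.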
When designing a game where random is
important, and there is a lot of random (online poker) we need to get more fancy
Especially in games with money riding on them.
People don’t like it if there are patterns in your random number generators.
That can give people a way to cheat!
This means we need more entropy in our
original seeds, and probably need to
change our seed more often
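In Python, one way to get more entropy is to skip the seedable generator entirely and draw from the operating system's randomness through the `secrets` module. This is a sketch of one option, not a complete recipe for a real-money game.

```python
import secrets

# SystemRandom pulls from the operating system's entropy source,
# so there is no seed for an attacker to recover and replay.
secure_rng = secrets.SystemRandom()

deck = [rank + suit for suit in "SHDC" for rank in "A23456789TJQK"]
secure_rng.shuffle(deck)

hand = deck[:5]
print(hand)
```

The trade-off is that you lose repeatability: there is no seed to store, so you cannot replay the same shuffle later.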
Most games set the seed once and never