PRML

a.1 From Set Theory to Probability Theory

Last updated 5 years ago

Set Theory

This page draws heavily on Sungjoon Choi's (최성준) materials.

Set theory is the branch of mathematics that studies sets, which are collections of abstract objects. The basic concepts are summarized below.

  • Set: a collection of elements that satisfy a given condition.

  • Element: an object that belongs to a set. If an element $a$ belongs to a set $A$, we write $a \in A$.

  • Subset: if every element of a set $A$ also belongs to another set $B$, then $A$ is called a subset of $B$.

  • Universal set: the set that contains every object under consideration as an element (in naive set theory, including itself).

  • Set operations: unions, intersections, complements, and the product set.

    • Product set (Cartesian product): the set of tuples whose components are taken from each of the sets.

      $$A \times B = \{ (a, b) : a \in A,\ b \in B \}$$

      • Example: $A = \{ 1, 2 \},\ B = \{ 3, 4, 5 \} \rightarrow A \times B = \{ (1,3), (1,4), (1,5), (2,3), (2,4), (2,5) \}$

  • Disjoint sets: two sets with no element in common, $A \cap B = \emptyset$.

  • Partition of a set: a division of the elements of a set among non-empty subsets such that every element belongs to exactly one subset.

    • Example: $A = \{ 1, 2, 3, 4 \} \rightarrow$ one partition of $A$ is $\{ \{1, 2\}, \{3\}, \{4\} \}$

  • Power set (of a set $A$, written $2^A$): the set of all subsets of a given set.

    • Example: $A = \{ 1, 2, 3 \} \rightarrow 2^A = \{ \emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{2,3\}, \{1,3\}, \{1,2,3\} \}$

  • Cardinality: a measure of the "number of elements" of a set, written $\vert A \vert$. Terms that describe the size of a set include finite, infinite, countable, uncountable, and denumerable (countably infinite).

    • Countable set: a set is countable if there exists a one-to-one function from it into the set of natural numbers. In particular, infinite sets that can still be enumerated, such as the natural numbers, the integers, and the rationals, are called countably infinite or denumerable sets.

    • Uncountable set: a set that is not countable. The real numbers form an uncountable set.
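The set operations above can be tried out on small finite sets. The following is a minimal sketch using only the Python standard library; the helper names `power_set` and `is_partition` are illustrative and not part of the original notes.

```python
from itertools import chain, combinations, product

A = {1, 2}
B = {3, 4, 5}

# Cartesian product A x B as a set of (a, b) tuples.
cartesian = set(product(A, B))


def power_set(s):
    """Return every subset of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


def is_partition(blocks, s):
    """True if the blocks are non-empty, pairwise disjoint, and cover s."""
    blocks = [set(b) for b in blocks]
    if any(len(b) == 0 for b in blocks):
        return False
    union = set().union(*blocks)
    # If the block sizes add up to the size of the union, no element repeats.
    return union == set(s) and sum(len(b) for b in blocks) == len(union)


print(sorted(cartesian))
print(len(power_set({1, 2, 3})))  # a set with n elements has 2**n subsets
print(is_partition([{1, 2}, {3}, {4}], {1, 2, 3, 4}))
```

The power set of a set with $n$ elements has $2^n$ members, which is one way to remember the notation $2^A$.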

Function

  • ๋ฐ˜๋Œ€๋กœ codomain์˜ ์›์†Œ์— ๋Œ€์‘ํ•˜๋Š” domain์˜ ์›์†Œ๋ฅผ ์—ญ์ƒ(inverse image)์ด๋ผ๊ณ  ํ•œ๋‹ค(์›์†Œ์˜ ์—ญ์ƒ์€ ๋ถ€๋ถ„ ์ง‘ํ•ฉ์ด๋ผ๋Š” ๊ฒƒ์„ ์ฃผ์˜).

  • one-to-one ์กฐ๊ฑด๊ณผ onto ์กฐ๊ฑด์„ ๋ชจ๋‘ ๋งŒ์กฑํ•˜๋ฉด ๊ฐ€์—ญ ํ•จ์ˆ˜(invertible function)๋ผ๊ณ  ํ•œ๋‹ค.

Measure Theory

Probability Theory

  • To talk about probability, the random experiment must first be well defined.

We now give a precise definition of probability.

  • In effect, this definition adds items 2 and 4, namely non-negativity $P(A) \geq 0$ and normalization $P(\Omega) = 1$, to the definition of a measure. Since a measure already takes values in $[0, \infty]$, the essential new constraint is the normalization. In other words, probability is a measure (a set function) defined on the sample space.

Other probability topics

    • Example:

Random Variable

  • Discrete random variable

Figure 1.2.0.1

Function/mapping: a binary relation that maps each element of a first set to exactly one element of a second set. The input set $U$ is called the domain, and the set $V$ into which outputs fall is called the codomain.

$$f: \underset{\text{domain}}{U} \rightarrow \underset{\text{codomain}}{V}$$

Image: the element (or set) of the codomain to which an element (or subset) of the domain is mapped.

$$f(x) \in V,\ x \in U \quad \text{or} \quad f(A) = \{ f(x) \mid x \in A \} \subseteq V,\ A \subseteq U$$

Inverse image of an element $y \in V$ or of a subset $B \subseteq V$:

$$f^{-1}(y) = \{ x \in U \mid f(x) = y \} \subseteq U \quad \text{or} \quad f^{-1}(B) = \{ x \in U \mid f(x) \in B \} \subseteq U,\ B \subseteq V$$

Range: the set of all output values of the function. The range is a subset of the codomain.

Figure 1.2.0.2

One-to-one/injective function: a function that maps distinct elements of the domain to distinct elements of the codomain.

Onto/surjective function: a function whose range coincides with its codomain.
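For a finite domain, the image, inverse image, and the one-to-one and onto conditions can all be checked directly. The sketch below stores a function as a Python dict; the sets `U`, `V` and the mapping `f` are made-up example values, and the helper names are illustrative.

```python
# A finite function f: U -> V stored as a dict (hypothetical example values).
U = {1, 2, 3, 4}
V = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}


def image(f, A):
    """f(A) = { f(x) | x in A }, a subset of the codomain."""
    return {f[x] for x in A}


def inverse_image(f, B):
    """f^{-1}(B) = { x | f(x) in B }, a subset of the domain."""
    return {x for x in f if f[x] in B}


def is_injective(f):
    """One-to-one: distinct inputs never share an output."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """Onto: the range equals the codomain."""
    return set(f.values()) == set(codomain)


print(image(f, {1, 2}))
print(inverse_image(f, {"a"}))
print(is_injective(f), is_surjective(f, V))
```

Here `f` is onto but not one-to-one, because both 1 and 2 map to `"a"`, so it is not invertible. The inverse image of the single element `"a"` is the subset `{1, 2}` of the domain.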

A measure is a function that assigns a kind of "size" to certain subsets and allows that size to be computed by splitting the subset into countably many pieces. A set equipped with a measure is called a measure space, and the branch of mathematics that studies such spaces is called measure theory.

Given a universal set $U$, a measure assigns a non-negative real number to subsets of $U$. To define a measure precisely, we first define the notions it depends on.

Set function: a function that assigns a number to a set (e.g., cardinality, length, area). That is, its input is a set and its output is a number.

$\sigma$-field $\mathcal{B}$: a collection $\mathcal{B}$ of subsets of the universal set $U$ that satisfies the following conditions is called a $\sigma$-field (also called a $\sigma$-algebra).

  1. $\emptyset \in \mathcal{B}$ (the empty set is included)

  2. $B \in \mathcal{B} \Rightarrow B^{c} \in \mathcal{B}$ (closed under set complement)

  3. $B_i \in \mathcal{B} \Rightarrow \bigcup_{i=1}^{\infty} B_i \in \mathcal{B}$ (closed under countable union)

The sets in a $\sigma$-field are the basic units to which a measure can be assigned. If a subset of $U$ does not belong to the $\sigma$-field, it cannot be measured.

Properties of a $\sigma$-field

  • $U \in \mathcal{B}$

  • $B_i \in \mathcal{B} \Rightarrow \bigcap_{i=1}^{\infty} B_i \in \mathcal{B}$ (closed under countable intersection)

  • $2^U$, the power set of $U$, is the finest (most finely grained) $\sigma$-field on $U$.

  • $\mathcal{B}$ is either finite or uncountable. It cannot be countably infinite (denumerable).

  • If $\mathcal{B}$ and $\mathcal{C}$ are $\sigma$-fields, then $\mathcal{B} \cap \mathcal{C}$ is a $\sigma$-field, but $\mathcal{B} \cup \mathcal{C}$ is not necessarily one.
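The $\sigma$-field conditions can be checked mechanically on a finite universal set, where closure under pairwise unions is enough to give closure under countable unions. The sketch below is illustrative only: the function `is_sigma_field` and the example families `B` and `C` are assumptions made for the demonstration, chosen so that their intersection is a $\sigma$-field while their union is not.

```python
from itertools import combinations

F = frozenset  # shorthand so that sets can be stored inside sets


def is_sigma_field(U, family):
    """Check the sigma-field axioms for a finite family of subsets of U."""
    U = F(U)
    fam = {F(s) for s in family}
    if F() not in fam:                       # contains the empty set
        return False
    if any(U - s not in fam for s in fam):   # closed under complement
        return False
    # On a finite family, pairwise unions imply all (countable) unions.
    return all(a | b in fam for a, b in combinations(fam, 2))


U = F({1, 2, 3, 4})
B = {F(), F({1, 2}), F({3, 4}), U}
C = {F(), F({1, 3}), F({2, 4}), U}

print(is_sigma_field(U, B))      # B is a sigma-field
print(is_sigma_field(U, B & C))  # the intersection {empty set, U} is one too
print(is_sigma_field(U, B | C))  # the union misses {1, 2} | {1, 3} = {1, 2, 3}
```

The union fails because $\{1, 2\}$ and $\{1, 3\}$ both belong to it, yet their union $\{1, 2, 3\}$ does not.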

Measurable space: put simply, given a set $U$ and a $\sigma$-field $\mathcal{B}$ made of subsets of $U$, the pair $(U, \mathcal{B})$ is a space on which a measure can be assigned.

We are now ready to define a measure. The definition is as follows.

A measure $\mu$ is a set function $\mu: \mathcal{B} \rightarrow [0, \infty]$ defined on a measurable space $(U, \mathcal{B})$ that satisfies:

  1. $\mu(\emptyset) = 0$

  2. For pairwise disjoint sets $B_i$ and $B_j$ ($i \neq j$): $\mu(\bigcup_{i=1}^{\infty} B_i) = \sum_{i=1}^{\infty} \mu(B_i)$ (countable additivity)

That is, a measurable space $(U, \mathcal{B})$ together with a measure $\mu$ forms a measure space $(U, \mathcal{B}, \mu)$.
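As a small illustration of the two measure axioms, the sketch below builds a weighted counting measure on a finite set, with the power set as the $\sigma$-field. The point weights are made-up values and `make_measure` is a hypothetical helper, not something from the original notes.

```python
def make_measure(weights):
    """Return mu with mu(B) = sum of the (non-negative) weights of points in B."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("a measure must be non-negative")

    def mu(B):
        return sum(weights[x] for x in B)

    return mu


weights = {1: 0.5, 2: 1.5, 3: 2.0}  # hypothetical point masses on U = {1, 2, 3}
mu = make_measure(weights)

B1, B2 = {1}, {2, 3}                # two disjoint measurable sets
print(mu(set()))                    # the empty set has measure 0
print(mu(B1 | B2), mu(B1) + mu(B2)) # additivity on disjoint sets
```

Setting every weight to 1 gives the ordinary counting measure, for which $\mu(B) = \vert B \vert$.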

Figure 1.2.0.2

In Figure 1.2.0.2, $\Omega$ is called the sample space. A measure defined on the sample space is written with a capital $P$. What this means becomes clear in what follows.

Outcomes: all the possible results of a random experiment that cannot be broken down any further.

Event: a set of outcomes of a random experiment to which a probability is assigned. An event is a subset of the sample space.

Sample point: $w$ denotes an outcome that the random experiment can produce within the sample space.

Sample space $\Omega$: the set of all sample points.

For example, consider the random experiment of rolling a fair six-sided die. The outcomes are the numbers 1 through 6 shown on the top face. Since 7 can never appear, it is not an observable outcome. The outcomes are drawn as the points in Figure 1.2.0.2. Because every point in the figure lies inside the sample space $\Omega$, each point is both a sample point and an outcome of this random experiment. Finally, "the face shown after rolling the die is even", that is, the subset of $\Omega$ labeled $A$, is an event.

Probability $P$ is a set function $P : \mathcal{A} \rightarrow [0, 1]$ defined on a measurable space $(\Omega, \mathcal{A})$ that satisfies the following conditions. Note the notation: $\mathcal{A}$ is the $\sigma$-field, whereas a plain capital $A$ is a member of the $\sigma$-field, i.e., a subset of $\Omega$.

  1. $P(\emptyset) = 0$

  2. $P(A) \geq 0, \forall A \in \mathcal{A}$

  3. For pairwise disjoint sets $A_i$ and $A_j$ ($i \neq j$): $P(\bigcup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} P(A_i)$ (countable additivity)

  4. $P(\Omega) = 1$

So far, probability has been defined on a measurable space. How, then, do we assign a probability to a particular event $A$? Given a sample space $\Omega$ made up of the outcomes of a random experiment, we assign a probability to each event $A$ in that sample space. This is where the probability allocation function comes in.

probability mass function: for a discrete sample space $\Omega$, $p: \Omega \rightarrow [0, 1]$ such that $\sum_{w \in \Omega} p(w) = 1$ and $P(A) = \sum_{w \in A} p(w)$

probability density function: for a continuous sample space $\Omega$, $f: \Omega \rightarrow [0, \infty)$ such that $\int_{w \in \Omega} f(w)\,dw = 1$ and $P(A) = \int_{w \in A} f(w)\,dw$
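The probability mass function case can be made concrete with the fair die used earlier on this page. This is a minimal sketch that assumes a uniform pmf and uses exact fractions; the names `omega`, `pmf`, and `P` are illustrative.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}                # sample space of a fair die
pmf = {w: Fraction(1, 6) for w in omega}  # p(w) = 1/6 for every sample point


def P(event):
    """P(A) = sum of p(w) over the sample points w in A."""
    return sum((pmf[w] for w in event), Fraction(0))


even = {2, 4, 6}                          # the event "the face shown is even"

print(P(omega))  # 1
print(P(even))   # 1/2
print(P(set()))  # 0
```

Because `P` is built by summing a pmf over sample points, it automatically satisfies $P(\emptyset) = 0$, $P(\Omega) = 1$, and additivity over disjoint events.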

Conditional probability: $P(A \vert B) \triangleq \dfrac{P(A \cap B)}{P(B)}$, defined for $P(B) > 0$

Chain rule of probability: $P(A \cap B) = P(A \vert B) P(B)$

Law of total probability: $P(A) = P(A \cap B) + P(A \cap B^c) = P(A \vert B) P(B) + P(A \vert B^c) P(B^c)$

Bayes' rule: $P(B \vert A) = \dfrac{P(B \cap A)}{P(A)} = \dfrac{P(A \cap B)}{P(A)} = \dfrac{P(A \vert B) P(B)}{P(A)}$

$P(A \vert B)$: likelihood

$P(B \vert A)$: posterior

$P(B)$: prior

Independent events: $A$ and $B$ are independent whenever $P(A \cap B) = P(A) P(B)$ holds. Independence is not the same as being disjoint (mutually exclusive).
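These identities can be verified numerically on the fair-die sample space. In the sketch below the events `A`, `B`, and `C` are chosen purely for illustration, and `P` and `cond` are hypothetical helper names.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}


def P(event):
    """Uniform probability on the fair-die sample space."""
    return Fraction(len(event), len(omega))


def cond(A, B):
    """Conditional probability P(A | B), assuming P(B) > 0."""
    return P(A & B) / P(B)


A = {2, 4, 6}   # the face is even
B = {4, 5, 6}   # the face is greater than 3
C = {1, 2}      # the face is at most 2
Bc = omega - B  # complement of B

chain = cond(A, B) * P(B)                       # equals P(A & B)
total = cond(A, B) * P(B) + cond(A, Bc) * P(Bc) # equals P(A)
posterior = cond(A, B) * P(B) / P(A)            # Bayes' rule for P(B | A)

print(chain, total, posterior)
print(P(A & C) == P(A) * P(C))  # A and C are independent, yet not disjoint
print(P(A & B) == P(A) * P(B))  # A and B are not independent
```

The events `A` and `C` share the outcome 2, so they are not disjoint, yet they satisfy the product rule. Two disjoint events with positive probability can never be independent, since then $P(A \cap B) = 0 \neq P(A) P(B)$.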

A random variable is a measurable function from a probability space $(\Omega, \mathcal{A}, P)$ to a Borel measurable space $(\Bbb{R}, \mathcal{B})$, which usually means the real numbers.

$$X: \Omega \rightarrow \Bbb{R} \text{ such that } \forall B \in \mathcal{B},\ X^{-1}(B) \in \mathcal{A}$$

Here "random" refers to the process of arbitrarily drawing one element from the sample space $\Omega$ of the probability space. As in Figure 1.2.0.5, the statement "the number 4 is observed" can be unpacked as follows: when the sample point 4, drawn at random from the sample space, is fed into the random variable $X$, it is assigned the numeric value 4 on the real line $\Bbb{R}$.
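The definition says that probabilities of statements about $X$ are computed by pulling a set of real values back to an event in $\Omega$. The sketch below illustrates this on the fair die, assuming the $\sigma$-field is the full power set so that every inverse image is measurable. The random variables `X` and `Y` and the helpers `preimage` and `distribution` are illustrative names, and a finite Python set stands in for a Borel set.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
pmf = {w: Fraction(1, 6) for w in omega}


def P(event):
    return sum((pmf[w] for w in event), Fraction(0))


def X(w):
    """Random variable mapping each face to its numeric value on the real line."""
    return float(w)


def Y(w):
    """Indicator random variable: 1.0 if the face is even, else 0.0."""
    return 1.0 if w % 2 == 0 else 0.0


def preimage(rv, B):
    """rv^{-1}(B) = { w in omega | rv(w) in B }, an event in the sample space."""
    return {w for w in omega if rv(w) in B}


def distribution(rv, B):
    """P(rv in B), computed as P(rv^{-1}(B))."""
    return P(preimage(rv, B))


print(preimage(X, {4.0}))      # the event behind "the number 4 is observed"
print(distribution(X, {4.0}))  # 1/6
print(distribution(Y, {1.0}))  # 1/2
```

Different random variables on the same probability space induce different distributions on $\Bbb{R}$: `X` spreads mass over six values, while `Y` only takes the values 0 and 1.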

Image
Range
One-to-one/injective function
Onto/surjective function
Measure
Set function
Outcomes
Event
Sample space $\Omega$
Probability allocation function
Probability density function
Correlation analysis
Bayesian deep learning
Set
Element
Subset
Universal set
Set operations
Unions
Intersections
Complements
Product set (Cartesian product)
Disjoint sets
Partition of a set
Power set (of a set $A$, written $2^A$)
Cardinality
Countable set
One-to-one function
Function/mapping
$\sigma$-field $\mathcal{B}$
Measurable space
Sample point
Random variable
Probability space
Borel measurable space (usually refers to the set of real numbers)
Figure 1.2.0.4
Figure 1.2.0.5