ST 302 Stochastic Processes

1 Information and Conditioning

1.1 Sigma-Algebras and Filtrations

Let us recall that a -algebra is a collection F of subsets of the sample space

such that

(i) ;,

2 F

(ii) if A 2 F then also Ac 2 F (here Ac denotes the complementary set to A)

(iii) if we have a sequence A1, A2, ... of sets in F , then [1i=1Ai 2 F .

We say that a collection of sets generates a -algebra F if F is the smallest

-algebra which contains all the sets. The trivial -algebra is the one containing

only ; and

.

Interpretation: We perform a random experiment. The realized outcome is an

element ! of the sample space

. Assume that we are given some partial infor-

mation in form of a -algebra F . If F does not contain all subsets of

, we might

not know the precise !, but we may narrow down the possibilities. We know with

certainty whether a set A 2 F either contains ! or does not contain it ?these sets

are resolved by our given information.

Example 1.1 We toss a coin three times, with the result of each toss being H

(head) or T (tail). Our time set consists of t = 0; 1; 2; 3. The sample space

is

the set of all eight possible outcomes. At time 0 (just before the ?rst coin toss),

1

we only know that the true ! does not belong to ; and does belong to

, hence

we set

F0 = f;;

g :

At time 1, after the ?rst coin toss, in addition the following two sets are resolved:

AH = fHHH;HHT;HTH;HTTg

AT = fTHH; THT; TTH; TTTg :

We can say with certainty whether ! belongs to AH or AT . In contrast, the

information about the ?rst coin toss only is not enough to determine with certainty

whether ! is contained e.g. in fHHH;HHTg ?for this we would have to wait

until the second coin toss. As the complement of AH is AT , and the union of AH

and AT equals

, we set

F1 = f;;

; AH ; ATg :

At time 2, after the second coin toss, in addition to the sets already contained in

F1, the sets

AHH = fHHH;HHTg ; AHT = fHTH;HTTg ;

ATH = fTHH; THTg ; ATT = fTTH; TTTg

get resolved, together with their complements and unions. Altogether, we get

F2 =

;;

; AH ; AT ; AHH ; AHT ; ATH ; ATT ; AcHH ; AcHT ; AcTH ; AcTT ;

AHH [ ATH ; AHH [ ATT ; AHT [ ATH ; AHT [ ATT

:

At time 3, after the third coin toss, we know the true realization of !, and can

therefore tell for each subset of

whether ! is a member or not. Hence

F3 = The set of all subsets of

:

As we have F0 F1 F2 F3 these four -algebras are said to form a ?ltration.

The random variable

X = number of tails among the rst two coin tosses

is said to be F2-measurable, but not F1-measurable (as we cannot determine the

value of X after only one coin toss).

2

Consider now a discrete time stochastic process X = (X0; X1; X2; :::), i.e. a

collection of random variables indexed by time. Hence X is a function of a chance

parameter ! and a time parameter n. We would write Xn(!) for a function

value, but typically suppress the chance parameter, otherwise the notation gets

too heavy. Moreover, the two variables play quite di¤erent roles, since the chance

parameter ! comes from the sample space

(which can be a very large set, with

no natural ordering) whereas the time parameter n is an element of the ordered

set N+.

We denote with Fn a family of sets containing all information about X up

to time n. In more detail, Fn is the -algebra generated by sets of the form (we

assume that the starting value X0 is just a deterministic number)

fX1 = i1; X2 = i2; :::; Xn = ing

if the state space is discrete. If the state space is R then Fn is the -algebra

generated by sets of the form

fX1 2 (a1; b1) ; X2 2 (a2; b2) ; :::; Xn 2 (an;bn)g

with intervals (a1; b1), (a2; b2), ... We have

F0 = f;;

g (at time zero, we know nothing)

F0 F1 F2 ::: Fn (the more time evolves, the more we know)

(Fn) = (F0;F1;F2; :::) is called a ltration. Sometimes we write

FXn to specify

that we are collecting information about the stochastic process X.

We say a random variable H is Fn-measurable if it depends on information

about X up to time n only. For example, if f is some function, then f (Xn) is

Fn-measurable; and so is max1inXi. Or, in other words, H is Fn-measurable if

it is a function of X1,...,Xn.

We say that a stochastic process Y = (Yn)n0 is adapted to (Fn) if each Yn is

Fn-measurable.

FYn is called the ltration generated by Y if it is the smallest

ltration where Y is adapted to. Summing up,

The random variable H is measurable with respect to a -algebra F if the

information in F is su￠ cient to determine H, and F can even contain more

information than just about H.

The stochastic process X is adapted to a ?ltration (Fn) if the information in

(Fn) is su￠ cient to determine all the Xns, and (Fn) can even contain more

info.

FXn contains just the information about all the Xn?s. More precisely,

the -algebra FXn contains exactly all the information about X0, X1, ..., Xn.

3

1.2 Conditional Expectation

We consider a -algebra F and want to make a prediction about some random

variable H based on the information contained in F . If H is F-measurable, then

we can precisely determineH from the given information in F . IfH is independent

of F , then the information in F is of no help in predicting the outcome of H. In

the intermediary case, we can try to use the information in F to make an educated

guess?about H, without being able to completely evaluate H.

You cannot avoid this section. Its material is absolutely essential for every-

thing that follows. To quote Thomas Mikosch, "The notion of conditional ex-

pectation is one of the most di￠ cult ones in probability theory, but it is also one

of the most powerful tools... For its complete theoretical understanding, measure-

theoretic probability theory is unavoidable." However, it is perfectly possible, and

not even di￠ cult, to get an operational understanding of conditional expectation.

You have just to grasp the following intuitive properties, they will be given for the

forementioned reason without proof.

De?nition 1.2 If E [H2] < 1, then the conditional expectation of H given F ,

written E [H j F ], is the least-squares-best F-measurable predictor of H: amongst

all F-measurable random variables bH (i.e. amongst all predictors which can be

computed from available information), it minimizes

E

bH H2 :

It turns out that one can de?ne (by the partial averaging property below) the

conditional expectation also in the more general case where we only assume that

E [jHj] < 1 (so E [H2] < 1 need not be satis?ed). We give a list of properties

of E [H j F ]. Here A denotes the indicator function of the set A: A (!) = 1 if

! 2 A, A (!) = 0 if ! =2 A.

(a) (Measurability) E [H j F ] is F-measurable

(the conditional expectation is a predictor based on available information)

(b) (Partial Averaging) For every set A 2 F ,

E [AE [H j F ]] = E [AH]

(on every set A 2 F , the conditional expectation of H given F has the same

expectation asH itself). In particular (takeA =

),

E [E [H j F ]] = E [H]

4

Properties (a) and (b) characterize the random variable E [H j F ] uniquely (mod-

ulo null sets).

(c) (Linearity) E [a1H1 + a2H2j F ] = a1E [H 1j F ] + a2E [H2 j F ]

(d) (Positivity) If H 0, then E [H j F ] 0

(e) (Jensen?s Inequality) If f is a convex function such that

E [jf(H)j] <1, then

f (E [H j F ]) E [f (H) j F ]

(f) (Tower Property) If G is a sub -algebra of F (contains less information

than F) then

E [E [H j F ] j G] = E [H j G] :

In case we deal with a ?ltration, this property has the following form: for two

time points s t (hence Fs Ft) we have

E [E [H j Ft] j Fs] = E [H j Fs] :

(nested conditional expectations are evaluated by taking only one conditional

expectation with respect to the earliest time point)

(g) (Taking out what is known) If G is a random variable which is F-

measurable, and such that E [jGHj] <1, then

E [GH j F ] = GE [H j F ]

In particular (choose H = 1),

E [G j F ] = G

(if G does only depend on available information then we can predict G fully)

(h) (R?le of independence) If H is independent of F , then

E [H j F ] = E [H]

(in that case, the information in F is useless in predicting H)

Let X be any random variable. We will sometimes use the notation E [H jX] if

we predict H by using the information about X only (more formally, we condition

with respect to the -algebra generated by X). Moreover, if A

we write

P (Aj F) for E [Aj F ]. An important special case of (h) is if we consider the

trivial -algebra F0 = f;;

g. In that case, the conditional expectation with

respect to F0 is just the common expectation (so the result is a number):

E [H j F0] = E [H] :

5

2 Martingales in Discrete Time

2.1 De?nition

Let X = (Xn)n0 be a discrete time process with E [jXnj] < 1 for all n, and let

(Fn) be some ?ltration. We assume that X is adapted to (Fn), that is, each Xn

is Fn-measurable.

De?nition 2.1 X is called a martingale (relative to (Fn), P ) if for all n 1

E [Xnj Fn1] = Xn1;

a submartingale if

E [Xnj Fn1] Xn1;

and a supermartingale if

E [Xnj Fn1] Xn1:

Interpretation: let Xn be the wealth at time n of a player engaged in some casino

game. The martingale property, rewritten as

E [ (Xn Xn1)j Fn1] = 0

(here we have used properties (c): Linearity and (g): Taking out what is known

of the conditional expectation), tells us that, based on all the information about

the games progress so far, one can predict that our increase in wealth will be on

average zero: we are playing a fair game. In reality, our wealth will be described in

most games by a supermartingale: there is a tendency to loose on average. There

are some games like Black Jack, however, where you can make a very clever use of

past events (like memorizing the cards which have already been taken out of the

deck) to achieve that your wealth process evolves like a submartingale: you are

engaged in a favourable game.

We will later on introduce the notion of Markov processes: these are stochastic

processes where future predictions depend on the past only via the present state,

they are memoryless. A martingale need not be a Markov process (because future

outcomes may depend on the whole past) and a Markov process need not be a

martingale (since it may be related to an unfair game).

Exercise 2.2 Prove the following statements by applying the properties of

conditional expectation. State in each step which property you use. All random

variables are assumed to be su￠ ciently integrable.

6

Fix some time horizon T 2 N0, and let H be a FT -measurable random

variable. De?ne a stochastic process X for n = 0; 1; 2; :::; T as

Xn = E [Hj Fn] :

Show that X is a martingale: check that X is adapted and that it ful?lls

the martingale property.

Let f be a convex function and X a martingale. Show that f (X) is a

submartingale and f(X) a supermartingale.

A stochastic process X is called predictable if for each n we have that Xn

is Fn1-measurable. Show that every predictable process is (Fn)-adapted,

and that every predictable martingale is constant.

Let Y be a predictable process, and X be a martingale. Show that then the

martingale transform Y X, de?ned as ((Y X)0 = 0 and)

(Y X)n =

nX

i=1

Yi (Xi Xi1)

is a martingale as well.

2.2 Discrete-Time Examples

Example 2.3 Sums of independent zero-mean RVs (random variables). Let

X1, X2,... be a sequence of independent RVs with E [jXkj] <1, 8k, and

E [Xk] = 0; 8k:

Dene (S0 := 0 and)

Sn := X1 +X2 + :::+Xn:

Let (Fn) =

FXn , the ?ltration which is generated by X. Note that Sn is Fn-

measurable (depends only on (X1; X2; :::; Xn)). Then we have for n 1

E [Snj Fn1] = E [Sn1j Fn1] + E [Xnj Fn1] by linearity (c)

= Sn1 + E [Xnj Fn1] taking out what is known (g)

= Sn1 + E [Xn] by independence (h)

= Sn1:

This proves that S = (Sn) is a martingale.

7

Example 2.4 Products of non-negative independent RVs of mean 1. Let

Y1, Y2,... be a sequence of independent non-negative RVs with

E [Yn] = 1; 8n:

De?ne (M0 := 1 and)

Mn := Y1Y2:::Yn:

Let (Fn) =

FYn , the ?ltration which is generated by Y . Then, for n 1, we

have

E [Mnj Fn1] = E [Mn1Ynj Fn1]

= Mn1E [Ynj Fn1] taking out what is known (g)

= Mn1E [Yn] by independence (h)

= Mn1:

This proves that M is a martingale.

Example 2.5 Exponential martingale. Let X1, X2,... be a sequence of inde-

pendent and identically distributed RVs with moment-generating function

' () = E [exp (Xi)] <1:

We set (S0 := 0 and) Sn = X1 + ::: +Xn. Let (Fn) =

FXn , the ?ltration which

is generated by X.

Then

Mn =

exp(Sn)

' ()n

is a martingale. This follows from the previous example, just put

Yk =

exp (Xk)

' ()

:

Example 2.6 Exponential martingale for random walk. In the setting of

the previous example, denote the distribution of the Xns as X where

P (X = +1) = P (X = 1) = 1

2

:

The stochastic process S = (Sn)n0 is called a symmetric random walk. Then

' () = E [exp (X)] =

版权所有：留学生编程辅导网 2020 All Rights Reserved 联系方式：QQ:99515681 电子信箱：99515681@qq.com

免责声明：本站部分内容从网络整理而来，只供参考！如有版权问题可联系本站删除。