Concentration inequality via MGF and Markov's inequality (Chernoff method)
The question asks to derive a tail probability bound by applying Markov's inequality to the exponential moment E(e^{tX}), optimizing over the parameter t.
The interval $I$ and the function $\varphi_{X}$ are defined as in question I.A.4. We suppose that $X$ satisfies $(C_{\tau})$ for some $\tau > 0$, is not almost surely constant, and $a > E(X)$. Show that, for $n$ in $\mathbb{N}^{*}$ and $t$ in $I \cap \mathbb{R}^{+}$ $$E\left(\mathrm{e}^{tS_{n}}\right) = \left(\varphi_{X}(t)\right)^{n}, \quad P\left(S_{n} \geqslant na\right) \leqslant \frac{\varphi_{X}(t)^{n}}{\mathrm{e}^{nta}}$$
The interval $I$ and the function $\varphi_{X}$ are defined as in question I.A.4. We suppose that $X$ satisfies $(C_{\tau})$ for some $\tau > 0$, is not almost surely constant, and $a > E(X)$. We define the function $\chi : \begin{aligned} & I \rightarrow \mathbb{R} \\ & t \mapsto \ln\left(\varphi_{X}(t)\right) - ta \end{aligned}$ a) Show that the function $\chi$ is bounded below on $I \cap \mathbb{R}^{+}$. We denote by $\eta_{a}$ the infimum of $\chi$ on $I \cap \mathbb{R}^{+}$. b) Give an equivalent of $\chi(t)$ as $t$ tends to 0. Deduce that $\eta_{a} < 0$. c) Show $\forall n \in \mathbb{N}^{*}, \quad P\left(S_{n} \geqslant na\right) \leqslant \mathrm{e}^{n\eta_{a}}$. Deduce that $\gamma_{a} < 0$. d) In each of the following two cases, determine the set of real numbers $a$ satisfying the conditions $P(X \geqslant a) > 0$ and $a > E(X)$; then, for $a$ satisfying these conditions, calculate $\eta_{a}$. i. $X$ follows the Bernoulli distribution $\mathcal{B}(p)$ with $0 < p < 1$. ii. $X$ follows the Poisson distribution $\mathcal{P}(\lambda)$ with $\lambda > 0$.
We suppose here that the infimum $\eta_{a}$ of the function $\chi$ on $I \cap \mathbb{R}^{+}$ is attained at a point $\sigma$ interior to $I \cap \mathbb{R}^{+}$. Let $t$ be a real number interior to $I$ and such that $t > \sigma$, $b$ be a real number such that $b > \frac{\varphi_{X}^{\prime}(t)}{\varphi_{X}(t)}$. a) Calculate $\sum_{x \in X(\Omega)} \frac{\mathrm{e}^{tx}}{E\left(\mathrm{e}^{tX}\right)} P(X = x)$. We then admit (if necessary by modifying $(\Omega, \mathcal{A}, P)$) that there exists a random variable $X^{\prime}$ on $(\Omega, \mathcal{A})$ such that $X^{\prime}(\Omega) = X(\Omega)$ and whose probability distribution is given by $$\forall x \in X(\Omega), \quad P\left(X^{\prime} = x\right) = \frac{\mathrm{e}^{tx}}{E\left(\mathrm{e}^{tX}\right)} P(X = x)$$ and that there exists a sequence $\left(X_{n}^{\prime}\right)_{n \in \mathbb{N}^{*}}$ of mutually independent random variables defined on $(\Omega, \mathcal{A}, P)$ all following the same distribution as $X^{\prime}$. b) Show $$E\left(X^{\prime}\right) = \frac{\varphi_{X}^{\prime}(t)}{\varphi_{X}(t)}, \quad E\left(X^{\prime}\right) > a$$
We suppose here that the infimum $\eta_{a}$ of the function $\chi$ on $I \cap \mathbb{R}^{+}$ is attained at a point $\sigma$ interior to $I \cap \mathbb{R}^{+}$. Let $t$ be a real number interior to $I$ and such that $t > \sigma$, $b$ be a real number such that $b > \frac{\varphi_{X}^{\prime}(t)}{\varphi_{X}(t)}$. We admit that, if $n$ in $\mathbb{N}^{*}$ and if $f$ is a map from $X(\Omega)^{n}$ to $\mathbb{R}^{+}$, we have $$E\left(f\left(X_{1}^{\prime}, \ldots, X_{n}^{\prime}\right)\right) = \frac{E\left(f\left(X_{1}, \ldots, X_{n}\right) \mathrm{e}^{tS_{n}}\right)}{\varphi_{X}(t)^{n}}$$ a) For $n$ in $\mathbb{N}^{*}$, we set $S_{n}^{\prime} = \sum_{k=1}^{n} X_{k}^{\prime}$. Show $P\left(na \leqslant S_{n}^{\prime} \leqslant nb\right) \leqslant P\left(S_{n} \geqslant na\right) \frac{\mathrm{e}^{ntb}}{\varphi_{X}(t)^{n}}$. $$\text{We may introduce the map } f : \left|\, \begin{array}{cl} X(\Omega)^{n} & \rightarrow \mathbb{R} \\ \left(x_{1}, \ldots, x_{n}\right) & \mapsto \begin{cases} 1 & \text{if } na \leqslant \sum_{i=1}^{n} x_{i} \leqslant nb \\ 0 & \text{otherwise} \end{cases} \end{array} \right.$$ b) Using questions I.B.2, II.B.2c and a) above, finally show that $\eta_{a} = \gamma_{a}$.
In this question you may use the results from II.B.2d. a) Let $\alpha$ be in $]0, 1/2[$. For $n$ in $\mathbb{N}^{*}$, we set $$A_{n} = \left\{k \in \{0, \ldots, n\}, \left|k - \frac{n}{2}\right| \geqslant \alpha n\right\}, \quad U_{n} = \sum_{k \in A_{n}} \binom{n}{k}$$ Determine the limit of the sequence $\left(U_{n}^{1/n}\right)_{n \geqslant 1}$. b) Let $\lambda$ be in $\mathbb{R}^{+*}$, $\alpha$ be in $]\lambda, +\infty[$. For $n$ in $\mathbb{N}^{*}$, we set $$T_{n} = \sum_{\substack{k \in \mathbb{N} \\ k \geqslant \alpha n}} \frac{n^{k} \lambda^{k}}{k!}$$ Determine the limit of the sequence $\left(T_{n}^{1/n}\right)_{n \geqslant 1}$.
In subsection II.C, we consider $\varepsilon$ a strictly positive real, $X$ a discrete real random variable taking values in $\left\{x_{p}, p \in \mathbb{N}\right\}$, and $\left(X_{k}\right)_{k \in \mathbb{N}^{*}}$ a sequence of random variables that are mutually independent and have the same distribution as $X$. For every strictly positive integer $n$, we define the random variable $S_{n}$ by $S_{n}=\sum_{k=1}^{n} X_{k}$. We assume that the random variable $X$ admits an exponential moment of order $\alpha$ where $\alpha$ is a strictly positive real. The function $\Psi: t \mapsto \mathbb{E}\left(\mathrm{e}^{tX}\right)$ is defined on $[-\alpha, \alpha]$, and $f_{\varepsilon}(t) = \mathrm{e}^{-(m+\varepsilon)t}\Psi(t)$. a) Let $t$ be a real belonging to the interval $]0, \alpha]$ and let $n$ belong to $\mathbb{N}^{*}$. Show that $\mathbb{P}\left(\frac{S_{n}}{n} \geqslant m+\varepsilon\right)=\mathbb{P}\left(\mathrm{e}^{t S_{n}} \geqslant\left(\mathrm{e}^{t(m+\varepsilon)}\right)^{n}\right)$, then that $\mathbb{P}\left(\frac{S_{n}}{n} \geqslant m+\varepsilon\right) \leqslant\left(f_{\varepsilon}(t)\right)^{n}$. b) Deduce that there exists a real $r$ belonging to the interval $]0,1[$ such that $\forall n \in \mathbb{N}^{*}, \mathbb{P}\left(\frac{S_{n}}{n} \geqslant m+\varepsilon\right) \leqslant r^{n}$.
In subsection II.C, we consider $\varepsilon$ a strictly positive real, $X$ a discrete real random variable taking values in $\left\{x_{p}, p \in \mathbb{N}\right\}$, and $\left(X_{k}\right)_{k \in \mathbb{N}^{*}}$ a sequence of random variables that are mutually independent and have the same distribution as $X$. For every strictly positive integer $n$, we define the random variable $S_{n}$ by $S_{n}=\sum_{k=1}^{n} X_{k}$. We assume that the random variable $X$ admits an exponential moment of order $\alpha$ where $\alpha$ is a strictly positive real, and $m = \mathbb{E}(X)$. Show that the sequence defined by: $\forall n \in \mathbb{N}^{*}, \mathbb{P}\left(\left|\frac{S_{n}}{n}-m\right| \geqslant \varepsilon\right)$ is bounded above by a sequence with limit zero and whose convergence rate is geometric. Compare this result to the upper bound obtained with the weak law of large numbers.
In subsection II.D, we assume that there exists a strictly positive real number $c$ such that the discrete real random variable $X$ satisfies $\mathbb{E}(X)=0$ and $\forall \omega \in \Omega,|X(\omega)| \leqslant c$. For every strictly positive integer $n$, $S_{n}=\sum_{k=1}^{n} X_{k}$ where $\left(X_{k}\right)$ are mutually independent with the same distribution as $X$. Show that $\forall n \in \mathbb{N}^{*}, \mathbb{P}\left(\left|\frac{S_{n}}{n}\right| \geqslant \varepsilon\right) \leqslant 2 \exp\left(-n \frac{\varepsilon^{2}}{2c^{2}}\right)$.
We fix $p, q \in [0,1]$ and $(X_i)_{1 \leq i \leq n}$ a family of $n$ random variables taking values in $\{0,1\}$ mutually independent Bernoulli random variables with parameter $p$. We set $S_n = \sum_{i=1}^n X_i$. We assume in this question that $p < q$. a. Justify that $$\mathbb{P}\left(\left|\frac{S_n}{n} - q\right| \leq \left|\frac{S_n}{n} - p\right|\right) = \mathbb{P}\left(S_n \geq \frac{p+q}{2}n\right)$$ b. Let $X$ be a Bernoulli random variable with parameter $p$. For $u > 0$, calculate $\mathbb{E}\left(e^{uX}\right)$. c. Show that for all $u > 0$, $$\mathbb{P}\left(S_n \geq \frac{p+q}{2}n\right) \leq e^{-n\left(\frac{p+q}{2}u - \ln\left(1-p+pe^u\right)\right)}$$ Hint. One may assume that if $(Z_i)_{1 \leq i \leq n}$ are $n$ mutually independent random variables taking a finite number of values, then $\mathbb{E}\left(\prod_{i=1}^n Z_i\right) = \prod_{i=1}^n \mathbb{E}\left(Z_i\right)$. d. Show that $\mathbb{P}\left(S_n \geq \frac{p+q}{2}n\right) \leq e^{-n\frac{(p-q)^2}{2}}$.
Prove Theorem 3: We fix $p, q \in [0,1]$. Let $n \geq 1$ be an integer and let $S_n$ be a sum of $n$ random variables taking values in $\{0,1\}$ mutually independent Bernoulli random variables with parameter $p$. Then $$\mathbb{P}\left(\left|\frac{S_n}{n} - q\right| \leq \left|\frac{S_n}{n} - p\right|\right) \leq e^{-n\frac{(p-q)^2}{2}}.$$
Let $n \geqslant 1$ be a natural integer, and let $(X_1, \ldots, X_n)$ be discrete real random variables that are mutually independent such that, for all $k \in \{1, \ldots, n\}$, $$P[X_k = 1] = P[X_k = -1] = \frac{1}{2}$$ We define $$S_n = \frac{1}{n} \sum_{k=1}^{n} X_k$$ as well as, for all $\lambda \in \mathbb{R}$, $$\psi(\lambda) = \log\left(\frac{1}{2}e^{\lambda} + \frac{1}{2}e^{-\lambda}\right)$$ For each $\lambda \geqslant 0$, we set $$m(\lambda) = \frac{E[X_1 \exp(\lambda X_1)]}{E[\exp(\lambda X_1)]}$$ as well as $$D_n(\lambda) = \exp(\lambda n S_n - n \psi(\lambda))$$ For all $n \geqslant 1$, $\lambda \geqslant 0$ and $\varepsilon > 0$, we denote by $I_n(\lambda, \varepsilon)$ the random variable defined by $$I_n(\lambda, \varepsilon) = \begin{cases} 1 & \text{if } |S_n - m(\lambda)| \leqslant \varepsilon \\ 0 & \text{otherwise.} \end{cases}$$ (a) Deduce, for each $\lambda \geqslant 0$ and $\varepsilon > 0$, the existence of a sequence $(u_n(\varepsilon))_{n \geqslant 1}$ that tends to 0 as $n$ tends to infinity and such that $$\frac{1}{n} \log P[S_n \geqslant m(\lambda) - \varepsilon] \geqslant \psi(\lambda) - \lambda m(\lambda) - \lambda \varepsilon + u_n(\varepsilon)$$ (b) Conclude that for all $t \in [0,1[$, $$\lim_{n \rightarrow \infty} \frac{1}{n} \log P[S_n \geqslant t] = \inf_{\lambda \geqslant 0} (\psi(\lambda) - \lambda t).$$ (c) Is the preceding formula still valid for $t = 1$?
Let $n \geqslant 1$ be a natural integer, and let $\left( X _ { 1 } , \ldots , X _ { n } \right)$ be mutually independent discrete real random variables such that, for all $k \in \{ 1 , \ldots , n \}$, $$P \left[ X _ { k } = 1 \right] = P \left[ X _ { k } = - 1 \right] = \frac { 1 } { 2 }$$ We define $$S _ { n } = \frac { 1 } { n } \sum _ { k = 1 } ^ { n } X _ { k }$$ as well as, for all $\lambda \in \mathbb { R }$, $$\psi ( \lambda ) = \log \left( \frac { 1 } { 2 } e ^ { \lambda } + \frac { 1 } { 2 } e ^ { - \lambda } \right)$$ For each $\lambda \geqslant 0$, we set $$m ( \lambda ) = \frac { E \left[ X _ { 1 } \exp \left( \lambda X _ { 1 } \right) \right] } { E \left[ \exp \left( \lambda X _ { 1 } \right) \right] }$$ as well as $$D _ { n } ( \lambda ) = \exp \left( \lambda n S _ { n } - n \psi ( \lambda ) \right)$$ For all $n \geqslant 1 , \lambda \geqslant 0$ and $\varepsilon > 0$, we denote by $I _ { n } ( \lambda , \varepsilon )$ the random variable defined by $$I _ { n } ( \lambda , \varepsilon ) = \begin{cases} 1 & \text { if } \left| S _ { n } - m ( \lambda ) \right| \leqslant \varepsilon \\ 0 & \text { otherwise } \end{cases}$$ (a) Deduce, for each $\lambda \geqslant 0$ and $\varepsilon > 0$, the existence of a sequence $\left( u _ { n } ( \varepsilon ) \right) _ { n \geqslant 1 }$ that tends to 0 as $n$ tends to infinity and such that $$\frac { 1 } { n } \log P \left[ S _ { n } \geqslant m ( \lambda ) - \varepsilon \right] \geqslant \psi ( \lambda ) - \lambda m ( \lambda ) - \lambda \varepsilon + u _ { n } ( \varepsilon )$$ (b) Conclude that for all $t \in [ 0,1 [$, $$\lim _ { n \rightarrow \infty } \frac { 1 } { n } \log P \left[ S _ { n } \geqslant t \right] = \inf _ { \lambda \geqslant 0 } ( \psi ( \lambda ) - \lambda t )$$ (c) Is the preceding formula still valid for $t = 1$ ?