lecture 22

This commit is contained in:
Andreaierardi 2020-05-26 10:27:52 +02:00
parent 687cff7f2f
commit 44250ae388
7 changed files with 595 additions and 0 deletions

@ -0,0 +1,12 @@
\relax
\@nameuse{bbl@beforestart}
\babel@aux{english}{}
\@writefile{toc}{\contentsline {chapter}{\numberline {1}Lecture 22 - 26-05-2020}{1}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {section}{\numberline {1.1}Continous of Pegasos}{1}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.1}{\ignorespaces }}{1}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {1.2}Boosting and ensemble predictors }{2}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.2.1}Bagging}{4}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.2.2}Random Forest}{4}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.2}{\ignorespaces }}{4}\protected@file@percent }

@ -0,0 +1,402 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 2.9.7300 64-bit) (preloaded format=pdflatex 2020.4.13) 26 MAY 2020 10:27
entering extended mode
**./lecture22.tex
(lecture22.tex
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-03-06>
("C:\Program Files\MiKTeX 2.9\tex/latex/subfiles\subfiles.cls"
Document Class: subfiles 2020/02/14 v1.6 Multi-file projects (class)
Preamble taken from file `../main.tex'
("C:\Program Files\MiKTeX 2.9\tex/latex/tools\verbatim.sty"
Package: verbatim 2019/11/10 v1.5r LaTeX2e package for verbatim enhancements
\every@verbatim=\toks15
\verbatim@line=\toks16
\verbatim@in@stream=\read2
)
("C:\Program Files\MiKTeX 2.9\tex/latex/import\import.sty"
Package: import 2020/04/01 v 6.2
) (../main.tex
("C:\Program Files\MiKTeX 2.9\tex/latex/base\report.cls"
Document Class: report 2019/12/20 v1.4l Standard LaTeX document class
("C:\Program Files\MiKTeX 2.9\tex/latex/base\size12.clo"
File: size12.clo 2019/12/20 v1.4l Standard LaTeX file (size option)
)
\c@part=\count167
\c@chapter=\count168
\c@section=\count169
\c@subsection=\count170
\c@subsubsection=\count171
\c@paragraph=\count172
\c@subparagraph=\count173
\c@figure=\count174
\c@table=\count175
\abovecaptionskip=\skip47
\belowcaptionskip=\skip48
\bibindent=\dimen134
)
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsmath.sty"
Package: amsmath 2020/01/20 v2.17e AMS math features
\@mathmargin=\skip49
For additional information on amsmath, use the `?' option.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amstext.sty"
Package: amstext 2000/06/29 v2.01 AMS text
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsgen.sty"
File: amsgen.sty 1999/11/30 v2.0 generic functions
\@emptytoks=\toks17
\ex@=\dimen135
))
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsbsy.sty"
Package: amsbsy 1999/11/29 v1.2d Bold Symbols
\pmbraise@=\dimen136
)
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsopn.sty"
Package: amsopn 2016/03/08 v2.02 operator names
)
\inf@bad=\count176
LaTeX Info: Redefining \frac on input line 227.
\uproot@=\count177
\leftroot@=\count178
LaTeX Info: Redefining \overline on input line 389.
\classnum@=\count179
\DOTSCASE@=\count180
LaTeX Info: Redefining \ldots on input line 486.
LaTeX Info: Redefining \dots on input line 489.
LaTeX Info: Redefining \cdots on input line 610.
\Mathstrutbox@=\box45
\strutbox@=\box46
\big@size=\dimen137
LaTeX Font Info: Redeclaring font encoding OML on input line 733.
LaTeX Font Info: Redeclaring font encoding OMS on input line 734.
\macc@depth=\count181
\c@MaxMatrixCols=\count182
\dotsspace@=\muskip16
\c@parentequation=\count183
\dspbrk@lvl=\count184
\tag@help=\toks18
\row@=\count185
\column@=\count186
\maxfields@=\count187
\andhelp@=\toks19
\eqnshift@=\dimen138
\alignsep@=\dimen139
\tagshift@=\dimen140
\tagwidth@=\dimen141
\totwidth@=\dimen142
\lineht@=\dimen143
\@envbody=\toks20
\multlinegap=\skip50
\multlinetaggap=\skip51
\mathdisplay@stack=\toks21
LaTeX Info: Redefining \[ on input line 2859.
LaTeX Info: Redefining \] on input line 2860.
)
("C:\Program Files\MiKTeX 2.9\tex/latex/systeme\systeme.sty"
("C:\Program Files\MiKTeX 2.9\tex/latex/xstring\xstring.sty"
("C:\Program Files\MiKTeX 2.9\tex/generic/xstring\xstring.tex"
\integerpart=\count188
\decimalpart=\count189
)
Package: xstring 2019/02/06 v1.83 String manipulations (CT)
)
("C:\Program Files\MiKTeX 2.9\tex/generic/systeme\systeme.tex"
\SYS_systemecode=\toks22
\SYS_systempreamble=\toks23
\SYSeqnum=\count190
)
Package: systeme 2019/01/13 v0.32 Mise en forme de systemes d'equations (CT)
)
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\amssymb.sty"
Package: amssymb 2013/01/14 v3.01 AMS font symbols
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\amsfonts.sty"
Package: amsfonts 2013/01/14 v3.01 Basic AMSFonts support
\symAMSa=\mathgroup4
\symAMSb=\mathgroup5
LaTeX Font Info: Redeclaring math symbol \hbar on input line 98.
LaTeX Font Info: Overwriting math alphabet `\mathfrak' in version `bold'
(Font) U/euf/m/n --> U/euf/b/n on input line 106.
))
("C:\Program Files\MiKTeX 2.9\tex/latex/subfiles\subfiles.sty"
Package: subfiles 2020/02/14 v1.6 Multi-file projects (package)
)
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\babel.sty"
Package: babel 2020/02/28 3.41 The Babel package
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\switch.def"
File: switch.def 2020/02/28 3.41 Babel switching mechanism
)
*************************************
* Local config file bblopts.cfg used
*
("C:\Program Files\MiKTeX 2.9\tex/latex/arabi\bblopts.cfg"
File: bblopts.cfg 2005/09/08 v0.1 add Arabic and Farsi to "declared" options of
babel
)
("C:\Program Files\MiKTeX 2.9\tex/latex/babel-english\english.ldf"
Language: english 2017/06/06 v3.3r English support from the babel system
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\babel.def"
File: babel.def 2020/02/28 3.41 Babel common definitions
\babel@savecnt=\count191
\U@D=\dimen144
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\txtbabel.def")
\bbl@readstream=\read3
\bbl@dirlevel=\count192
)
Package babel Info: \l@canadian = using hyphenrules for english
(babel) (\language0) on input line 102.
Package babel Info: \l@australian = using hyphenrules for ukenglish
(babel) (\language72) on input line 105.
Package babel Info: \l@newzealand = using hyphenrules for ukenglish
(babel) (\language72) on input line 108.
))
("C:\Program Files\MiKTeX 2.9\tex/latex/xcolor\xcolor.sty"
Package: xcolor 2016/05/11 v2.12 LaTeX color extensions (UK)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-cfg\color.cfg"
File: color.cfg 2016/01/02 v1.6 sample color configuration
)
Package xcolor Info: Driver file: pdftex.def on input line 225.
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-def\pdftex.def"
File: pdftex.def 2018/01/08 v1.0l Graphics/color driver for pdftex
)
Package xcolor Info: Model `cmy' substituted by `cmy0' on input line 1348.
Package xcolor Info: Model `hsb' substituted by `rgb' on input line 1352.
Package xcolor Info: Model `RGB' extended on input line 1364.
Package xcolor Info: Model `HTML' substituted by `rgb' on input line 1366.
Package xcolor Info: Model `Hsb' substituted by `hsb' on input line 1367.
Package xcolor Info: Model `tHsb' substituted by `hsb' on input line 1368.
Package xcolor Info: Model `HSB' substituted by `hsb' on input line 1369.
Package xcolor Info: Model `Gray' substituted by `gray' on input line 1370.
Package xcolor Info: Model `wave' substituted by `hsb' on input line 1371.
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\dvipsnam.def"
File: dvipsnam.def 2016/06/17 v3.0m Driver-dependent file (DPC,SPQR)
))
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\graphicx.sty"
Package: graphicx 2019/11/30 v1.2a Enhanced LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\keyval.sty"
Package: keyval 2014/10/28 v1.15 key=value parser (DPC)
\KV@toks@=\toks24
)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\graphics.sty"
Package: graphics 2019/11/30 v1.4a Standard LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\trig.sty"
Package: trig 2016/01/03 v1.10 sin cos tan (DPC)
)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-cfg\graphics.cfg"
File: graphics.cfg 2016/06/04 v1.11 sample graphics configuration
)
Package graphics Info: Driver file: pdftex.def on input line 105.
)
\Gin@req@height=\dimen145
\Gin@req@width=\dimen146
)
("C:\Program Files\MiKTeX 2.9\tex/latex/sectsty\sectsty.sty"
Package: sectsty 2002/02/25 v2.0.2 Commands to change all sectional heading sty
les
LaTeX Warning: Command \underbar has changed.
Check if current package is valid.
LaTeX Warning: Command \underline has changed.
Check if current package is valid.
) ("C:\Program Files\MiKTeX 2.9\tex/latex/framed\framed.sty"
Package: framed 2011/10/22 v 0.96: framed or shaded text with page breaks
\OuterFrameSep=\skip52
\fb@frw=\dimen147
\fb@frh=\dimen148
\FrameRule=\dimen149
\FrameSep=\dimen150
)
("C:\Program Files\MiKTeX 2.9\tex/latex/titlesec\titlesec.sty"
Package: titlesec 2019/10/16 v2.13 Sectioning titles
\ttl@box=\box47
\beforetitleunit=\skip53
\aftertitleunit=\skip54
\ttl@plus=\dimen151
\ttl@minus=\dimen152
\ttl@toksa=\toks25
\titlewidth=\dimen153
\titlewidthlast=\dimen154
\titlewidthfirst=\dimen155
)
("C:\Program Files\MiKTeX 2.9\tex/latex/base\fontenc.sty"
Package: fontenc 2020/02/11 v2.0o Standard LaTeX package
)))
("C:\Program Files\MiKTeX 2.9\tex/latex/l3backend\l3backend-pdfmode.def"
File: l3backend-pdfmode.def 2020-03-12 L3 backend support: PDF mode
\l__kernel_color_stack_int=\count193
\l__pdf_internal_box=\box48
)
(lecture22.aux)
\openout1 = `lecture22.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
("C:\Program Files\MiKTeX 2.9\tex/context/base/mkii\supp-pdf.mkii"
[Loading MPS to PDF converter (version 2006.09.02).]
\scratchcounter=\count194
\scratchdimen=\dimen156
\scratchbox=\box49
\nofMPsegments=\count195
\nofMParguments=\count196
\everyMPshowfont=\toks26
\MPscratchCnt=\count197
\MPscratchDim=\dimen157
\MPnumerator=\count198
\makeMPintoPDFobject=\count199
\everyMPtoPDFconversion=\toks27
) ("C:\Program Files\MiKTeX 2.9\tex/latex/epstopdf-pkg\epstopdf-base.sty"
Package: epstopdf-base 2020-01-24 v2.11 Base part for package epstopdf
("C:\Program Files\MiKTeX 2.9\tex/generic/infwarerr\infwarerr.sty"
Package: infwarerr 2019/12/03 v1.5 Providing info/warning/error messages (HO)
)
("C:\Program Files\MiKTeX 2.9\tex/latex/grfext\grfext.sty"
Package: grfext 2019/12/03 v1.3 Manage graphics extensions (HO)
("C:\Program Files\MiKTeX 2.9\tex/generic/kvdefinekeys\kvdefinekeys.sty"
Package: kvdefinekeys 2019-12-19 v1.6 Define keys (HO)
))
("C:\Program Files\MiKTeX 2.9\tex/latex/kvoptions\kvoptions.sty"
Package: kvoptions 2019/11/29 v3.13 Key value format for package options (HO)
("C:\Program Files\MiKTeX 2.9\tex/generic/ltxcmds\ltxcmds.sty"
Package: ltxcmds 2019/12/15 v1.24 LaTeX kernel commands for general use (HO)
)
("C:\Program Files\MiKTeX 2.9\tex/generic/kvsetkeys\kvsetkeys.sty"
Package: kvsetkeys 2019/12/15 v1.18 Key value parser (HO)
))
("C:\Program Files\MiKTeX 2.9\tex/latex/pdftexcmds\pdftexcmds.sty"
Package: pdftexcmds 2019/11/24 v0.31 Utility functions of pdfTeX for LuaTeX (HO
)
("C:\Program Files\MiKTeX 2.9\tex/generic/iftex\iftex.sty"
Package: iftex 2020/03/06 v1.0d TeX engine tests
)
Package pdftexcmds Info: \pdf@primitive is available.
Package pdftexcmds Info: \pdf@ifprimitive is available.
Package pdftexcmds Info: \pdfdraftmode found.
)
Package epstopdf-base Info: Redefining graphics rule for `.eps' on input line 4
85.
Package grfext Info: Graphics extension search list:
(grfext) [.pdf,.png,.jpg,.mps,.jpeg,.jbig2,.jb2,.PDF,.PNG,.JPG,.JPE
G,.JBIG2,.JB2,.eps]
(grfext) \AppendGraphicsExtensions on input line 504.
)
Chapter 1.
LaTeX Font Info: Trying to load font information for U+msa on input line 7.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\umsa.fd"
File: umsa.fd 2013/01/14 v3.01 AMS symbols A
)
LaTeX Font Info: Trying to load font information for U+msb on input line 7.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\umsb.fd"
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
)
<../img/lez22-img1.JPG, id=1, 164.86594pt x 105.39375pt>
File: ../img/lez22-img1.JPG Graphic file (type jpg)
<use ../img/lez22-img1.JPG>
Package pdftex.def Info: ../img/lez22-img1.JPG used on input line 16.
(pdftex.def) Requested size: 117.00119pt x 74.79791pt.
[1
{C:/Users/AndreDany/AppData/Local/MiKTeX/2.9/pdftex/config/pdftex.map} <../img/
lez22-img1.JPG>]
Underfull \hbox (badness 10000) in paragraph at lines 58--67
[]
Underfull \hbox (badness 10000) in paragraph at lines 69--70
[]
Underfull \hbox (badness 10000) in paragraph at lines 100--105
[]
[2]
Underfull \hbox (badness 10000) in paragraph at lines 111--116
[]
[3]
Underfull \hbox (badness 10000) in paragraph at lines 143--148
[]
Underfull \hbox (badness 10000) in paragraph at lines 158--161
[]
Underfull \hbox (badness 10000) in paragraph at lines 165--166
[]
<../img/lez22-img2.JPG, id=27, 216.05719pt x 155.07938pt>
File: ../img/lez22-img2.JPG Graphic file (type jpg)
<use ../img/lez22-img2.JPG>
Package pdftex.def Info: ../img/lez22-img2.JPG used on input line 169.
(pdftex.def) Requested size: 117.00119pt x 83.98059pt.
[4 <../img/lez22-img2.JPG>] [5] (lecture22.aux) )
Here is how much of TeX's memory you used:
5119 strings out of 480934
69160 string characters out of 2909670
333085 words of memory out of 3000000
20867 multiletter control sequences out of 15000+200000
546954 words of font info for 58 fonts, out of 3000000 for 9000
1141 hyphenation exceptions out of 8191
42i,7n,50p,333b,236s stack positions out of 5000i,500n,10000p,200000b,50000s
<C:\Users\AndreDany\AppData\L
ocal\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecti1200.pk> <C:\Users\Andre
Dany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1440.pk> <
C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\
tcrm1200.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknap
pen/ec/dpi600\ecbx1200.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/p
k/ljfour/jknappen/ec/dpi600\ecrm1200.pk> <C:\Users\AndreDany\AppData\Local\MiKT
eX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1728.pk> <C:\Users\AndreDany\AppD
ata\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx2488.pk><C:/Program
Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmex10.pfb><C:/Program Files/M
iKTeX 2.9/fonts/type1/public/amsfonts/cm/cmmi12.pfb><C:/Program Files/MiKTeX 2.
9/fonts/type1/public/amsfonts/cm/cmmi6.pfb><C:/Program Files/MiKTeX 2.9/fonts/t
ype1/public/amsfonts/cm/cmmi8.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/publ
ic/amsfonts/cm/cmr12.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfon
ts/cm/cmr6.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr8
.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmsy10.pfb><C:
/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmsy8.pfb><C:/Program
Files/MiKTeX 2.9/fonts/type1/public/amsfonts/symbols/msbm10.pfb>
Output written on lecture22.pdf (5 pages, 183434 bytes).
PDF statistics:
215 PDF objects out of 1000 (max. 8388607)
0 named destinations out of 1000 (max. 500000)
11 words of extra memory for PDF output out of 10000 (max. 10000000)

@ -0,0 +1,181 @@
\documentclass[../main.tex]{subfiles}
\begin{document}
\chapter{Lecture 22 - 26-05-2020}
\section{Continuation of Pegasos}
$$
w_s = \arg\min_{w} \left( \hat{\ell}_s(w) + \frac{\lambda}{2} \|w\|^2 \right)
\qquad \text{is } \frac{(2 \, L )^2}{\lambda \, m}\text{-stable}
$$
$$
\ell(w, (x,y)) = \left[ 1 - y w^T x \right]_+
$$
\begin{figure}[h]
\centering
\includegraphics[width=0.3\linewidth]{../img/lez22-img1.JPG}
\caption{}
%\label{fig:}
\end{figure}
$$
\nabla \ell(w, (x,y)) = - y \, x \, I \{ y \, w^T x \leq 1 \} \qquad \| \nabla \ell(w,z) \| \leq \|x \| \leq X
$$
$$
\ell(w,z) - \ell(w',z) \leq \nabla \ell(w,z)^T (w-w') \leq \red{ \| \nabla \ell(w,z) \| } \, \| w-w'\|
$$
where the \bred{red} term is at most $X$, so $\ell$ is Lipschitz with constant $L = X$.
$$
\hat{\ell}_s(w_s) \leq \hat{\ell}_s(w_s) + \frac{\lambda}{2} \|w_s \|^2 \leq \hat{\ell}_s(u) + \frac{\lambda}{2} \| u \|^2 \qquad \forall u \in \barra{R}^d
$$
$$
E [ \ell_D(w_s)] \leq E[ \hat{\ell}_s(w_s)] + \frac{4 \, X^2}{\lambda \, m} \leq E \left[ \hat{\ell}_s(u) + \frac{\lambda}{2} \|u\|^2 \right] + \frac{4 \, X^2}{\lambda \, m} =
$$
$$
= \ell_D(u) + \frac{\lambda}{2} \| u \|^2 + \frac{4 \, X^2}{\lambda \, m}
$$
$$
E [ \ell_D(w_s) ] \leq \min_{u \in \barra{R}^d} \left( \ell_D(u) + \frac{\lambda}{2}
\| u\|^2 \right) +\frac{4 \, X^2}{\lambda \, m} $$
$$
\ell_D^{0-1}(w_s) \leq \ell_D(w_s)
$$
$$
\text{0-1 loss} \ \leq \ \text{hinge loss}
$$
$$
E [ \ell_D^{0-1}(w_s) ] \leq E [ \ell_D(w_s) ] \leq \ell_D(u) + \frac{\lambda}{2} \|u \|^2 + \frac{4 \, X^2}{\lambda \, m } \qquad \lambda \approx \frac{1}{\sqrt{m}}
$$
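To see where $\lambda \approx \frac{1}{\sqrt{m}}$ comes from, one can minimise the two $\lambda$-dependent terms of the bound for a fixed $u \neq 0$. This is a sketch of the usual balancing argument and is not spelled out in the lecture:
$$
\frac{d}{d\lambda} \left( \frac{\lambda}{2} \|u\|^2 + \frac{4 \, X^2}{\lambda \, m} \right) = \frac{\|u\|^2}{2} - \frac{4 \, X^2}{\lambda^2 \, m} = 0 \quad \Rightarrow \quad \lambda = \frac{2 \sqrt{2} \, X}{\|u\| \sqrt{m}}
$$
$$
\frac{\lambda}{2} \|u\|^2 + \frac{4 \, X^2}{\lambda \, m} = \frac{\sqrt{2} \, X \|u\|}{\sqrt{m}} + \frac{\sqrt{2} \, X \|u\|}{\sqrt{m}} = \frac{2 \sqrt{2} \, X \|u\|}{\sqrt{m}}
$$
Since $\|u\|$ is not known in advance, choosing $\lambda$ proportional to $\frac{1}{\sqrt{m}}$ keeps the same $\frac{1}{\sqrt{m}}$ rate up to constants.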
\\
We can run SVM in a Kernel space $H_k$:
$$
g_s = \arg\min_{g \in H_k} \left( \hat{\ell}_s(g) + \frac{\lambda}{2} \|g\|_k^2 \right)
$$
$$
g = \sum_{i = 1}^N \alpha_i \, k (x_i, \cdot) \qquad h_t(g) = [ 1-y_t g(x_t) ]_+
$$
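To make the kernel objective concrete, here is a sketch for the Gaussian kernel. The bandwidth symbol $\sigma > 0$ and the parametrisation $k(x,x') = \exp\left(-\frac{\|x-x'\|^2}{2 \sigma^2}\right)$ are assumptions of this sketch, since the notes do not fix them. By the reproducing property $\langle k(x_i,\cdot), k(x_j,\cdot) \rangle_k = k(x_i,x_j)$, both the prediction and the norm depend only on the coefficients $\alpha_i$:
$$
g(x_t) = \sum_{i=1}^N \alpha_i \, k(x_i, x_t) \qquad \|g\|_k^2 = \sum_{i=1}^N \sum_{j=1}^N \alpha_i \, \alpha_j \, k(x_i, x_j)
$$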
\\
If $H_k$ is the kernel space induced by the Gaussian kernel, then elements of $H_k$ can approximate any continuous function $\Rightarrow$ \bred{consistency}
\\
SVM with Gaussian kernel is consistent (with respect to the $0$-$1$ loss) if $\lambda = \lambda_m $ satisfies:
\\
1) $\lambda_m = o(1)$\\
2) $\lambda_m = \omega (m^{-\frac{1}{2}}) $
\\
$$
\lambda_m \approx \frac{\ln m}{\sqrt[]{m}} \quad \surd
$$\\
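As a quick check of this choice, reading the two requirements on $\lambda_m$ as $\lambda_m \rightarrow 0$ and $\lambda_m \sqrt{m} \rightarrow \infty$:
$$
\lim_{m \rightarrow \infty} \frac{\ln m}{\sqrt{m}} = 0 \qquad \lim_{m \rightarrow \infty} \frac{\ln m}{\sqrt{m}} \cdot \sqrt{m} = \lim_{m \rightarrow \infty} \ln m = \infty
$$
By contrast, $\lambda_m = \frac{1}{\sqrt{m}}$ meets the first requirement but not the second, since $\lambda_m \sqrt{m} = 1$ for every $m$.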
\section{Boosting and ensemble predictors }
Examples:
\begin{itemize}
\item Stochastic gradient descent (SGD)
\end{itemize}
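Given a learning algorithm $A$ and a training set $S$, draw subsamples $S_1,...,S_T$ from $S$ and let $h_1, ..., h_T$ be the corresponding predictors.
\\
$h_1 = A(S_1)$ \ is the output of $A$ on $S_1$, and in general $h_i = A(S_i)$.
\\
Assume we are doing binary classification with $0$-$1$ loss.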
\\
$
h_1,...,h_T : X \rightarrow \{-1,1\}
$ \qquad (We go for a majority vote classifier)
\\
$
x \quad h_1(x),...,h_T(x) \in \{-1,1\} \qquad f = \mathrm{sgn} \left( \sum_{t=1}^T h_t \right)
$
\\
Ideal condition: let $z$ be the index of a training example from $S$ drawn uniformly at random. Then
$$
P \left(h_1(x_z) \neq y_z \wedge ... \wedge h_T(x_z) \neq y_z \right) = \prod_{i=1}^T P\left(h_i(x_z) \neq y_z \right)
$$
That is, the error events of the predictors $h_i$ are independent of one another.
\\
Define the training error of the classifier:
$$
\hat{\ell}_s(h_i) = \frac{1}{m} \sum_{t=1}^m I \{h_i(x_t) \neq y_t \} = P \left(h_i(x_z) \neq y_z \right)
$$
We can assume $\hat{\ell}_s(h_i) \leq \frac{1}{2} \quad \forall i = 1,...,T$
\\
(otherwise take $-h_i$ in place of $h_i$)
\\\\
We want to bound the training error of the majority vote $f$:
$$
\hat{\ell}_s(f) = P \left( f(x_z) \neq y_z \right) = P\left( \sum_{i=1}^T I \{h_i(x_z) \neq y_z \} > \frac{T}{2} \right)
$$
that is, $f$ is wrong only if more than half of the $h_i$ are wrong. Let
$$
\hat{\ell}_{ave} = \frac{1}{T} \sum_{i=1}^T \hat{\ell}_s(h_i)
$$
Then
$$
\hat{\ell}_s(f) = P \left( \frac{1}{T} \sum_{i=1}^T I \{ h_i(x_z) \neq y_z \} > \hat{\ell}_{ave} + \left( \frac{1}{2} -\hat{\ell}_{ave}\right) \right)
$$
Define $B_1, ..., B_T$ by $B_i = I \{ h_i (x_z) \neq y_z \}$.
\\
And because of our independence assumption, we know that $B_1,..,B_T$ are independent
\\
$$
E\left[ B_i \right] = \hat{\ell}_s(h_i)
$$
We can apply the Chernoff-Hoeffding bound to $B_1,...,B_T$ even if they don't have the same expectations:
$$
P \left( \frac{1}{T} \sum_{i=1}^T B_i > \hat{\ell}_{ave} + \varepsilon \right) \leq e^{-2 \, \varepsilon^2 \, T} \qquad \varepsilon = \frac{1}{2} - \hat{\ell}_{ave} \geq 0
$$
$$
P(f(x_z) \neq y_z) \leq e^{-2 \, \varepsilon^2 \, T}
\qquad
\gamma_i = \frac{1}{2} - \hat{\ell}_s(h_i) \quad \frac{1}{T} \sum_i \gamma_i = \frac{1}{2} - \hat{\ell}_{ave}
$$
$$
\hat{\ell}_s(f) \leq \exp\left(-2 T \left( \frac{1}{T} \sum_i \gamma_i \right)^2\right)
$$
\\
where $\gamma_i$ is the edge of $h_i$
\\
If $\gamma_i \geq \gamma$ for all $i = 1,...,T$, then the training error of the majority vote satisfies: $$\hat{\ell}_s(f) \leq e^{-2 \, T \, \gamma^2}$$
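\\
For a sense of scale, here is a worked instance of this bound. The values $\gamma = 0.1$, $T = 100$ and $T = 500$ are chosen only for illustration:
$$
T = 100: \quad e^{-2 \cdot 100 \cdot (0.1)^2} = e^{-2} \approx 0.135 \qquad T = 500: \quad e^{-2 \cdot 500 \cdot (0.1)^2} = e^{-10} \approx 4.5 \cdot 10^{-5}
$$
So, under the independence assumption, predictors whose individual training error is as high as $0.4$ still give a majority vote whose training error bound decreases exponentially in $T$.
\\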
How do we get independence of the events $h_i(x_z) \neq y_z$?
\\
We can't guarantee this!
\\
The subsampling of $S$ is attempting to achieve this independence.
\newpage
\subsection{Bagging}
It is a meta algorithm!
\\
$S_i$ is a random subsample of $S$, drawn with replacement, of size $|S_i| = |S|$.
\\ So each subsample has the same size as the initial training set.
\\
$$|S \setminus S_i| \approx \frac{1}{3} |S| \qquad |S_i \cap S | \approx \frac{2}{3} |S|$$
$N = $ \# of unique points in $S_i$ (a point drawn more than once from $S$ is counted once)
\\
$X_t = I \{ (x_t,y_t)$ is drawn in $S_i \}$ \qquad $P(X_t = 0 ) = (1- \frac{1}{m})^m$
$$
E [N] = \sum_{t=1}^m P(X_t=1)
\ = \ \sum_{t=1}^m (1 - (1-\frac{1}{m})^m)
\ = \ m -m (1-\frac{1}{m})^m
$$
Fraction of unique points in $S_i$:
$$ \frac{E[N]}{m} = 1-(1- \frac{1}{m})^m \xrightarrow{m \rightarrow \infty} 1-e^{-1} \approx 0.63$$
So about $\frac{1}{3}$ of the points of $S$ will be missing from $S_i$.
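\\
As a numerical check of how quickly this limit is approached (the sample sizes $m = 10$ and $m = 100$ are picked here purely for illustration):
$$
m = 10: \quad 1 - (0.9)^{10} \approx 0.651 \qquad m = 100: \quad 1 - (0.99)^{100} \approx 0.634 \qquad m \rightarrow \infty: \quad 1 - e^{-1} \approx 0.632
$$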
\\
\subsection{Random Forest}
Independence of errors helps bias.\\ Randomisation of subsampling helps variance.
\begin{itemize}
\item 1) Bagging over Tree classifiers (predictors)
\item 2) Subsample of features\\
\end{itemize}
\begin{figure}[h]
\centering
\includegraphics[width=0.3\linewidth]{../img/lez22-img2.JPG}
\caption{}
%\label{fig:}
\end{figure}
Control the number $H$ of subsampled features and the depth of each tree.
\\Random forest is typically good on many learning tasks.
\\
Boosting is more recent than bagging and builds independent classifiers ``by design''.
$$ \hat{\ell}(f) \leq e^{-2 \, T \gamma^2} \qquad \gamma_i> \gamma$$
$$ \gamma_i = \frac{1}{2}-\hat{\ell}_s(h_i) \quad \textit{edge of $h_i$}
$$
where $ \hat{\ell}_s(h_i)$ is the weighted training error
\end{document}