mirror of
synced 2025-01-07 10:05:56 +01:00
This commit is contained in:
Binary file not shown.
After Width: | Height: | Size: 12 KiB |
Binary file not shown.
After Width: | Height: | Size: 13 KiB |
Binary file not shown.
After Width: | Height: | Size: 9.7 KiB |
Binary file not shown.
After Width: | Height: | Size: 12 KiB |
@ -0,0 +1,12 @@
\@writefile{toc}{\contentsline {chapter}{\numberline {1}Lecture 16 - 05-05-2020}{1}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {section}{\numberline {1.1}Analysis of Perceptron in the non-separable case using OGD framework.}{1}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.1}{\ignorespaces }}{1}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.2}{\ignorespaces Hinge loss}}{2}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.3}{\ignorespaces }}{3}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {1.1.1}Strongly convex loss functions}{5}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.4}{\ignorespaces Example of more type of convex function}}{5}\protected@file@percent }
@ -0,0 +1,470 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 2.9.7300 64-bit) (preloaded format=pdflatex 2020.4.13) 16 MAY 2020 12:53
entering extended mode
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-03-06>
("C:\Program Files\MiKTeX 2.9\tex/latex/subfiles\subfiles.cls"
Document Class: subfiles 2020/02/14 v1.6 Multi-file projects (class)
Preamble taken from file `../main.tex'
("C:\Program Files\MiKTeX 2.9\tex/latex/tools\verbatim.sty"
Package: verbatim 2019/11/10 v1.5r LaTeX2e package for verbatim enhancements
("C:\Program Files\MiKTeX 2.9\tex/latex/import\import.sty"
Package: import 2020/04/01 v 6.2
) (../main.tex
("C:\Program Files\MiKTeX 2.9\tex/latex/base\report.cls"
Document Class: report 2019/12/20 v1.4l Standard LaTeX document class
("C:\Program Files\MiKTeX 2.9\tex/latex/base\size12.clo"
File: size12.clo 2019/12/20 v1.4l Standard LaTeX file (size option)
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsmath.sty"
Package: amsmath 2020/01/20 v2.17e AMS math features
For additional information on amsmath, use the `?' option.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amstext.sty"
Package: amstext 2000/06/29 v2.01 AMS text
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsgen.sty"
File: amsgen.sty 1999/11/30 v2.0 generic functions
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsbsy.sty"
Package: amsbsy 1999/11/29 v1.2d Bold Symbols
("C:\Program Files\MiKTeX 2.9\tex/latex/amsmath\amsopn.sty"
Package: amsopn 2016/03/08 v2.02 operator names
LaTeX Info: Redefining \frac on input line 227.
LaTeX Info: Redefining \overline on input line 389.
LaTeX Info: Redefining \ldots on input line 486.
LaTeX Info: Redefining \dots on input line 489.
LaTeX Info: Redefining \cdots on input line 610.
LaTeX Font Info: Redeclaring font encoding OML on input line 733.
LaTeX Font Info: Redeclaring font encoding OMS on input line 734.
LaTeX Info: Redefining \[ on input line 2859.
LaTeX Info: Redefining \] on input line 2860.
("C:\Program Files\MiKTeX 2.9\tex/latex/systeme\systeme.sty"
("C:\Program Files\MiKTeX 2.9\tex/latex/xstring\xstring.sty"
("C:\Program Files\MiKTeX 2.9\tex/generic/xstring\xstring.tex"
Package: xstring 2019/02/06 v1.83 String manipulations (CT)
("C:\Program Files\MiKTeX 2.9\tex/generic/systeme\systeme.tex"
Package: systeme 2019/01/13 v0.32 Mise en forme de systemes d'equations (CT)
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\amssymb.sty"
Package: amssymb 2013/01/14 v3.01 AMS font symbols
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\amsfonts.sty"
Package: amsfonts 2013/01/14 v3.01 Basic AMSFonts support
LaTeX Font Info: Redeclaring math symbol \hbar on input line 98.
LaTeX Font Info: Overwriting math alphabet `\mathfrak' in version `bold'
(Font) U/euf/m/n --> U/euf/b/n on input line 106.
("C:\Program Files\MiKTeX 2.9\tex/latex/subfiles\subfiles.sty"
Package: subfiles 2020/02/14 v1.6 Multi-file projects (package)
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\babel.sty"
Package: babel 2020/02/28 3.41 The Babel package
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\switch.def"
File: switch.def 2020/02/28 3.41 Babel switching mechanism
* Local config file bblopts.cfg used
("C:\Program Files\MiKTeX 2.9\tex/latex/arabi\bblopts.cfg"
File: bblopts.cfg 2005/09/08 v0.1 add Arabic and Farsi to "declared" options of
("C:\Program Files\MiKTeX 2.9\tex/latex/babel-english\english.ldf"
Language: english 2017/06/06 v3.3r English support from the babel system
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\babel.def"
File: babel.def 2020/02/28 3.41 Babel common definitions
("C:\Program Files\MiKTeX 2.9\tex/generic/babel\txtbabel.def")
Package babel Info: \l@canadian = using hyphenrules for english
(babel) (\language0) on input line 102.
Package babel Info: \l@australian = using hyphenrules for ukenglish
(babel) (\language72) on input line 105.
Package babel Info: \l@newzealand = using hyphenrules for ukenglish
(babel) (\language72) on input line 108.
("C:\Program Files\MiKTeX 2.9\tex/latex/xcolor\xcolor.sty"
Package: xcolor 2016/05/11 v2.12 LaTeX color extensions (UK)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-cfg\color.cfg"
File: color.cfg 2016/01/02 v1.6 sample color configuration
Package xcolor Info: Driver file: pdftex.def on input line 225.
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-def\pdftex.def"
File: pdftex.def 2018/01/08 v1.0l Graphics/color driver for pdftex
Package xcolor Info: Model `cmy' substituted by `cmy0' on input line 1348.
Package xcolor Info: Model `hsb' substituted by `rgb' on input line 1352.
Package xcolor Info: Model `RGB' extended on input line 1364.
Package xcolor Info: Model `HTML' substituted by `rgb' on input line 1366.
Package xcolor Info: Model `Hsb' substituted by `hsb' on input line 1367.
Package xcolor Info: Model `tHsb' substituted by `hsb' on input line 1368.
Package xcolor Info: Model `HSB' substituted by `hsb' on input line 1369.
Package xcolor Info: Model `Gray' substituted by `gray' on input line 1370.
Package xcolor Info: Model `wave' substituted by `hsb' on input line 1371.
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\dvipsnam.def"
File: dvipsnam.def 2016/06/17 v3.0m Driver-dependent file (DPC,SPQR)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\graphicx.sty"
Package: graphicx 2019/11/30 v1.2a Enhanced LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\keyval.sty"
Package: keyval 2014/10/28 v1.15 key=value parser (DPC)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\graphics.sty"
Package: graphics 2019/11/30 v1.4a Standard LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics\trig.sty"
Package: trig 2016/01/03 v1.10 sin cos tan (DPC)
("C:\Program Files\MiKTeX 2.9\tex/latex/graphics-cfg\graphics.cfg"
File: graphics.cfg 2016/06/04 v1.11 sample graphics configuration
Package graphics Info: Driver file: pdftex.def on input line 105.
("C:\Program Files\MiKTeX 2.9\tex/latex/sectsty\sectsty.sty"
Package: sectsty 2002/02/25 v2.0.2 Commands to change all sectional heading sty
LaTeX Warning: Command \underbar has changed.
Check if current package is valid.
LaTeX Warning: Command \underline has changed.
Check if current package is valid.
) ("C:\Program Files\MiKTeX 2.9\tex/latex/framed\framed.sty"
Package: framed 2011/10/22 v 0.96: framed or shaded text with page breaks
("C:\Program Files\MiKTeX 2.9\tex/latex/titlesec\titlesec.sty"
Package: titlesec 2019/10/16 v2.13 Sectioning titles
("C:\Program Files\MiKTeX 2.9\tex/latex/base\fontenc.sty"
Package: fontenc 2020/02/11 v2.0o Standard LaTeX package
("C:\Program Files\MiKTeX 2.9\tex/latex/l3backend\l3backend-pdfmode.def"
File: l3backend-pdfmode.def 2020-03-12 L3 backend support: PDF mode
\openout1 = `lecture16.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 2.
LaTeX Font Info: ... okay on input line 2.
("C:\Program Files\MiKTeX 2.9\tex/context/base/mkii\supp-pdf.mkii"
[Loading MPS to PDF converter (version 2006.09.02).]
) ("C:\Program Files\MiKTeX 2.9\tex/latex/epstopdf-pkg\epstopdf-base.sty"
Package: epstopdf-base 2020-01-24 v2.11 Base part for package epstopdf
("C:\Program Files\MiKTeX 2.9\tex/generic/infwarerr\infwarerr.sty"
Package: infwarerr 2019/12/03 v1.5 Providing info/warning/error messages (HO)
("C:\Program Files\MiKTeX 2.9\tex/latex/grfext\grfext.sty"
Package: grfext 2019/12/03 v1.3 Manage graphics extensions (HO)
("C:\Program Files\MiKTeX 2.9\tex/generic/kvdefinekeys\kvdefinekeys.sty"
Package: kvdefinekeys 2019-12-19 v1.6 Define keys (HO)
("C:\Program Files\MiKTeX 2.9\tex/latex/kvoptions\kvoptions.sty"
Package: kvoptions 2019/11/29 v3.13 Key value format for package options (HO)
("C:\Program Files\MiKTeX 2.9\tex/generic/ltxcmds\ltxcmds.sty"
Package: ltxcmds 2019/12/15 v1.24 LaTeX kernel commands for general use (HO)
("C:\Program Files\MiKTeX 2.9\tex/generic/kvsetkeys\kvsetkeys.sty"
Package: kvsetkeys 2019/12/15 v1.18 Key value parser (HO)
("C:\Program Files\MiKTeX 2.9\tex/latex/pdftexcmds\pdftexcmds.sty"
Package: pdftexcmds 2019/11/24 v0.31 Utility functions of pdfTeX for LuaTeX (HO
("C:\Program Files\MiKTeX 2.9\tex/generic/iftex\iftex.sty"
Package: iftex 2020/03/06 v1.0d TeX engine tests
Package pdftexcmds Info: \pdf@primitive is available.
Package pdftexcmds Info: \pdf@ifprimitive is available.
Package pdftexcmds Info: \pdfdraftmode found.
Package epstopdf-base Info: Redefining graphics rule for `.eps' on input line 4
Package grfext Info: Graphics extension search list:
(grfext) [.pdf,.png,.jpg,.mps,.jpeg,.jbig2,.jb2,.PDF,.PNG,.JPG,.JPE
(grfext) \AppendGraphicsExtensions on input line 504.
Chapter 1.
Overfull \hbox (0.75821pt too wide) in paragraph at lines 6--6
|[]\T1/cmr/bx/n/17.28 Analysis of Per-cep-tron in the non-separable
LaTeX Font Info: Trying to load font information for U+msa on input line 9.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\umsa.fd"
File: umsa.fd 2013/01/14 v3.01 AMS symbols A
LaTeX Font Info: Trying to load font information for U+msb on input line 9.
("C:\Program Files\MiKTeX 2.9\tex/latex/amsfonts\umsb.fd"
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
Underfull \hbox (badness 10000) in paragraph at lines 14--20
Underfull \hbox (badness 10000) in paragraph at lines 14--20
Underfull \hbox (badness 10000) in paragraph at lines 26--30
<../img/lez16-img1.JPG, id=1, 166.37157pt x 103.13531pt>
File: ../img/lez16-img1.JPG Graphic file (type jpg)
<use ../img/lez16-img1.JPG>
Package pdftex.def Info: ../img/lez16-img1.JPG used on input line 36.
(pdftex.def) Requested size: 155.99762pt x 96.71112pt.
Underfull \hbox (badness 10000) in paragraph at lines 32--40
Underfull \hbox (badness 10000) in paragraph at lines 45--50
{C:/Users/AndreDany/AppData/Local/MiKTeX/2.9/pdftex/config/pdftex.map} <../img/
<../img/lez16-img2.JPG, id=16, 177.66376pt x 97.86563pt>
File: ../img/lez16-img2.JPG Graphic file (type jpg)
<use ../img/lez16-img2.JPG>
Package pdftex.def Info: ../img/lez16-img2.JPG used on input line 66.
(pdftex.def) Requested size: 155.99762pt x 85.93384pt.
Underfull \hbox (badness 10000) in paragraph at lines 56--70
Underfull \hbox (badness 10000) in paragraph at lines 56--70
Underfull \hbox (badness 10000) in paragraph at lines 75--78
Overfull \hbox (37.19511pt too wide) detected at line 94
\OMS/cmsy/m/n/12 [] k\OML/cmm/m/it/12 U\OMS/cmsy/m/n/12 k[] \OT1/cmr/m/n/12
+ [][] + [] [] \OML/cmm/m/it/12 I\OMS/cmsy/m/n/12 f\OML/cmm/m/it/12 y[] w[]x[]
\OMS/cmsy/m/n/12 \OT1/cmr/m/n/12 0\OMS/cmsy/m/n/12 g
[2 <../img/lez16-img2.JPG>]
<../img/lez16-img3.JPG, id=21, 118.94438pt x 66.2475pt>
File: ../img/lez16-img3.JPG Graphic file (type jpg)
<use ../img/lez16-img3.JPG>
Package pdftex.def Info: ../img/lez16-img3.JPG used on input line 106.
(pdftex.def) Requested size: 155.99762pt x 86.88896pt.
Underfull \hbox (badness 10000) in paragraph at lines 102--110
Underfull \hbox (badness 10000) in paragraph at lines 122--128
[3 <../img/lez16-img3.JPG>]
Underfull \hbox (badness 10000) in paragraph at lines 144--153
Underfull \hbox (badness 10000) in paragraph at lines 144--153
<../img/lez16-img4.JPG, id=31, 152.82094pt x 99.37125pt>
File: ../img/lez16-img4.JPG Graphic file (type jpg)
<use ../img/lez16-img4.JPG>
Package pdftex.def Info: ../img/lez16-img4.JPG used on input line 166.
(pdftex.def) Requested size: 155.99762pt x 101.44072pt.
Underfull \hbox (badness 10000) in paragraph at lines 178--183
Underfull \hbox (badness 10000) in paragraph at lines 178--183
Underfull \hbox (badness 10000) in paragraph at lines 178--183
[5 <../img/lez16-img4.JPG>]
Underfull \hbox (badness 10000) in paragraph at lines 196--205
[6] (lecture16.aux) )
Here is how much of TeX's memory you used:
5128 strings out of 480934
69365 string characters out of 2909670
334085 words of memory out of 3000000
20875 multiletter control sequences out of 15000+200000
546801 words of font info for 57 fonts, out of 3000000 for 9000
1141 hyphenation exceptions out of 8191
42i,7n,50p,333b,236s stack positions out of 5000i,500n,10000p,200000b,50000s
our/jknappen/ec/dpi600\ecti1200.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.
9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1440.pk> <C:\Users\AndreDany\AppData\L
ocal\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1200.pk> <C:\Users\Andre
Dany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecrm1200.pk> <
ecbx1728.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknap
pen/ec/dpi600\ecbx2488.pk><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfo
nts/cm/cmex10.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/c
mmi12.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmmi6.pfb
><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmmi8.pfb><C:/Prog
ram Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr12.pfb><C:/Program Files
/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr8.pfb><C:/Program Files/MiKTeX 2.
9/fonts/type1/public/amsfonts/cm/cmsy10.pfb><C:/Program Files/MiKTeX 2.9/fonts/
type1/public/amsfonts/cm/cmsy8.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/pub
lic/amsfonts/symbols/msam10.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public
Output written on lecture16.pdf (6 pages, 215549 bytes).
PDF statistics:
254 PDF objects out of 1000 (max. 8388607)
0 named destinations out of 1000 (max. 500000)
21 words of extra memory for PDF output out of 10000 (max. 10000000)
Binary file not shown.
Binary file not shown.
@ -3,5 +3,203 @@
\chapter{Lecture 16 - 05-05-2020}
\section{Analysis of Perceptron in the non-separable case using OGD framework.}
We are finishing up the part on online learning and gradient descent. \\
A concrete example for the parameter $G$:
$$
G = \max_t \| \nabla \ell_t(w_t) \| \qquad \ell_t(w) = (w^T \, x_t - y_t)^2 \qquad \|x_t\| \leq X, \quad |y_t| \leq U \, X
$$
$$
\|w\| \leq U, \qquad |w^T \, x_t| \leq \| w \| \, \| x_t \| \leq U \, X
$$
where $\| w \|$ is bounded by $U$ and $\|x_t\|$ is bounded by $X$.
\\
Now we want to bound the gradient. Since $\nabla \ell_t(w) = 2 \, (w^T \, x_t - y_t) \, x_t$,
$$
\| \nabla \ell_t(w) \| = 2 \, | w^T \, x_t - y_t | \, \|x_t \| \leq 4 \, U \, X \, \| x_t \| \leq 4 \, U \, X^2
$$
where both $|w^T \, x_t|$ and $|y_t|$ are bounded by $U \, X$. Hence $G \leq 4 \, U \, X^2$ and
$$
R_T(u) \leq U \, G \ \sqrt{8 \, T} \leq 8 \, (U X)^2 \ \sqrt{2 \, T}
$$
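As a quick check of the constant, here is a worked sketch that uses only the bounds $G \leq 4 \, U \, X^2$ and $R_T(u) \leq U \, G \, \sqrt{8 \, T}$ stated above:
$$
U \, G \ \sqrt{8 \, T} \ \leq \ U \, (4 \, U \, X^2) \, \sqrt{8 \, T} \ = \ 4 \, (U X)^2 \cdot 2 \, \sqrt{2 \, T} \ = \ 8 \, (U X)^2 \ \sqrt{2 \, T}
$$
For instance, assuming $U = X = 1$ and $T = 200$ (values chosen only for illustration), the bound is $8 \, \sqrt{400} = 160$, so the average regret $R_T(u)/T$ is at most $0.8$.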
How about OGD for classification?\\
The problem is that the zero-one loss is not convex (it is not even continuous).
$$
I \{ y_t \, w^T \, x_t \leq 0 \}
$$
This is the zero-one loss for linear classification.
$$
w \leftarrow w - \eta \, \nabla \ell_t(w) \qquad \bred{OGD}
$$
$$
w \leftarrow w + y_t \, x_t \, I \{ y_t \, w^T \, x_t \leq 0 \} \qquad \bred{Perceptron}
$$
We want to make these two updates coincide, using a loss that is convex and that also gives us a bound on the zero-one loss. So there are a few requirements to meet at once.
How can we make the two updates equal?
$$
\ell_t(w) = \left[ -y_t \, w^T \, x_t \right]_+ \qquad \left[ z \right]_+ = \max \{0,z\}
$$
If we take the gradient of this with respect to $w$:
$$
\nabla \ell_t(w) = -y_t \, x_t \, I \{ y_t \, w^T \, x_t \leq 0 \}
$$
Now, with $\eta = 1$, the OGD update for $\ell_t(w) = \left[ -y_t \, w^T \, x_t \right]_+ $ is exactly the Perceptron update.
The problem is that this loss is not comparable with the zero-one loss (it is $0$ when $y_t \, w^T \, x_t = 0$, which counts as a mistake), so a bound on it does not give a bound on the number of mistakes.
\\\\How do I fix this?
What if I just shift the loss to the right? \\
Now this loss is an upper bound on the zero-one loss. It is called the \bred{Hinge loss} (the name comes from the hinge that attaches a door to its frame)\\
\caption{Hinge loss}
$$
\textbf{Hinge loss: } \ h_t(w) = \left[ 1 - y_t \, w^T \, x_t \right]_+ \geq I \{ y_t \, w^T \, x_t \leq 0 \}
$$
$$
\nabla h_t(w) = - y_t \, x_t \, I\{y_t \, w^T \, x_t \leq 1 \}
$$
The difference is that the hinge loss becomes $0$ later than the original one: only when $y_t \, w^T \, x_t \geq 1$ instead of $y_t \, w^T \, x_t \geq 0$.
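To see why the hinge loss upper bounds the zero-one loss, one can check the two cases separately. This is a short worked check, writing $z = y_t \, w^T \, x_t$ for the margin (a shorthand used only here):
$$
z \leq 0 \ \Rightarrow \ \left[ 1 - z \right]_+ = 1 - z \geq 1 = I\{ z \leq 0 \} \qquad \qquad z > 0 \ \Rightarrow \ \left[ 1 - z \right]_+ \geq 0 = I\{ z \leq 0 \}
$$
For example, $z = 0.5$ gives hinge loss $0.5$ and zero-one loss $0$, while $z = -2$ gives hinge loss $3$ and zero-one loss $1$.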
To recover the Perceptron we update only on mistakes:
$$
w \leftarrow w - \eta \, \nabla h_t(w) \, I \{ y_t \, w^T x_t \leq 0\}
$$
$$
y_t \, w^T \, x_t \leq 0 \ \ \Rightarrow \ \ y_t \, w^T x_t \, \leq 1
$$
so on a mistake round the gradient of the hinge loss is $-y_t \, x_t$.
We now apply the OGD analysis to $h_t$, considering only the steps $t$ where $y_t \, w_t^T \, x_t \leq 0$, and we do not perform projection.
$$
\sum_{t=1}^T \left( h_t \left(w_t \right) - h_t(u) \right) \ I\{y_t \, w_t^T \, x_t \leq 0 \} \leq
$$
$$ \leq \ \frac{1}{2 \, \eta} \, \| u \|^2 +
\frac{1}{2} \, \sum_{t=1}^T \| w_{t+1} - u \|^2 \, \left( \frac{1}{\eta} - \frac{1}{\eta} \right) \, I \{y_t \, w_t^T \, x_t \leq 0 \}
+ \frac{\eta \, G^2}{2} \, \sum_{t=1}^T I \{y_t \, w_t^T x_t \leq 0 \}
- \frac{1}{2 \, \eta} \| w_{T+1} - u \|^2
$$
where the second term cancels out because the learning rate is constant, and the last term is nonpositive, so it can be dropped. On the steps where we update,
$$
\| \nabla h_t(w) \| = |y_t| \, \| x_t \| \leq X \qquad G = X = \max_t \| x_t \|
$$
where $y_t \in \{ -1,1\}$.
$$
y_t \, w_t^T \, x_t \leq 0 \ \Rightarrow \ h_t(w_t) \geq 1
$$
$$
\sum_{t=1}^T I \{y_t \, w_t^T x_t \leq 0 \} \ \leq \ \sum_{t=1}^T h_t(w_t) I \{y_t \, w_t^T \, x_t \leq 0 \} \ \leq
$$
$$
\leq \ \sum_{t=1}^T h_t(u ) \, \red{I \{ y_t \, w_t^T \, x_t \leq 0 \}} + \frac{1}{2 \, \eta} \| u \|^2 + \frac{\eta}{2} X^2 \, \sum_{t=1}^T I\{y_t w_t^T \, x_t \leq 0 \}
$$
where the indicator in red can be dropped (since $h_t(u) \geq 0$ and the indicator is at most $1$) to have a ``nicer'' upper bound.
$$
M_T = \sum_{t=1}^T I\{y_t w_t^T \, x_t \leq 0 \}
$$
$$
M_T \leq \sum_{t=1}^T h_t(u) + \frac{1}{2 \, \eta} \|u \|^2 + \frac{\eta}{2} X^2 \ M_T
$$
This is not a regret anymore! Here \textbf{$M_T$ is the number of mistakes} and I compare it with the hinge loss ($h_t(u)$).
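The best choice would be $\eta = \frac{\|u\|}{X \, \sqrt{M_T}}$. I cannot use it to run the algorithm, because $M_T$ is not known in advance,
but we can substitute it in the bound on $M_T$:
$$
M_T \ \leq \ \sum_{t=1}^T h_t(u) + \|u\| X \, \sqrt{M_T}
$$
Solving this inequality for $M_T$ gives
$$
M_T \ \leq \ \sum_{t=1}^T h_t(u) + (\| u\| X)^2 + \| u\| \, X \ \sqrt{\sum_t h_t(u)}
$$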
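The step between the last two inequalities can be sketched as follows, writing $A = \sum_t h_t(u)$ and $B = \|u\| \, X$ (shorthand introduced only for this check). The first inequality is quadratic in $\sqrt{M_T}$:
$$
M_T - B \, \sqrt{M_T} - A \leq 0 \ \ \Rightarrow \ \ \sqrt{M_T} \leq \frac{B + \sqrt{B^2 + 4 \, A}}{2}
$$
$$
M_T \ \leq \ \frac{2 \, B^2 + 4 \, A + 2 \, B \, \sqrt{B^2 + 4 \, A}}{4} \ = \ A + \frac{B^2}{2} + \frac{B}{2} \, \sqrt{B^2 + 4 \, A} \ \leq \ A + B^2 + B \, \sqrt{A}
$$
where the last step uses $\sqrt{B^2 + 4 \, A} \leq B + 2 \, \sqrt{A}$.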
Now consider the Perceptron with a generic learning rate, started from the zero vector:
$$
w \leftarrow w + \eta \, y_t \, x_t \, I\{y_t w_t^T \, x_t \leq 0 \} \qquad w = (0,\dots,0)
$$
What happens if I choose a different $\eta > 0$?
$$
w_t = \eta \, \sum_{s=1}^{t-1} y_s \, x_s \, I\{y_s w_s^T \, x_s \leq 0 \} \qquad \forall \ \eta > 0
$$
Since $\eta > 0$ only rescales $w_t$, the sign of $y_t \, w_t^T \, x_t$ does not depend on $\eta$, so
$$
M_T = \sum_{t=1}^T I\{y_t w_t^T \, x_t \leq 0 \} \quad \textbf{ is invariant with respect to $\eta >0$}
$$
It does not matter which $\eta$ we choose: the number of mistakes is going to be the same. This means that the state of the algorithm (which depends on the mistakes) is going to be the same, up to the scaling by $\eta$.
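A short induction makes this invariance precise. This is a sketch, writing $w_t^{(\eta)}$ for the weight vector produced with learning rate $\eta$ (a notation introduced only here):
$$
w_1^{(\eta)} = 0 = \eta \, w_1^{(1)} \qquad \qquad w_t^{(\eta)} = \eta \, w_t^{(1)} \ \Rightarrow \ y_t \, \big( w_t^{(\eta)} \big)^T x_t = \eta \, y_t \, \big( w_t^{(1)} \big)^T x_t
$$
so both runs make a mistake on exactly the same rounds, and
$$
w_{t+1}^{(\eta)} = w_t^{(\eta)} + \eta \, y_t \, x_t \, I\{ y_t \, \big( w_t^{(\eta)} \big)^T x_t \leq 0 \} = \eta \, \Big( w_t^{(1)} + y_t \, x_t \, I\{ y_t \, \big( w_t^{(1)} \big)^T x_t \leq 0 \} \Big) = \eta \, w_{t+1}^{(1)}
$$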
I can run the Perceptron with $\eta = 1$ and pretend (in the analysis) that it was run with $\eta = \frac{\| u \|}{X \, \sqrt{M_T}}$.
We go back to the bound on $M_T$. We are actually free to choose any $u$.
If $(x_1, y_1), \dots, (x_T,y_T) $ is linearly separable then: \\ $\exists u$ s.t. $y_t \, u^T x_t \geq 1 \ \Rightarrow \ h_t(u) = 0 \ \ \forall t $
$$M_T \ \leq \ \left( \, \| u\| \ X \, \right)^2 \qquad \textrm{the } \bred{Perceptron convergence theorem.} $$
In general,
$$
M_T \ \leq \ \min_{u \in \barra{R}^d} \left( \sum_{t=1}^T h_t(u) + \left( \|u\| \, X \right)^2 + \| u \| \, X \ \sqrt{\sum_t h_t(u)} \right)
$$
These are called \bred{Oracle bounds}: the Perceptron behaves as if it knew which is the best $u$.
\subsection{Strongly convex loss functions}
We use this to analyse a whole class of algorithms that regularise ERM, which includes the Support Vector Machine. We want to explain what happens when using Support Vector Machines. For neural networks we cannot do this, since the loss of a NN is not convex and there is no way to ``convexify'' it: by convexifying we lose the power of NNs.
We said that $\ell_t$ has to be convex. But there are several types of convexity.
\caption{Example of more type of convex function}
These two functions, for example, are both convex. The one on the left always has positive curvature, while the one on the right has zero curvature, since it is made of two straight lines, and it is not differentiable at the kink.
In other words, the Hessian on the left is positive definite, while on the right the Hessian is $0$ wherever it is defined.
We are looking for \bred{strongly convex losses}.
A differentiable $\ell$ is $\sigma$-\textit{strongly convex} if:
$$
\forall u,w \qquad \ell(w) - \ell(u) \leq \nabla \ell(w)^T \, (w-u) - \frac{\sigma}{2}
\, \| w-u\|^2 \qquad \sigma > 0$$
For a twice differentiable $\ell$, $\sigma$-SC is equivalent to the Hessian having all eigenvalues at least $\sigma$ (in particular, strictly positive).
Example: check if the following function is strongly convex.
$$
\ell(w) = \frac{1}{2} \| w\|^2 \qquad \frac{1}{2} \| w\|^2 - \frac{1}{2} \| u\|^2 \ \leq^? \ w^T \, (w-u) - \sigma \, \frac{\| w-u\|^2}{2}
$$
where $\nabla \ell(w) = w$ and we try $\sigma = 1$:
$$
\red{\frac{1}{2} \| w \|^2 }- \frac{1}{2} \|u\|^2 \ \leq^? \ \|w\|^2 - w^T \, u - \frac{\| w-u\|^2}{2} $$
where $\frac{1}{2} \| w \|^2 $ cancels out against part of $\|w\|^2$ on the right:
$$
0 \ \leq^? \ \frac{1}{2} \| w\|^2 + \frac{1}{2} \|u\|^2 - \frac{\| w-u\|^2}{2} - w^T \, u
$$
Expanding $\| w-u \|^2 = \|w\|^2 - 2 \, w^T \, u + \|u\|^2$, the right-hand side is exactly $0$, so the inequality holds with equality:
$$
0 \ = \ \frac{1}{2} \| w\|^2 + \frac{1}{2} \|u\|^2 - \frac{\| w-u\|^2}{2} - w^T \, u
$$
So this function is $1$-\textit{strongly convex}
Next lecture we are going to show that we can run OGD with strongly convex functions and get a better bound: our regret is going to vanish much faster than in the case of simple convexity.
You can prove that if the Hessian is $0$, the average regret $R_T/T$ vanishes at a rate of $ \frac{U \, G}{\sqrt{T}}$.
We will show that with strong convexity OGD converges much faster, at a rate of $ \frac{ \ln T}{T}$.\\
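To get a feeling for the gap between the two rates, one can plug in a value of $T$ and ignore the constants. This is an illustrative computation only, since the actual bounds also depend on constants such as $U$ and $G$:
$$
T = 10^4: \qquad \frac{1}{\sqrt{T}} = 10^{-2} \qquad \qquad \frac{\ln T}{T} \approx \frac{9.21}{10^4} \approx 9.2 \cdot 10^{-4}
$$
so at this horizon the second rate is already roughly ten times smaller.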
This is what happens in optimisation, where we prefer strongly convex functions.