This commit is contained in:
Andreaierardi 2020-05-15 17:32:34 +02:00
parent 6ca2c25e70
commit 03cbc7d44c
6 changed files with 211 additions and 30 deletions


@ -9,3 +9,4 @@
\@writefile{lof}{\contentsline {figure}{\numberline {1.1}{\ignorespaces }}{2}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.2}{\ignorespaces }}{2}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.3}{\ignorespaces }}{3}\protected@file@percent }
\@writefile{lof}{\contentsline {figure}{\numberline {1.4}{\ignorespaces }}{4}\protected@file@percent }


@ -1,4 +1,4 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 2.9.7300 64-bit) (preloaded format=pdflatex 2020.4.13) 15 MAY 2020 13:51
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 2.9.7300 64-bit) (preloaded format=pdflatex 2020.4.13) 15 MAY 2020 17:32
entering extended mode
**./lecture15.tex
(lecture15.tex
@ -363,39 +363,110 @@ File: ../img/lez15-img3.JPG Graphic file (type jpg)
Package pdftex.def Info: ../img/lez15-img3.JPG used on input line 89.
(pdftex.def) Requested size: 117.00119pt x 83.98506pt.
Underfull \hbox (badness 10000) in paragraph at lines 82--93
Underfull \hbox (badness 10000) in paragraph at lines 96--101
[]
[3 <../img/lez15-img3.JPG>] (lecture15.aux) )
Overfull \hbox (43.52298pt too wide) detected at line 109
\OT1/cmr/m/n/12 = \OMS/cmsy/m/n/12 [] \OT1/cmr/m/n/12 (\OML/cmm/m/it/12 w[] \O
MS/cmsy/m/n/12 \OML/cmm/m/it/12 w[]\OT1/cmr/m/n/12 )[] (\OML/cmm/m/it/12 w[]
\OMS/cmsy/m/n/12 \OML/cmm/m/it/12 u\OT1/cmr/m/n/12 ) = [] [] \OMS/cmsy/m/n
/12 
[]
<../img/lez15-img4.JPG, id=27, 161.10187pt x 107.65219pt>
File: ../img/lez15-img4.JPG Graphic file (type jpg)
<use ../img/lez15-img4.JPG>
Package pdftex.def Info: ../img/lez15-img4.JPG used on input line 117.
(pdftex.def) Requested size: 117.00119pt x 78.18626pt.
Underfull \hbox (badness 10000) in paragraph at lines 113--125
[]
[3 <../img/lez15-img3.JPG>]
Overfull \hbox (88.26935pt too wide) detected at line 137
\OT1/cmr/m/n/12 = [][] \OMS/cmsy/m/n/12 [][] \OT1/cmr/m/n/12 + []\OMS/cmsy/m/
n/12 k\OML/cmm/m/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/it/12 w[]\OMS/cmsy/m/n
/12 k[]
[]
Underfull \hbox (badness 10000) in paragraph at lines 137--139
[]
Underfull \hbox (badness 10000) in paragraph at lines 142--145
[]
Overfull \hbox (52.34819pt too wide) detected at line 147
\OMS/cmsy/m/n/12  []k\OML/cmm/m/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/it/12
u\OMS/cmsy/m/n/12 k[] []k\OML/cmm/m/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/i
t/12 u\OMS/cmsy/m/n/12 k[] \OT1/cmr/m/n/12 + [] [] \OMS/cmsy/m/n/12 k\OML/cmm/m
/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/it/12 u\OMS/cmsy/m/n/12 k[] [] \OT1/cm
r/m/n/12 + [] [] []
[]
Overfull \hbox (10.22507pt too wide) in paragraph at lines 147--153
\T1/cmr/m/n/12 where $\OML/cmm/m/it/12 w[] \OT1/cmr/m/n/12 = 0$ \T1/cmr/m/n/12
and $\OMS/cmsy/m/n/12 k\OML/cmm/m/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/it/
12 u\OMS/cmsy/m/n/12 k[]  \OT1/cmr/m/n/12 4 \OML/cmm/m/it/12 U[]$ \T1/cmr/m/n
/12 and $\OMS/cmsy/m/n/12 k\OML/cmm/m/it/12 w[] \OMS/cmsy/m/n/12 \OML/cmm/m/
it/12 w[]\OMS/cmsy/m/n/12 k[] \OT1/cmr/m/n/12 = \OML/cmm/m/it/12 []\OMS/cmsy/m
/n/12 kr\OML/cmm/m/it/12 `[]\OT1/cmr/m/n/12 (\OML/cmm/m/it/12 w[]\OT1/cmr/m/n/1
2 )\OMS/cmsy/m/n/12 k[]$
[]
Underfull \hbox (badness 10000) in paragraph at lines 164--171
[]
[4 <../img/lez15-img4.JPG>]
Underfull \hbox (badness 10000) in paragraph at lines 196--204
[]
Underfull \hbox (badness 10000) in paragraph at lines 196--204
[]
[5] (lecture15.aux) )
Here is how much of TeX's memory you used:
5121 strings out of 480934
69325 string characters out of 2909670
332085 words of memory out of 3000000
20869 multiletter control sequences out of 15000+200000
547529 words of font info for 56 fonts, out of 3000000 for 9000
5132 strings out of 480934
69509 string characters out of 2909670
335085 words of memory out of 3000000
20877 multiletter control sequences out of 15000+200000
548174 words of font info for 59 fonts, out of 3000000 for 9000
1141 hyphenation exceptions out of 8191
42i,7n,50p,333b,236s stack positions out of 5000i,500n,10000p,200000b,50000s
<C:\Users\AndreDany\AppData\Local
\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1440.pk> <C:\Users\AndreDany
\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecti1200.pk> <C:\U
sers\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx
1200.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/
ec/dpi600\ecrm1200.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/lj
four/jknappen/ec/dpi600\ecbx1728.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2
.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx2488.pk><C:/Program Files/MiKTeX 2.9/
fonts/type1/public/amsfonts/cm/cmex10.pfb><C:/Program Files/MiKTeX 2.9/fonts/ty
pe1/public/amsfonts/cm/cmmi12.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/publ
ic/amsfonts/cm/cmmi6.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfon
ts/cm/cmmi8.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr
12.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr8.pfb><C:
/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmsy10.pfb><C:/Program
Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmsy8.pfb><C:/Program Files/Mi
KTeX 2.9/fonts/type1/public/amsfonts/symbols/msam10.pfb><C:/Program Files/MiKTe
X 2.9/fonts/type1/public/amsfonts/symbols/msbm10.pfb>
Output written on lecture15.pdf (3 pages, 183434 bytes).
<C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljf
our/jknappen/ec/dpi600\ecbx1440.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.
9\fonts/pk/ljfour/jknappen/ec/dpi600\ecti1200.pk> <C:\Users\AndreDany\AppData\L
ocal\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecbx1200.pk> <C:\Users\Andre
Dany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\ecrm1200.pk> <
C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknappen/ec/dpi600\
ecbx1728.pk> <C:\Users\AndreDany\AppData\Local\MiKTeX\2.9\fonts/pk/ljfour/jknap
pen/ec/dpi600\ecbx2488.pk><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfo
nts/cm/cmex10.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/c
mmi12.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmmi6.pfb
><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmmi8.pfb><C:/Prog
ram Files/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr12.pfb><C:/Program Files
/MiKTeX 2.9/fonts/type1/public/amsfonts/cm/cmr6.pfb><C:/Program Files/MiKTeX 2.
9/fonts/type1/public/amsfonts/cm/cmr8.pfb><C:/Program Files/MiKTeX 2.9/fonts/ty
pe1/public/amsfonts/cm/cmsy10.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/publ
ic/amsfonts/cm/cmsy8.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfon
ts/symbols/msam10.pfb><C:/Program Files/MiKTeX 2.9/fonts/type1/public/amsfonts/
symbols/msbm10.pfb>
Output written on lecture15.pdf (5 pages, 220467 bytes).
PDF statistics:
215 PDF objects out of 1000 (max. 8388607)
234 PDF objects out of 1000 (max. 8388607)
0 named destinations out of 1000 (max. 500000)
16 words of extra memory for PDF output out of 10000 (max. 10000000)
21 words of extra memory for PDF output out of 10000 (max. 10000000)


@ -18,7 +18,7 @@ The important thing is that $\ell_1 , \ell_2, ...$ is a sequence of \textbf{conv
\\\\
In general we define the regret in this way:
$$
R_T(u) \ = \ \sum_{t=1}^{T} \ell_t(w_t) - \sum_{t=1}^{T} \ell_t(u)
$$\\
Gradient descent is one of the simplest algorithms for minimising a convex function. We recall the iteration performed by the algorithm:
$$
@ -90,6 +90,115 @@ $
\caption{}
%\label{fig:}
\end{figure}\\
Now we can apply this result to our problem. In particular, rearranging the terms gives
$$
f(w) - f(u) \leq \nabla f(w)^T \, (w-u)
$$
This holds for $f$ convex and differentiable.
\\
We know that $(u-w)^T\nabla^2 f(\xi) \, (u-w) \geq 0 $ because $f$ is convex.
\\
$$
\ell_t(w_t) - \ell_t(u) \leq \nabla \ell_t (w_t)^T \, (w_t-u) \qquad \textbf{ Linear Regret}
$$\\
How do we proceed? \\
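The first step of the algorithm is: $w'_{t+1} = w_t - \eta_t \nabla \ell_t(w_t)$ with step size $\eta_t = \frac{\eta}{\sqrt[]{t}}$. Substituting $\nabla \ell_t(w_t) = -\frac{1}{\eta_t} \, (w'_{t+1} - w_t)$ into the linear regret gives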
$$
\nabla \ell_t (w_t)^T \, (w_t-u) \ = \ - \frac{1}{\eta_t} \, (w'_{t+1} - w_t )^T \, (w_t-u) \ = \
\frac{1}{\eta_t}\left( \frac{1}{2} \| w_t -u \|^2 - \frac{1}{2} \| w'_{t+1} - u \|^2 + \frac{1}{2} \|w'_{t+1} - w_t \|^2 \right) \ \leq
$$
$$
\leq \
\frac{1}{\eta_t}\left( \frac{1}{2} \| w_t -u \|^2 - \frac{1}{2} \| w_{t+1} - u \|^2 + \frac{1}{2} \|w'_{t+1} - w_t \|^2 \right)
$$
In the second term $w'_{t+1}$ is replaced by $w_{t+1}$. Since that term carries a minus sign, the inequality follows from $\| w_{t+1} -u \| \leq \| w'_{t+1} - u\|$.
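\\
As a short worked check of the identity used in the first equality (a sketch that relies only on the definitions above), expand the squared distance from $u$ after the gradient step:
$$
\| w'_{t+1} - u \|^2 \ = \ \| (w'_{t+1} - w_t) + (w_t - u) \|^2 \ = \ \| w'_{t+1} - w_t \|^2 + 2 \, (w'_{t+1} - w_t)^T \, (w_t - u) + \| w_t - u \|^2
$$
Solving for the inner product gives
$$
- (w'_{t+1} - w_t)^T \, (w_t-u) \ = \ \frac{1}{2} \| w_t -u \|^2 - \frac{1}{2} \| w'_{t+1} - u \|^2 + \frac{1}{2} \|w'_{t+1} - w_t \|^2
$$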
\begin{figure}[h]
\centering
\includegraphics[width=0.3\linewidth]{../img/lez15-img4.JPG}
\caption{}
%\label{fig:}
\end{figure}\\
This tells us that $w_{t+1}$ is closer to $u$ than $w'_{t+1}$.
\\This holds since the ball is convex.
\\\\
Now we go back, adding and subtracting $\frac{1}{2 \, \eta_{t+1}} \| w_{t+1} -u \|^2$
$$
= \red{\frac{1}{2 \, \eta_t} \| w_t - u\|^2 -
\frac{1}{2 \, \eta_{t+1}} \| w_{t+1} - u\|^2}
-
\blue{
\frac{1}{2 \, \eta_{t}} \|w_{t+1}-u \|^2
+
\frac{1}{2 \, \eta_{t+1}} \|w_{t+1}-u \|^2 }
+
\frac{1}{2 \, \eta_{t}} \|w'_{t+1}- w_t \|^2
$$
We group the first two terms and the next two terms, then sum over $t$. \\
$$
R_T(u) \ = \ \sum_{t=1}^{T} \left( \ell_t(w_t) - \ell_t(u) \right) \leq
$$
This is a \textbf{telescopic sum}: $a_1-a_2+a_2-a_3+a_3-a_4+ \dots +a_T -a_{T+1}$. Everything in the middle cancels out and only the first and last terms remain.
\\
$$
\leq \frac{1}{2 \, \eta_1} \|w_1-u\|^2 - \frac{1}{2 \, \eta_{T+1}} \| w_{T+1} -u \| ^2 + \frac{1}{2} \sum_{t=1}^T \| w_{t+1} - u \|^2 \left( \frac{1}{\eta_{t+1}} - \frac{1}{\eta_t}\right) + \frac{1}{2} \sum_{t=1}^T \frac{\| w'_{t+1} -w_t \|^2}{\eta_t}
$$
where $w_1 = 0 $ \quad and \quad $\| w_{t+1} - u \|^2 \leq 4 \, U^2$ \quad and \quad
$
\| w'_{t+1} -w_t \|^2 = \eta_t^2 \| \nabla \ell_t (w_t) \|^2
$\\
We know that $\eta_t = \frac{\eta}{\sqrt[]{t}}$ \qquad so $\eta_1 = \frac{\eta}{\sqrt[]{1}} = \eta$
$$
R_T(u) \ \leq \ \frac{1}{2 \, \eta} U^2
\red{
- \frac{1}{2 \, \eta_{T+1}} \|w_{T+1} - u \|^2
}
+2 \, U^2 \, \sum_{t=1}^{T-1} \left(\frac{1}{\eta_{t+1}} - \frac{1}{\eta_t}\right)
$$
$$
\red{
+ \frac{\| w_{T+1} -u \|^2}{2 \, \eta_{T+1}} } - \frac{\| w_{T+1} -u \|^2}{2 \, \eta_T}
+ \frac{1}{2} \sum_{t=1}^T \eta_t \| \nabla \ell_t(w_t) \|^2
$$
where the \bred{red values} cancel out and the remaining negative term $- \frac{\| w_{T+1} -u \|^2}{2 \, \eta_T}$ can be dropped.
\\
We assume that the squared norm of the loss gradient is bounded by some number $G^2$: $ \| \nabla \ell_t(w_t) \|^2 \ \leq \ G^2 $
\\
The sum over the step sizes is again a telescopic sum, so all middle terms cancel out.
\\
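Explicitly, as a short worked step with $\eta_t = \frac{\eta}{\sqrt[]{t}}$: each difference $\frac{1}{\eta_{t+1}} - \frac{1}{\eta_t} = \frac{\sqrt[]{t+1} - \sqrt[]{t}}{\eta}$ is non-negative, and
$$
\sum_{t=1}^{T-1} \left( \frac{1}{\eta_{t+1}} - \frac{1}{\eta_t} \right) \ = \ \frac{1}{\eta_T} - \frac{1}{\eta_1} \ = \ \frac{\sqrt[]{T} - 1}{\eta}
$$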
$$
\max_t \| \nabla \ell_t(w_t) \| \leq G
$$
$$
R_T(u) \leq \red{\frac{1}{2 \, \eta} U^2} + 2\, U^2 \left( \frac{1}{\eta_{T}} - \red{ \frac{1}{\eta_1} }\right) + \frac{G^2}{2} \, \eta \sum_{t=1}^T \frac{1}{\sqrt[]{t}} \qquad \eta_t = \frac{\eta}{\sqrt[]{t}}
$$
where the \bred{red values} sum to $\frac{U^2}{2 \, \eta} - \frac{2 \, U^2}{\eta} \leq 0$ and can be dropped.
\\
Now how much is this sum $ \sum_{t=1}^T \frac{1}{\sqrt[]{t}}$?
\\
It is bounded by the integral
$
\sum_{t=1}^T \frac{1}{\sqrt[]{t}} \leq \int_{0}^{T} \frac{dx}{\sqrt[]{x}} = 2 \, \sqrt[]{T}
$
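\\
As a sketch of why the integral bound holds: $\frac{1}{\sqrt[]{x}}$ is decreasing, so each term of the sum is at most the integral over the preceding unit interval,
$$
\frac{1}{\sqrt[]{t}} \ \leq \ \int_{t-1}^{t} \frac{dx}{\sqrt[]{x}} \qquad \Longrightarrow \qquad \sum_{t=1}^T \frac{1}{\sqrt[]{t}} \ \leq \ \int_{0}^{T} \frac{dx}{\sqrt[]{x}} \ = \ 2 \, \sqrt[]{T}
$$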
$$
R_T(u) \ \leq \ \frac{2 \, U^2 \sqrt[]{T}}{\eta} + \eta \, G^2 \sqrt[]{T} \ = \ \left( \frac{2\, U^2}{\eta} + \eta \, G^2 \right) \, \sqrt[]{T} $$
The bound is minimised by choosing $ \eta = \frac{U}{G} \sqrt[]{2}
$
\\
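This choice of $\eta$ can be checked with a short worked computation. Here $g(\eta) = \frac{2 \, U^2}{\eta} + \eta \, G^2$ is only a name introduced for the factor in front of $\sqrt[]{T}$. Setting its derivative to zero gives
$$
g'(\eta) \ = \ - \frac{2 \, U^2}{\eta^2} + G^2 \ = \ 0 \quad \Longleftrightarrow \quad \eta = \frac{U}{G} \sqrt[]{2}
$$
$$
g\left( \frac{U}{G} \sqrt[]{2} \right) \ = \ \sqrt[]{2} \, U \, G + \sqrt[]{2} \, U \, G \ = \ U \, G \, \sqrt[]{8}
$$
so that $R_T(u) \leq U \, G \, \sqrt[]{8 \, T}$.
\\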
So finally:
$$
\frac{1}{T} \sum_{t=1}^T \ell_t(w_t) \leq \min_{\|u\| \leq U} \frac{1}{T} \sum_{t=1}^T \ell_t(u) + U \, G \ \sqrt[]{\frac{8}{T}}
$$
$$
\frac{1}{T} R_T(u) = \frac{1}{T} \sum_{t=1}^T \left( \ell_t (w_t) - \ell_t(u) \right) \qquad \forall u : \|u\| \leq U : \ \frac{1}{T} R_T(u) = O \left(\frac{1}{\sqrt[]{T}}\right)
$$
So the average regret goes to $0$ as $T$ grows.
\\\\
For ERM in $H$ where $| H| < \infty$, the variance error vanishes at rate $\frac{1}{\sqrt[]{m}}$
\\\\
The bound $U \, G \, \sqrt[]{\frac{8}{T}}$ on the average regret holds for any sequence $\ell_1, \ell_2, ...$ of convex and differentiable losses. If $\ell_t(w) = \ell(w^T \, x_t, y_t)$ then the bound holds for any sequence of data points $(x_1,y_1), (x_2, y_2), ...$
\\
No statistical assumption on the data is needed, only a mathematical one, so the result is stronger.
\end{document}