Dynamic weighting in a multi-task framework for deep neural networks
(Pondération dynamique dans un cadre multi-tâche pour réseaux de neurones profonds)

Transcript of the RFIA 2016 presentation (38 slides). Running title: "Deep multi-task learning with evolving weights".

Page 1: Title

Dynamic weighting in a multi-task framework for deep neural networks
(Pondération dynamique dans un cadre multi-tâche pour réseaux de neurones profonds)

RFIA 2016

Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam
soufiane.belharbi@insa-rouen.fr

LITIS lab., Apprentissage team - INSA de Rouen, France

29 June, 2016

Page 2: Context

Deep learning today: deep learning is the state of the art.

What is new today?
- Large data
- Computational power (GPUs, clouds)

⇒ Advances in optimization and tooling:
- Dropout
- Momentum, AdaDelta, AdaGrad, RMSProp, Adam, Adamax
- Maxout, local response normalization, local contrast normalization, batch normalization
- ReLU
- Torch, Caffe, Pylearn2, Theano, TensorFlow
- CNN, RBM, RNN

Page 3: Context

Deep neural networks (DNN)

[Figure: feed-forward network with inputs x1–x6 and outputs y1, y2]

- Feed-forward neural network
- Error back-propagation
- Training deep neural networks is difficult
  ⇒ Vanishing gradient
  ⇒ Pre-training technique [Y. Bengio et al. 06, G.E. Hinton et al. 06]
  ⇒ More parameters
  ⇒ Need more data
  ⇒ Use unlabeled data


Page 5: Context

Semi-supervised learning

General case:

Data = { labeled data (x, y): expensive (money, time), few } ∪ { unlabeled data (x, −−): cheap, abundant }

E.g.: images collected from the internet, medical images.

⇒ Semi-supervised learning: exploit the unlabeled data to improve generalization.


Page 7: Context

Pre-training and semi-supervised learning

The pre-training technique can exploit unlabeled data.

It is a sequential transfer learning performed in 2 steps:
1. Unsupervised task (x: labeled and unlabeled data)
2. Supervised task ((x, y): labeled data)

Page 8: Pre-training technique and semi-supervised learning

Layer-wise pre-training: auto-encoders

[Figure: the DNN to train, with inputs x1–x6 and outputs y1, y2]

Pages 9–15: Pre-training technique and semi-supervised learning

Layer-wise pre-training: auto-encoders

[Animation over several slides: each frame trains one auto-encoder, first on the inputs x1–x6 (giving codes h1,1–h1,5), then on the codes of the previous layer (h2,1–h2,3, then h3,1–h3,3), before reassembling the full network.]

1) Step 1: Unsupervised layer-wise pre-training

Train layer by layer sequentially, using only x (labeled or unlabeled).

At each layer:
⇒ Which hyper-parameters to use? When to stop training?
⇒ How to make sure that the pre-training improves the supervised task?
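Step 1 above can be sketched as greedy training of tied-weight auto-encoders. This is a minimal illustrative stand-in, not the exact procedure from the talk: `train_autoencoder` and `pretrain` are hypothetical names, and the tiny full-batch SGD loop replaces whatever optimizer was actually used.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=100, seed=0):
    """Fit one tied-weight auto-encoder: H = sigmoid(X W + b),
    X_rec = sigmoid(H W^T + c), minimizing squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)
    c = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)
        Xr = sigmoid(H @ W.T + c)
        dXr = (Xr - X) * Xr * (1.0 - Xr)    # gradient through the decoder sigmoid
        dH = (dXr @ W) * H * (1.0 - H)      # back-propagated to the code
        gW = X.T @ dH + dXr.T @ H           # tied weights: encoder + decoder paths
        W -= lr * gW / len(X)
        b -= lr * dH.sum(axis=0) / len(X)
        c -= lr * dXr.sum(axis=0) / len(X)
    return W, b

def pretrain(X, layer_sizes):
    """Greedy layer-wise pre-training: each layer is an auto-encoder trained
    on the previous layer's codes; the labels y are never used."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(H, n_hidden)
        params.append((W, b))
        H = sigmoid(H @ W + b)  # codes become the next layer's input
    return params
```

The resulting `params` would then initialize the encoder layers of the DNN before the supervised step.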

Page 16: Pre-training technique and semi-supervised learning

Layer-wise pre-training: auto-encoders

[Figure: the full pre-trained network, inputs x1–x6, outputs y1, y2]

2) Step 2: Supervised training

Train the whole network on (x, y), using back-propagation.

Page 17: Pre-training technique and semi-supervised learning

Pre-training technique: pros and cons

Pros:
- Improves generalization
- Can exploit unlabeled data
- Provides a better initialization than random
- Makes it possible to train deep networks
  ⇒ Circumvents the vanishing gradient problem

Cons:
- Adds more hyper-parameters
- No good stopping criterion during the pre-training phase: a good criterion for the unsupervised task may not be good for the supervised task.


Page 19: Pre-training technique and semi-supervised learning

Proposed solution

Why is pre-training difficult in practice?
⇒ It is sequential transfer learning.

Possible solution:
⇒ Parallel transfer learning.

Why in parallel?
- Interaction between the tasks
- Fewer hyper-parameters to tune
- A single stopping criterion


Page 22: Proposed approach

Parallel transfer learning: task combination

Train cost = supervised task + unsupervised (reconstruction) task

Notation: l labeled samples, u unlabeled samples, wsh: shared parameters.

Reconstruction (auto-encoder) task:

Jr(D; w′ = {wsh, wr}) = Σ_{i=1}^{l+u} Cr(R(xi; w′), xi)

Supervised task:

Js(D; w = {wsh, ws}) = Σ_{i=1}^{l} Cs(M(xi; w), yi)

Weighted task combination:

J(D; {wsh, ws, wr}) = λs · Js(D; {wsh, ws}) + λr · Jr(D; {wsh, wr})

λs, λr ∈ [0, 1]: importance weights, λs + λr = 1.
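A minimal numeric sketch of this weighted two-task cost. The concrete costs are illustrative stand-ins (squared error for Cr, cross-entropy for Cs, both hypothetical choices here); the point is the combination J = λs·Js + λr·Jr with λs + λr = 1.

```python
import numpy as np

def reconstruction_cost(x_recon, x):
    """Jr: mean squared reconstruction error, computed over ALL l+u samples."""
    return float(np.mean(np.sum((x_recon - x) ** 2, axis=1)))

def supervised_cost(y_pred, y):
    """Js: cross-entropy, computed over the l labeled samples only."""
    return float(-np.mean(np.log(y_pred[np.arange(len(y)), y] + 1e-12)))

def combined_cost(js, jr, lam_s):
    """J = λs·Js + λr·Jr, with λr = 1 − λs so that λs + λr = 1."""
    return lam_s * js + (1.0 - lam_s) * jr
```

With λs = 1 the unsupervised task vanishes and training reduces to the plain supervised cost.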


Page 26: Proposed approach

Task combination with evolving weights

Weighted task combination:

J(D; {wsh, ws, wr}) = λs · Js(D; {wsh, ws}) + λr · Jr(D; {wsh, wr})

λs, λr ∈ [0, 1]: importance weights, λs + λr = 1.

Problem: how to fix λs and λr?

Intuition: at the end of training, only Js should matter.

Task combination with evolving weights (our contribution):

J(D; {wsh, ws, wr}) = λs(t) · Js(D; {wsh, ws}) + λr(t) · Jr(D; {wsh, wr})

t: learning epoch; λs(t), λr(t) ∈ [0, 1]: importance weights, λs(t) + λr(t) = 1.


Page 29: Proposed approach

Task combination with evolving weights

J(D; {wsh, ws, wr}) = λs(t) · Js(D; {wsh, ws}) + λr(t) · Jr(D; {wsh, wr})

[Figure: exponential schedule; importance weights (0 to 1) vs. train epochs t; λr(t) decays toward 0 while λs(t) grows toward 1.]

λr(t) = exp(−t/σ), σ: slope
λs(t) = 1 − λr(t)
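The exponential schedule is direct to implement; `exp_schedule` is an illustrative name, with the default σ = 40 taken from the experimental protocol later in the talk.

```python
import math

def exp_schedule(t, sigma=40.0):
    """Exponential evolving weights: λr(t) = exp(−t/σ) decays from 1 toward 0,
    and λs(t) = 1 − λr(t), so only the supervised cost Js matters at the end."""
    lam_r = math.exp(-t / sigma)
    return 1.0 - lam_r, lam_r  # (λs(t), λr(t))
```

At t = 0 the cost is purely unsupervised (λr = 1); after a few multiples of σ it is essentially purely supervised.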

Page 30: Proposed approach

Task combination with evolving weights: optimization

J(D; {wsh, ws, wr}) = λs(t) · Js(D; {wsh, ws}) + λr(t) · Jr(D; {wsh, wr})

t: learning epoch; λs(t), λr(t) ∈ [0, 1]: importance weights, λs(t) + λr(t) = 1.

Algorithm 1: Training our model for one epoch
1: D is the shuffled training set; B a mini-batch.
2: for B in D do
3:   Make a gradient step toward Jr using B (update w′)
4:   Bs ⇐ labeled examples of B
5:   Make a gradient step toward Js using Bs (update w)
6: end for

[R. Caruana 97, J. Weston 08, R. Collobert 08, Z. Zhang 15]
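Algorithm 1 can be sketched as follows. The gradient steps themselves are abstracted into hypothetical closures `step_jr` and `step_js` (standing for the updates of w′ and w), since the slide only specifies the alternation, not the optimizer.

```python
import random

def train_one_epoch(D, step_jr, step_js, batch_size, rng):
    """One epoch of Algorithm 1. D is a list of (x, y) pairs, with y = None
    for unlabeled samples; step_jr / step_js apply one gradient step on
    Jr (updating w') resp. Js (updating w)."""
    D = list(D)
    rng.shuffle(D)                                    # 1: shuffled training set
    for i in range(0, len(D), batch_size):            # 2: for B in D
        B = D[i:i + batch_size]
        step_jr([x for x, _ in B])                    # 3: step toward Jr with all of B
        Bs = [(x, y) for x, y in B if y is not None]  # 4: labeled examples of B
        if Bs:
            step_js(Bs)                               # 5: step toward Js with Bs
```

Note that Jr sees every sample of the mini-batch, while Js only sees its labeled subset, which is exactly how the unlabeled data enters the training.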

Page 31: Results

Experimental protocol

Objective: compare training a DNN with different approaches:
- No pre-training (baseline)
- With pre-training (stairs schedule)
- Parallel transfer learning (proposed approach)

Studied evolving-weight schedules:
- Stairs (pre-training)
- Linear
- Linear until t1
- Exponential

[Figure: the four schedules; importance weights λr, λs (0 to 1) vs. train epochs t.]
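The other three schedules can be sketched in the same form as the exponential one, each returning (λs(t), λr(t)). The exact shapes are read off the schedule plot, so the break behavior at t1 is an assumption; the defaults t1 = 100 and 5000 total epochs come from the protocol slide.

```python
def stairs_schedule(t, t1=100):
    """Stairs: pure reconstruction until t1, then pure supervision.
    This reproduces classical sequential pre-training inside the parallel scheme."""
    lam_r = 1.0 if t < t1 else 0.0
    return 1.0 - lam_r, lam_r

def linear_schedule(t, t_max=5000):
    """Linear: λr decays linearly over the whole training (t_max epochs)."""
    lam_r = max(0.0, 1.0 - t / t_max)
    return 1.0 - lam_r, lam_r

def linear_until_t1_schedule(t, t1=100):
    """Linear until t1: λr reaches 0 at epoch t1 and stays there."""
    lam_r = max(0.0, 1.0 - t / t1)
    return 1.0 - lam_r, lam_r
```

The result labels stairs100, lin, lin100 and exp40 on the next slides correspond to these schedules with t1 = 100 and σ = 40.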

Page 32: Results

Experimental protocol

- Task: classification (MNIST)
- Number of hidden layers K: 1, 2, 3, 4
- Optimization:
  - Epochs: 5000
  - Batch size: 600
  - Options: no regularization, no adaptive learning rate
- Hyper-parameters of the evolving schedules: t1 = 100, σ = 40

Page 33: Results

Shallow networks (K = 1, l = 1E2)

[Figure: evaluation of the evolving-weight schedules with l = 100 labeled samples, K = 1. Classification error on the MNIST test set (%, roughly 28 to 32.5) vs. size of unlabeled data u (0 to 49,900). Curves: baseline, stairs100, lin100, lin, exp40.]

Page 34: Results

Shallow networks (K = 1, l = 1E3)

[Figure: evaluation of the evolving-weight schedules with l = 1000 labeled samples, K = 1. Classification error on the MNIST test set (%, roughly 11 to 14.5) vs. size of unlabeled data u (0 to 49,900). Curves: baseline, stairs100, lin100, lin, exp40.]

Page 35: Results

Deep networks: exponential schedule (l = 1E3)

[Figure: evaluation of the exp40 evolving-weight schedule with l = 1000 labeled samples. Classification error on the MNIST test set (%, roughly 10 to 13) vs. size of unlabeled data u (0 to 49,900). Curves: K = 2, K = 3, K = 4.]

Page 36: Conclusion and perspectives

Conclusion

- An alternative to the pre-training technique: parallel transfer learning with evolving weights
- Improves generalization easily
- Reduces the number of hyper-parameters (only t1 or σ)

Page 37: Conclusion and perspectives

Perspectives

- Evolve the importance weights according to the train/validation error
- Explore other evolving schedules (toward an automatic schedule)
- Optimization
- Extension to structured-output problems:

  Train cost = supervised task + input unsupervised task + output unsupervised task

Page 38: Conclusion and perspectives

Questions

Thank you for your attention.

Questions?