In Keras you often want to freeze the weights of part of a network and train only the remaining layers. This is a memo of what I found you need to be careful about when doing that.
Consider the following Model.

Suppose you sometimes want the weights of the NormalContainer part to be updated and sometimes want to keep them fixed.
Intuitively, setting the Container's trainable property to False looks like the right approach, so let's check whether it actually works as intended.
# coding: utf8
import numpy as np
from keras.engine.topology import Input, Container
from keras.engine.training import Model
from keras.layers.core import Dense
from keras.utils.vis_utils import plot_model
def all_weights(m):
    return [list(w.reshape((-1))) for w in m.get_weights()]
def random_fit(m):
    x1 = np.random.random(10).reshape((5, 2))
    y1 = np.random.random(5).reshape((5, 1))
    m.fit(x1, y1, verbose=False)
np.random.seed(100)
x = in_x = Input((2, ))
# Create 2 Containers sharing the same weights
x = Dense(1)(x)
x = Dense(1)(x)
fc_all = Container(in_x, x, name="NormalContainer")
fc_all_not_trainable = Container(in_x, x, name="FixedContainer")
# Create 2 Models using the Containers
x = fc_all(in_x)
x = Dense(1)(x)
model_normal = Model(in_x, x)
x = fc_all_not_trainable(in_x)
x = Dense(1)(x)
model_fixed = Model(in_x, x)
# Set one Container trainable=False
fc_all_not_trainable.trainable = False # Case1
# Compile
model_normal.compile(optimizer="sgd", loss="mse")
model_fixed.compile(optimizer="sgd", loss="mse")
# fc_all_not_trainable.trainable = False # Case2
# Watch which weights are updated by model.fit
print("Initial Weights")
print("Model-Normal: %s" % all_weights(model_normal))
print("Model-Fixed : %s" % all_weights(model_fixed))
random_fit(model_normal)
print("after training Model-Normal")
print("Model-Normal: %s" % all_weights(model_normal))
print("Model-Fixed : %s" % all_weights(model_fixed))
random_fit(model_fixed)
print("after training Model-Fixed")
print("Model-Normal: %s" % all_weights(model_normal))
print("Model-Fixed : %s" % all_weights(model_fixed))
# plot_model(model_normal, "model_normal.png", show_shapes=True)
Two Containers, fc_all and fc_all_not_trainable, are created from the same pair of Dense layers; the latter is the one that will get trainable = False.
Two Models, model_normal and model_fixed, are then built on top of them.
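As a quick sanity check, you can confirm that the two Containers really do share weights. This is a minimal sketch that reuses fc_all and fc_all_not_trainable (and the numpy import) from the code above:

```python
# The two Containers were built on the same Dense layers, so they report
# identical weight values (they wrap the very same variables).
for w1, w2 in zip(fc_all.get_weights(), fc_all_not_trainable.get_weights()):
    assert np.array_equal(w1, w2)
print("fc_all and fc_all_not_trainable share their weights")
```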
The expected behavior is: when model_normal is fit(), both the Container's weights and the other weights change; when model_fixed is fit(), the Container's weights do not change while the other weights do. In table form:
| | Container Weights | Other Weights |
|---|---|---|
| model_normal#fit() | Change | Change |
| model_fixed#fit() | Do not change | Change |
Initial Weights
Model-Normal: [[1.2912766, -0.53409958], [0.0], [-0.1305927], [0.0], [-0.21052945], [0.0]]
Model-Fixed : [[1.2912766, -0.53409958], [0.0], [-0.1305927], [0.0], [0.37929809], [0.0]]
after training Model-Normal
Model-Normal: [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [-0.21060525], [0.0058233831]]
Model-Fixed : [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [0.37929809], [0.0]]
after training Model-Fixed
Model-Normal: [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [-0.21060525], [0.0058233831]]
Model-Fixed : [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [0.37869808], [0.0091063408]]
As expected: after training Model-Normal, everything except the `[0.37929809]` and `[0.0]` weights of Model-Fixed changed, and after training Model-Fixed, conversely, only the `[0.37929809]` and `[0.0]` weights of Model-Fixed changed.
So what happens if trainable = False is set only after Model#compile(), i.e. at the Case 2 position in the code above?
Initial Weights
Model-Normal: [[1.2912766, -0.53409958], [0.0], [-0.1305927], [0.0], [-0.21052945], [0.0]]
Model-Fixed : [[1.2912766, -0.53409958], [0.0], [-0.1305927], [0.0], [0.37929809], [0.0]]
after training Model-Normal
Model-Normal: [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [-0.21060525], [0.0058233831]]
Model-Fixed : [[1.2913349, -0.53398848], [0.00016010582], [-0.13071491], [-0.0012259937], [0.37929809], [0.0]]
after training Model-Fixed
Model-Normal: [[1.2910744, -0.53420025], [-0.0002913858], [-0.12900624], [0.0022280237], [-0.21060525], [0.0058233831]]
Model-Fixed : [[1.2910744, -0.53420025], [-0.0002913858], [-0.12900624], [0.0022280237], [0.37869808], [0.0091063408]]
The output is the same up to `after training Model-Normal`, but at `after training Model-Fixed` the Container's weights now change as well.
Model#compile() collects trainable_weights from all the layers the Model contains at the time it is called.
Therefore, setting trainable only after compile() has no effect on that compiled model.
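In short, the order matters: set trainable first, then compile; to change it later, compile again. A minimal sketch, reusing fc_all_not_trainable and model_fixed from the code above (the re-compile step is my reading of the behavior described here, not something the experiment above tests directly):

```python
# Freezing works only if trainable is set BEFORE compile():
fc_all_not_trainable.trainable = False
model_fixed.compile(optimizer="sgd", loss="mse")  # Container weights now excluded from training

# Changing trainable AFTER compile() has no effect on the already-compiled model...
fc_all_not_trainable.trainable = True
# ...until you compile again, which re-collects trainable_weights:
model_fixed.compile(optimizer="sgd", loss="mse")  # Container weights are trainable again
```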
Another point is that **you do not need to set trainable on every individual layer inside the Container**. From the Model's point of view, the Container is a single layer. The Model calls Container#trainable_weights, and that property returns nothing when Container#trainable is False (see keras/engine/topology.py#L1891), so none of the weights of the layers inside the Container get updated. It is not entirely clear whether this is a documented spec or just the current implementation, but I think it is probably intentional.
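Here is a minimal sketch of what that looks like from the Python side, again reusing fc_all_not_trainable from above (the expected values in the comments are assumptions based on this reading of topology.py):

```python
fc_all_not_trainable.trainable = False

# The Container exposes no trainable weights to the enclosing Model...
print(len(fc_all_not_trainable.trainable_weights))      # expected: 0
# ...even though the Dense layers inside it were never touched and keep trainable=True.
print([layer.trainable for layer in fc_all_not_trainable.layers
       if isinstance(layer, Dense)])                    # expected: [True, True]
# The weights still exist; they are simply reported as non-trainable.
print(len(fc_all_not_trainable.non_trainable_weights))  # expected: 4 (2 kernels + 2 biases)
```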
That clears up the small thing that had been bothering me.