I am using the VGG16 architecture in Keras, which I adapted and retrained for my needs as follows:
vgg16_model = keras.applications.vgg16.VGG16()
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()
for layer in model.layers:
    layer.trainable = False
model.add(Dense(3, activation='softmax'))
model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])
Next I train the model and then save the whole model in the way recommended by the Keras documentation:
from keras.models import load_model
model.save('my_model_vgg16.h5')  # creates a HDF5 file
and load it again with:
model = load_model('my_model_vgg16.h5')
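Using the trained model inside the Jupyter Notebook works like a charm. However, when I try to load the saved model after restarting the kernel, I get the following error: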
ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
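I cannot figure out why this error occurs, since I changed neither the input nor the output size of the model or its layers between saving and loading.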
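For testing purposes I tried a simpler Sequential model that I built from scratch and ran through the same pipeline (i.e. the same save and load procedure), and it produced no error. So I wonder whether I am missing something that applies specifically to saving and loading a pretrained model.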
For reference, the full console error log is shown below:
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
    685         graph_def_version, node_def_str, input_shapes, input_tensors,
--> 686         input_tensors_as_shapes, status)
    687   except errors.InvalidArgumentError as err:

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472           compat.as_text(c_api.TF_Message(self.status.status)),
--> 473           c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
in ()
      1 from keras.models import load_model
----> 2 loaded_model = load_model('my_model_vgg16.h5')
      3 print("Loaded Model from disk")
      4
      5 #compile and evaluate loaded model

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\models.py in load_model(filepath, custom_objects, compile)
    244
    245     # set weights
--> 246     topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
    247
    248     # Early return if compilation is not required.

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\engine\topology.py in load_weights_from_hdf5_group(f, layers)
   3164                          ' elements.')
   3165     weight_value_tuples += zip(symbolic_weights, weight_values)
-> 3166     K.batch_set_value(weight_value_tuples)
   3167
   3168

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\backend\tensorflow_backend.py in batch_set_value(tuples)
   2363         assign_placeholder = tf.placeholder(tf_dtype,
   2364                                             shape=value.shape)
-> 2365         assign_op = x.assign(assign_placeholder)
   2366         x._assign_placeholder = assign_placeholder
   2367         x._assign_op = assign_op

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\variables.py in assign(self, value, use_locking)
    571       the assignment has completed.
    572     """
--> 573     return state_ops.assign(self._variable, value, use_locking=use_locking)
    574
    575   def assign_add(self, delta, use_locking=False):

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\state_ops.py in assign(ref, value, validate_shape, use_locking, name)
    274     return gen_state_ops.assign(
    275         ref, value, use_locking=use_locking, name=name,
--> 276         validate_shape=validate_shape)
    277   return ref.assign(value)

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\gen_state_ops.py in assign(ref, value, validate_shape, use_locking, name)
     54     _, _, _op = _op_def_lib._apply_op_helper(
     55         "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 56         use_locking=use_locking, name=name)
     57     _result = _op.outputs[:]
     58     _inputs_flat = _op.inputs

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
    785         op = g.create_op(op_type_name, inputs, output_types, name=scope,
    786                          input_types=input_types, attrs=attr_protos,
--> 787                          op_def=op_def)
    788       return output_structure, op_def.is_stateful, op
    789

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
   2956         op_def=op_def)
   2957     if compute_shapes:
-> 2958       set_shapes_for_outputs(ret)
   2959     self._add_op(ret)
   2960     self._record_op_seen_by_control_dependencies(ret)

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in set_shapes_for_outputs(op)
   2207       shape_func = _call_cpp_shape_fn_and_require_op
   2208
-> 2209   shapes = shape_func(op)
   2210   if shapes is None:
   2211     raise RuntimeError(

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in call_with_requiring(op)
   2157
   2158   def call_with_requiring(op):
-> 2159     return call_cpp_shape_fn(op, require_shape_fn=True)
   2160
   2161   _call_cpp_shape_fn_and_require_op = call_with_requiring

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in call_cpp_shape_fn(op, require_shape_fn)
    625   res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
    626                                 input_tensors_as_shapes_needed,
--> 627                                 require_shape_fn)
    628   if not isinstance(res, dict):
    629     # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).

~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
    689       missing_shape_fn = True
    690     else:
--> 691       raise ValueError(err.message)
    692
    693   if missing_shape_fn:

ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
Solution:
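The problem is the line model.layers.pop(). When you pop a layer directly from the list model.layers, the topology of the model is not updated accordingly. The model definition is therefore wrong, and every operation that follows is built on that wrong definition.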
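Specifically, when you add a layer with model.add(layer), the list model.outputs is updated to hold that layer's output tensor. You can find the following lines in the source code of Sequential.add():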
output_tensor = layer(self.outputs[0])
# ... skipping irrelevant lines
self.outputs = [output_tensor]
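However, when you call model.layers.pop(), model.outputs is not updated accordingly. As a result, the next layer you add is called on the wrong input tensor, because self.outputs[0] is still the output tensor of the removed layer.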
The following lines demonstrate the problem:
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()
model.add(Dense(3, activation='softmax'))

print(model.layers[-1].input)
# => Tensor("predictions_1/Softmax:0", shape=(?, 1000), dtype=float32)
# the new layer is called on a wrong input tensor

print(model.layers[-1].kernel)
# => a variable whose shape is (1000, 3), not (4096, 3)
# the kernel shape is also wrong
This incorrect kernel shape is why you see the error about the incompatible shapes [4096,3] and [1000,3].
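The stale-output behaviour can also be reproduced without Keras. What follows is only a minimal sketch in plain Python: ToySequential and ToyDense are hypothetical stand-ins, not Keras classes, a "tensor" is reduced to its width, and only the two lines quoted from Sequential.add() are modelled. The name dense_1 is likewise just an illustrative label.

```python
class ToyDense:
    """Hypothetical stand-in for a Dense layer; it tracks shapes only."""

    def __init__(self, units, name):
        self.units = units
        self.name = name
        self.kernel_shape = None  # filled in when the layer is first called

    def __call__(self, input_width):
        # A dense kernel has shape (input_width, units).
        self.kernel_shape = (input_width, self.units)
        return self.units  # the "output tensor" is represented by its width


class ToySequential:
    """Hypothetical model that mirrors the bookkeeping done in add()."""

    def __init__(self, input_width):
        self.layers = []
        self.outputs = [input_width]

    def add(self, layer):
        output_tensor = layer(self.outputs[0])
        self.layers.append(layer)
        self.outputs = [output_tensor]


# 25088 = 7 * 7 * 512 is the width of VGG16's flattened feature map.
model = ToySequential(input_width=25088)
model.add(ToyDense(4096, "fc1"))
model.add(ToyDense(4096, "fc2"))
model.add(ToyDense(1000, "predictions"))

model.layers.pop()              # removes the layer from the list only
stale_width = model.outputs[0]  # outputs still refer to 'predictions'

new_layer = ToyDense(3, "dense_1")
model.add(new_layer)

print([layer.name for layer in model.layers])  # ['fc1', 'fc2', 'dense_1']
print(stale_width)                             # 1000
print(new_layer.kernel_shape)                  # (1000, 3)
```

The layer list looks correct after the pop, which is what makes the bug hard to spot: only the kernel shape of the newly added layer reveals that it was wired to the removed layer's output.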
To fix the problem, simply do not add the last layer to the Sequential model in the first place:
model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)
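Why the model works in the running session but fails after a reload can be sketched in plain Python as well. This is a hedged illustration, not Keras code. It assumes, based on the traceback rather than on anything stated above, that load_model rebuilds the model by chaining the layers found in model.layers and then assigns the saved weights to the rebuilt variables. All function names, the tuple representation of a layer, and the name dense_1 are hypothetical.

```python
# (name, units) pairs for the dense tail of VGG16; shapes only, no real weights.
VGG_TAIL = [("fc1", 4096), ("fc2", 4096), ("predictions", 1000)]
FLATTEN_WIDTH = 25088  # 7 * 7 * 512, the width of VGG16's flattened feature map


def chain_kernel_shapes(layer_specs, input_width):
    """Kernel shapes when each layer is called on the previous layer's output."""
    shapes = {}
    width = input_width
    for name, units in layer_specs:
        shapes[name] = (width, units)
        width = units
    return shapes


def buggy_model():
    """Add every layer, pop the last one from the list, then add Dense(3)."""
    layers = list(VGG_TAIL)
    stale_output_width = layers[-1][1]  # outputs still point at 'predictions'
    layers.pop()
    layers.append(("dense_1", 3))
    saved = chain_kernel_shapes(layers[:-1], FLATTEN_WIDTH)
    saved["dense_1"] = (stale_output_width, 3)  # built on the stale tensor
    return layers, saved


def fixed_model():
    """Skip the last layer up front, then add Dense(3)."""
    layers = list(VGG_TAIL[:-1]) + [("dense_1", 3)]
    return layers, chain_kernel_shapes(layers, FLATTEN_WIDTH)


def reload_mismatches(layers, saved_shapes):
    """Compare shapes rebuilt from the layer list against the saved shapes."""
    rebuilt = chain_kernel_shapes(layers, FLATTEN_WIDTH)
    return {
        name: (rebuilt[name], saved_shapes[name])
        for name, _ in layers
        if rebuilt[name] != saved_shapes[name]
    }


print(reload_mismatches(*buggy_model()))  # {'dense_1': ((4096, 3), (1000, 3))}
print(reload_mismatches(*fixed_model()))  # {}
```

Under that assumption the buggy model saves a (1000, 3) kernel while the rebuilt model expects (4096, 3), which matches the two shapes in the error message. With vgg16_model.layers[:-1] the saved and rebuilt shapes agree.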