ValueError: Incompatible shapes between op input and calculated input gradient. #4
Comments
Thanks for noting this. We just re-ran the script and didn't see the issue. We use TensorFlow 1.4.1 though. We'll update to 1.7 soon and let you know if we have the problem. |
Thanks for the response. I suspect it is related to this open bug, which affects TensorFlow 1.5.0 onwards: tensorflow/tensorflow#15214 |
I'm on 1.7 and I've been able to reproduce this error. |
@hyhieu @melodyguan sorry to bug you but would it be possible to give this a try? I'm unable to run the cifar10 model search due to this issue. |
Hi @ahundt. We tried running the scripts on TF 1.4 and didn't have this problem. We have heard of this version discrepancy issue. We'll fix it, but it will take a while. |
Yeah, I had read the rest of this issue; I was just confirming that the problem remains on 1.7. I'm able to start micro search without issue (I haven't completed a full run yet), even though it calls the same function that reports the gradient input shape error, reproduced below. Here is the macro search error I get:
|
I guess that this problem is caused by the tensor |
@AbelardLiu Awesome! That gave me the info I needed to create a proper fix which should work in all cases, see #29 |
@ahundt Great! I'll merge this commit and test. |
I'm running on Python 2 |
By the way, I'm using Python3.6.5 with TensorFlow 1.7.0, and the fix by @ahundt does indeed fix this issue. |
The same bug occurs with TensorFlow 1.8 and CUDA 9.1. Build train graph |
@AbelardLiu Thanks a lot, the fix works for me. But why does out.set_shape not work? I'm confused. |
The code below works on TF 1.13. "tensor.set_shape" only sets the static shape. In most situations, we should use "tf.reshape" to set the dynamic shape. |
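The snippet that comment refers to is not preserved in this thread. As a separate illustration of the static versus dynamic distinction, here is a minimal sketch, assuming TensorFlow 1.13 or later (where tf.compat.v1 is available) and graph mode. The placeholder names and the target shape [-1, 36, 8, 8] (borrowed from the log in this issue) are illustrative assumptions; this is not the fix from #29.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow >= 1.13

with tf.Graph().as_default():
    # A flat input whose length is only known at run time.
    flat = tf1.placeholder(tf.float32, shape=[None])

    # tf.reshape adds an op to the graph: the data is actually rearranged at
    # run time, and the result also carries the new static shape.
    reshaped = tf.reshape(flat, [-1, 36, 8, 8])
    static_shape = reshaped.shape.as_list()

    # set_shape adds no op. It only refines the static shape recorded on an
    # existing tensor, and raises ValueError if the new shape is incompatible
    # with what is already known.
    unknown = tf1.placeholder(tf.float32, shape=None)
    unknown.set_shape([None, 36, 8, 8])
    annotated_shape = unknown.shape.as_list()

    with tf1.Session() as sess:
        value = sess.run(reshaped, feed_dict={flat: [0.0] * (2 * 36 * 8 * 8)})
    runtime_shape = tuple(value.shape)

print(static_shape, annotated_shape, runtime_shape)
```

Both tensors end up with the same static shape, but only the reshaped one has had its run-time layout changed by an op in the graph.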
@ahundt I'm using Python3.6 with TensorFlow 1.5.1 and my issue is |
The problem has been solved. The issue was with the strides |
While running the "cifar10_macro_search.sh" script, I get the following error. Is it related to the TensorFlow version? I am using 1.6.0.
Build train graph
Tensor("child/layer_0/case/cond/Merge:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_1/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_2/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_3/pool_at_3/from_4/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_4/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_5/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_6/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_7/pool_at_7/from_8/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_8/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_9/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_10/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_11/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Model has 697860 params
Traceback (most recent call last):
File "src/cifar10/main.py", line 359, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "src/cifar10/main.py", line 355, in main
train()
File "src/cifar10/main.py", line 223, in train
ops = get_ops(images, labels)
File "src/cifar10/main.py", line 171, in get_ops
child_model.connect_controller(controller_model)
File "/home/nikhil/google_enas/src/cifar10/general_child.py", line 705, in connect_controller
self._build_train()
File "/home/nikhil/google_enas/src/cifar10/general_child.py", line 633, in _build_train
num_replicas=self.num_replicas)
File "/home/nikhil/google_enas/src/utils.py", line 125, in get_train_ops
grads = tf.gradients(loss, tf_variables)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py", line 641, in gradients
(op.name, i, t_in.shape, in_grad.shape))
ValueError: Incompatible shapes between op input and calculated input gradient. Forward operation: child/layer_11/case/cond/cond/cond/cond/cond/cond/Merge. Input index: 0. Original input shape: (). Calculated input gradient shape: (?, 36, 8, 8)
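The final message reports two static shapes that cannot be reconciled: the op input is recorded as a scalar, (), while the gradient has rank 4, (?, 36, 8, 8). As a rough illustration of the compatibility rule TensorFlow documents for static shapes (an unknown rank matches anything; otherwise the ranks must agree and each dimension must be equal or unknown), here is a plain-Python sketch. The function name shapes_compatible is hypothetical, and this is a simplified model, not the code in gradients_impl.py.

```python
def shapes_compatible(a, b):
    """Return True if two static shapes could describe the same tensor.

    A shape is None (unknown rank) or a tuple whose entries are ints or
    None (unknown dimension, printed as "?" in the log above).
    """
    if a is None or b is None:
        return True  # unknown rank is compatible with any shape
    if len(a) != len(b):
        return False  # known ranks must match
    # Each pair of dimensions must be equal, or at least one must be unknown.
    return all(x is None or y is None or x == y for x, y in zip(a, b))


original_input_shape = ()               # "Original input shape: ()"
gradient_shape = (None, 36, 8, 8)       # "(?, 36, 8, 8)"

mismatch = shapes_compatible(original_input_shape, gradient_shape)
same_rank = shapes_compatible((None, 36, 8, 8), gradient_shape)
unknown_rank = shapes_compatible(None, gradient_shape)
print(mismatch, same_rank, unknown_rank)
```

Under this rule a rank-0 shape can never match a rank-4 gradient, whereas an input with unknown rank, or with the shape (?, 36, 8, 8), would have passed.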