ValueError: Incompatible shapes between op input and calculated input gradient. #4

nikdnaik · 2018-04-02T19:07:43Z

While running the "cifar10_macro_search.sh" script, I get the following error. Is it related to the TensorFlow version? I am using 1.6.0.

Build train graph
Tensor("child/layer_0/case/cond/Merge:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_1/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_2/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_3/pool_at_3/from_4/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_4/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_5/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_6/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_7/pool_at_7/from_8/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_8/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_9/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_10/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_11/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Model has 697860 params
Traceback (most recent call last):
  File "src/cifar10/main.py", line 359, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "src/cifar10/main.py", line 355, in main
    train()
  File "src/cifar10/main.py", line 223, in train
    ops = get_ops(images, labels)
  File "src/cifar10/main.py", line 171, in get_ops
    child_model.connect_controller(controller_model)
  File "/home/nikhil/google_enas/src/cifar10/general_child.py", line 705, in connect_controller
    self._build_train()
  File "/home/nikhil/google_enas/src/cifar10/general_child.py", line 633, in _build_train
    num_replicas=self.num_replicas)
  File "/home/nikhil/google_enas/src/utils.py", line 125, in get_train_ops
    grads = tf.gradients(loss, tf_variables)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py", line 641, in gradients
    (op.name, i, t_in.shape, in_grad.shape))
ValueError: Incompatible shapes between op input and calculated input gradient. Forward operation: child/layer_11/case/cond/cond/cond/cond/cond/cond/Merge. Input index: 0. Original input shape: (). Calculated input gradient shape: (?, 36, 8, 8)

hyhieu · 2018-04-02T19:20:37Z

Thanks for noting this.

We just re-ran the script and didn't see the issue. We use TensorFlow 1.4.1 though.

We'll update to 1.7 soon and let you know if we have the problem.

nikdnaik · 2018-04-02T19:24:50Z

Thanks for the response. I suspect it is related to this open bug, which affects TensorFlow 1.5.0 onwards: tensorflow/tensorflow#15214

ahundt · 2018-04-10T20:01:09Z

I'm on 1.7 and I've been able to reproduce this error.

ahundt · 2018-04-16T16:04:13Z

@hyhieu @melodyguan sorry to bug you but would it be possible to give this a try? I'm unable to run the cifar10 model search due to this issue.

hyhieu · 2018-04-16T16:07:15Z

Hi @ahundt. We tried running the scripts on TF 1.4 and didn't have this problem. We have heard of this version discrepancy issue. We'll fix it, but it will take a while.

ahundt · 2018-04-18T17:55:30Z

Yeah, I had read the rest of this issue; I was just confirming that the problem remains on 1.7. I'm able to start micro search without issue (I haven't completed a full run yet), even though it calls the same function that reports the gradient input shape error reproduced below.

Here is the macro search error I get:
error.txt


Model has 697860 params
Traceback (most recent call last):
  File "enas/cifar10/main.py", line 359, in <module>
    tf.app.run()
  File "/home/ahundt/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "enas/cifar10/main.py", line 355, in main
    train()
  File "enas/cifar10/main.py", line 223, in train
    ops = get_ops(images, labels)
  File "enas/cifar10/main.py", line 171, in get_ops
    child_model.connect_controller(controller_model)
  File "/home/ahundt/src/enas/enas/cifar10/general_child.py", line 705, in connect_controller
    self._build_train()
  File "/home/ahundt/src/enas/enas/cifar10/general_child.py", line 633, in _build_train
    num_replicas=self.num_replicas)
  File "/home/ahundt/src/enas/enas/utils.py", line 127, in get_train_ops
    grads = tf.gradients(loss, tf_variables)
  File "/home/ahundt/.local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 488, in gradients
    gate_gradients, aggregation_method, stop_gradients)
  File "/home/ahundt/.local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 655, in _GradientsHelper
    (op.name, i, t_in.shape, in_grad.shape))
ValueError: Incompatible shapes between op input and calculated input gradient.  Forward operation: child/layer_11/case/cond/cond/cond/cond/cond/cond/Merge.  Input index: 0. Original input shape: ().  Calculated input gradient shape: (?, 36, 8, 8)

AbelardLiu · 2018-04-21T06:54:48Z

I guess this problem is caused by the tensor "child/layer_11/case/cond/cond/cond/cond/cond/cond/Constant", which has shape () and comes from the following code:
out = tf.case(branches, default=lambda: tf.constant(0, tf.float32), exclusive=True)
After I changed this line to the following, ENAS runs:
out = tf.case(branches, default=lambda: tf.constant(0, tf.float32, shape=[self.batch_size, out_filters, inp_h, inp_w]), exclusive=True)
But this change only supports the "NCHW" data format.
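
A minimal sketch of how the hard-coded NCHW shape could be generalized, assuming the model knows its data format as a string. The helper `default_branch_shape` and its parameter names are hypothetical; they are not taken from the ENAS code or from the fix in #29. The returned list would be passed as `shape=` to `tf.constant` in the default branch. The batch size of 32 in the example calls is an arbitrary illustrative value.

```python
def default_branch_shape(data_format, batch_size, out_filters, inp_h, inp_w):
    """Return the shape of the zero tensor used as the tf.case default.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if data_format == "NCHW":
        # Channels-first layout: [batch, channels, height, width].
        return [batch_size, out_filters, inp_h, inp_w]
    if data_format == "NHWC":
        # Channels-last layout: [batch, height, width, channels].
        return [batch_size, inp_h, inp_w, out_filters]
    raise ValueError("Unknown data format: {!r}".format(data_format))


# Shapes for the last layers in the log above (36 filters, 8x8 feature maps).
print(default_branch_shape("NCHW", 32, 36, 8, 8))
print(default_branch_shape("NHWC", 32, 36, 8, 8))
```

Keeping the layout decision in one place means the default branch always has the same rank as the other branches, whichever data format is selected.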


ahundt · 2018-04-22T23:09:36Z

@AbelardLiu Awesome! That gave me the info I needed to create a proper fix which should work in all cases, see #29

AbelardLiu · 2018-04-23T01:27:54Z

@ahundt Great! I'll merge this commit and test it.
Btw, do you run ENAS with Python 2 or Python 3?
Thanks!

ahundt · 2018-04-25T00:36:56Z

I run on Python 2.

harewei · 2018-05-16T06:29:39Z

By the way, I'm using Python3.6.5 with TensorFlow 1.7.0, and the fix by @ahundt does indeed fix this issue.

shiyongde · 2018-08-10T12:29:56Z

The same bug occurs with TensorFlow 1.8 and CUDA 9.1.

Build train graph
Tensor("child/layer_0/case/cond/Merge:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_1/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_2/skip/bn/Identity:0", shape=(?, 36, 32, 32), dtype=float32)
Tensor("child/layer_3/pool_at_3/from_4/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_4/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_5/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_6/skip/bn/Identity:0", shape=(?, 36, 16, 16), dtype=float32)
Tensor("child/layer_7/pool_at_7/from_8/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_8/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_9/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_10/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Tensor("child/layer_11/skip/bn/Identity:0", shape=(?, 36, 8, 8), dtype=float32)
Model has 697860 params
Traceback (most recent call last):
  File "src/cifar10/main.py", line 359, in <module>
    tf.app.run()
  File "/data5/xiawei/program/python2/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "src/cifar10/main.py", line 355, in main
    train()
  File "src/cifar10/main.py", line 223, in train
    ops = get_ops(images, labels)
  File "src/cifar10/main.py", line 171, in get_ops
    child_model.connect_controller(controller_model)
  File "/data3/xiawei/work/enas/enas/src/cifar10/general_child.py", line 705, in connect_controller
    self._build_train()
  File "/data3/xiawei/work/enas/enas/src/cifar10/general_child.py", line 633, in _build_train
    num_replicas=self.num_replicas)
  File "/data3/xiawei/work/enas/enas/src/utils.py", line 125, in get_train_ops
    grads = tf.gradients(loss, tf_variables)
  File "/data5/xiawei/program/python2/local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 494, in gradients
    gate_gradients, aggregation_method, stop_gradients)
  File "/data5/xiawei/program/python2/local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 669, in _GradientsHelper
    (op.name, i, t_in.shape, in_grad.shape))
ValueError: Incompatible shapes between op input and calculated input gradient. Forward operation: child/layer_11/case/cond/cond/cond/cond/cond/cond/Merge. Input index: 0. Original input shape: (). Calculated input gradient shape: (?, 36, 8, 8)

upwindflys · 2019-03-05T07:09:02Z

@AbelardLiu Thanks a lot, the fix works for me. But why does out.set_shape not work? I'm confused.

pingguokiller · 2019-11-11T07:51:02Z

> @AbelardLiu Thanks a lot, the fix works for me. But why does out.set_shape not work? I'm confused.

Using the code below works on TF 1.13:
out = tf.reshape(out, (-1, inp_h, inp_w, out_filters))

"tensor.set_shape" only sets the static shape. In most situations, we should use "tf.reshape" to set the dynamic shape.
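
To make the static-shape point concrete, here is a simplified pure-Python model of a static-shape compatibility rule, applied to the two shapes from the error message. It is an illustration and not TensorFlow's code: the function `is_compatible` and the tuple encoding (with `None` for an unknown dimension or rank) are assumptions made for this sketch. Calling set_shape on `out` only changes the annotation on `out` itself; the scalar constant feeding the inner Merge op still has shape (), which may be why set_shape alone did not help.

```python
def is_compatible(shape_a, shape_b):
    """Simplified model of static-shape compatibility.

    A shape is None (rank unknown) or a tuple whose entries are ints or
    None (dimension unknown). Illustration only, not TensorFlow code.
    """
    if shape_a is None or shape_b is None:
        return True  # an unknown rank is compatible with anything
    if len(shape_a) != len(shape_b):
        return False  # known ranks must match
    # Known dimensions must be equal; unknown dimensions match anything.
    return all(a is None or b is None or a == b
               for a, b in zip(shape_a, shape_b))


scalar_default = ()                # "Original input shape: ()"
gradient_shape = (None, 36, 8, 8)  # "Calculated input gradient shape: (?, 36, 8, 8)"

# The scalar default has rank 0, the gradient has rank 4: incompatible.
print(is_compatible(scalar_default, gradient_shape))
# A default branch with the full 4-D shape passes the check.
print(is_compatible((32, 36, 8, 8), gradient_shape))
# A fully unknown static shape is compatible with any gradient shape.
print(is_compatible(None, gradient_shape))
```

Under this model, the first call reports a mismatch and the other two do not, which matches the observation that giving the default branch a full 4-D shape makes the error go away.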

maryam089 · 2020-09-15T15:54:12Z

@ahundt I'm using Python 3.6 with TensorFlow 1.5.1, and my issue is:
[Report Error]ValueError: Incompatible shapes between op input and calculated input gradient. conv2d_transpose
Any idea how to solve it?

maryam089 · 2020-09-16T02:38:44Z

The problem has been solved. The issue was with the strides.
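
The comment does not say what exactly was wrong with the strides. As a general illustration only: the output size requested from a transposed convolution has to agree with its strides and padding, because a forward convolution over that output must reproduce the input size. The sketch below checks that condition using the usual SAME/VALID output-size rules. The function names and the example sizes are hypothetical and are not taken from the reporter's code.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a strided convolution (SAME/VALID rules)."""
    if padding == "SAME":
        return int(math.ceil(input_size / float(stride)))
    if padding == "VALID":
        return int(math.ceil((input_size - kernel_size + 1) / float(stride)))
    raise ValueError("Unknown padding: {!r}".format(padding))


def transpose_shape_is_consistent(in_size, out_size, kernel_size, stride, padding):
    """True if a forward conv over `out_size` would give back `in_size`."""
    return conv_output_size(out_size, kernel_size, stride, padding) == in_size


# Upsampling 8 -> 16 with stride 2 and SAME padding is consistent.
print(transpose_shape_is_consistent(8, 16, 3, 2, "SAME"))
# Requesting the same output size with stride 1 is not.
print(transpose_shape_is_consistent(8, 16, 3, 1, "SAME"))
```

Running a check like this on each spatial dimension before building the graph can catch a stride and output-shape mismatch early, rather than at gradient construction time.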

laksheenmendis · 2021-04-23T00:33:34Z

I also hit the same issue with Python 3.7 and TensorFlow 1.15.2, and the fix by @ahundt (#29) resolved it. I believe it should be merged into the master branch for future users (I spent a lot of time on this issue).


This was referenced Apr 10, 2018:
src->enas #17 (Open)
tf.profiler overrides shape_invariants in while_loop tensorflow/tensorflow#15214 (Closed)

ahundt added a commit to ahundt/enas that referenced this issue Apr 22, 2018:
cifar10/general_child.py shape fix for tf1.5 and higher resolves #4 (ad9ec6c)

ahundt linked a pull request Apr 22, 2018 that will close this issue:
cifar10/general_child.py shape fix for tf1.5 and higher resolves #4 #29 (Open)

funasoul added a commit to funasoul/enas that referenced this issue Nov 4, 2022:
cifar10/general_child.py shape fix for tf1.5 and higher resolves melodyguan#4 melodyguan#29 taken from melodyguan@ad9ec6c (840fabc)
