
Why clamp inputs to all tanh calls? #1

Open
dhruvdcoder opened this issue Apr 24, 2019 · 1 comment

@dhruvdcoder

```python
return tf.tanh(tf.minimum(tf.maximum(x, -MAX_TANH_ARG), MAX_TANH_ARG))
```

We are trying to reimplement the layers proposed by the Hyperbolic Neural Networks paper. We use float64 instead of float32 for the entire model and its inputs, which should already avoid most numerical instability. However, if we do not clamp the inputs to the tanh calls to (-15, 15), the network does not seem to train at all. It would be great if you could explain the reason for this clamp and for the choice of 15.
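For context, a minimal runnable version of the clamp in question is sketched below, assuming TF 2.x eager mode (the repo is written in TF 1.x style, but the numerics are the same); `tf.clip_by_value` is equivalent to the nested minimum/maximum in the quoted line:

```python
import tensorflow as tf

MAX_TANH_ARG = 15.0  # the constant this issue asks about

def tanh_clamped(x):
    # tf.clip_by_value(x, lo, hi) == tf.minimum(tf.maximum(x, lo), hi)
    return tf.tanh(tf.clip_by_value(x, -MAX_TANH_ARG, MAX_TANH_ARG))

# Inputs far outside the clamp range all map to the same saturated output.
x = tf.constant([-30.0, -1.0, 0.0, 1.0, 30.0], dtype=tf.float64)
print(tanh_clamped(x).numpy())  # ~ [-1, -0.7616, 0, 0.7616, 1]
```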

PS: I really liked the paper and thank you for making the code available.


octavian-ganea commented Apr 24, 2019

I am not fully sure, but I believe the reason is that exp(15) is already very large and exp(-15) very small, so in this range small updates to the exponent can produce huge or tiny fluctuations, leading to numerical instability and overflow/underflow. Moreover, gradient flow might be very noisy (around exp(15)) or exactly 0 (around exp(-15)), which means the learning signal is very poor. Anyway, MAXINT32 ~= exp(21).
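To make the gradient argument concrete, here is a quick check (a sketch assuming TF 2.x eager mode): autodiff computes the derivative of tanh as 1 - tanh(x)^2, so the gradient becomes exactly 0 as soon as tanh(x) rounds to 1.0. In float32 that already happens well before x = 15, while float64 still carries a tiny signal there:

```python
import tensorflow as tf

for dtype in (tf.float32, tf.float64):
    x = tf.constant([1.0, 15.0, 25.0], dtype=dtype)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = tf.tanh(x)
    g = tape.gradient(y, x)  # elementwise 1 - tanh(x)**2
    # float32: tanh(15.0) rounds to 1.0, so its gradient is exactly 0.
    # float64: tanh(15.0) is still < 1.0, leaving a ~4e-13 gradient;
    #          by x = 25.0 it has saturated there as well.
    print(dtype.name, "tanh:", y.numpy(), "grad:", g.numpy())
```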

Using float64 would help, but it would roughly double the training time...

One way to mitigate this is to always gyro-translate your embeddings to have zero mean. Or increase the dimensionality (see https://arxiv.org/abs/1804.03329).
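For anyone finding this later, the gyro-translation suggestion might look like the sketch below. `mobius_add` follows the standard Möbius-addition formula from the Hyperbolic Neural Networks paper; `center_embeddings` is a hypothetical helper, and using the Euclidean mean of the points as the reference point is my own simplification, not necessarily what the authors have in mind:

```python
import tensorflow as tf

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+) y on the Poincare ball of curvature -c."""
    xy = tf.reduce_sum(x * y, axis=-1, keepdims=True)
    x2 = tf.reduce_sum(x * x, axis=-1, keepdims=True)
    y2 = tf.reduce_sum(y * y, axis=-1, keepdims=True)
    num = (1.0 + 2.0 * c * xy + c * y2) * x + (1.0 - c * x2) * y
    den = 1.0 + 2.0 * c * xy + c * c * x2 * y2
    return num / den

def center_embeddings(emb, c=1.0):
    # Gyro-translate every point p to (-m) (+) p, pulling the cloud
    # toward the origin, where tanh/atanh arguments stay small.
    # Using the Euclidean mean as the reference point m is an assumption.
    m = tf.reduce_mean(emb, axis=0, keepdims=True)
    return mobius_add(-m, emb, c=c)

emb = tf.random.uniform((128, 10), -0.05, 0.05, dtype=tf.float64)
print(tf.reduce_mean(center_embeddings(emb), axis=0))  # close to 0
```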
