maybe the code itself support training with text length > 26 #50

Open
ghost opened this issue Apr 24, 2020 · 6 comments

Comments

@ghost

ghost commented Apr 24, 2020

@Holmeyoung
In #17 you mentioned that your code only supports training with text length <= 26. I found that:
(1) When the images are resized to 100x32, the raw character output has length 26, so we cannot train with text length > 26.

(2) When keep_ratio = True, only the image height is resized to 32; the width is not fixed and varies from image to image. The length of the raw character output is therefore not fixed either and depends on the image width, so maybe we can train with any text length.

Conclusion: we can train with any text length when we set keep_ratio = True during training.
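
For reference, the relationship between image width and output length can be checked by hand. The sketch below assumes the network uses the standard CRNN convolutional stack (two 2x2 poolings with stride 2, two poolings with stride 1 and padding 1 along the width, and a final 2x2 convolution without padding); if the layers in this repository differ, the numbers change. `conv_out`, `crnn_T` and `keep_ratio_width` are hypothetical helper names, not functions from the repository.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a conv or max-pool layer along one axis (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1


# (kernel, stride, pad) along the WIDTH axis, only for layers that change the width.
# ASSUMPTION: standard CRNN layout. The 3x3 convs with padding 1 keep the width
# unchanged and are therefore omitted.
WIDTH_LAYERS = [
    (2, 2, 0),  # pooling after conv0
    (2, 2, 0),  # pooling after conv1
    (2, 1, 1),  # pooling after conv3: stride 1, pad 1 along the width
    (2, 1, 1),  # pooling after conv5: stride 1, pad 1 along the width
    (2, 1, 0),  # last conv: 2x2 kernel, no padding
]


def crnn_T(img_w):
    """Width of the CNN feature map, i.e. the number of time steps T."""
    w = img_w
    for kernel, stride, pad in WIDTH_LAYERS:
        w = conv_out(w, kernel, stride, pad)
    return w


def keep_ratio_width(orig_w, orig_h, img_h=32):
    """Width after scaling the height to img_h while keeping the aspect ratio."""
    return int(orig_w / orig_h * img_h)


print(crnn_T(100))                         # fixed 100x32 input
print(crnn_T(keep_ratio_width(500, 40)))   # wide image, aspect ratio kept
```

Under these assumptions T = W // 4 + 1, so a 100-pixel-wide input gives T = 26, and a 500x40 image scaled to height 32 (width 400) gives T = 101. A wider image gives a larger T, but T is still finite for any given image.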

Thank you so much.

@Holmeyoung
Owner

Hi, in fact it depends only on T, the sequence length of the LSTM output.

@ghost
Author

ghost commented Apr 25, 2020

Thanks for your reply.
If we don't change the network, does the LSTM output length T depend only on the width of the image?

@ghost
Author

ghost commented Apr 25, 2020

I found the answer in your reply in #17:

"You need to calculate it. After conv and pool what's the image width. The image width will be T length in rnn."

So the output width of CRNN().cnn is the T length, and the text length should not exceed T?
Is what I said here right? Thank you so much.

        self.cnn = cnn
        self.rnn = nn.Sequential(
            BidirectionalLSTM(512, nh, nh),
            BidirectionalLSTM(nh, nh, nclass))

@Holmeyoung
Owner

Yeah

@ghost
Author

ghost commented Apr 26, 2020

Got it, thank you so much.

@ghost ghost closed this as completed Apr 26, 2020
@ghost
Author

ghost commented May 21, 2020

@Holmeyoung
Hi, the output width of CRNN().cnn is T, and the text length should not exceed T.
My question is: if the text length is larger than T, will there be errors, or can we still train the model?

thank you so much.
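
A note on the question above, assuming training uses a CTC loss (as CRNN models normally do): CTC can only align a label to T time steps if T is at least the label length plus one extra step for every pair of adjacent identical characters, because a blank has to separate them. If the loss is `torch.nn.CTCLoss`, a sample that violates this yields an infinite loss rather than an exception, and the `zero_infinity=True` option zeroes such losses and their gradients; whether this repository uses that loss is an assumption here. The helpers below are hypothetical and only sketch the feasibility check, which could be used to filter a dataset before training.

```python
def min_ctc_timesteps(label):
    """Fewest time steps for which CTC can align `label`.

    Each character needs one step, and each pair of adjacent identical
    characters needs one extra step for the blank between them.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def fits(label, T):
    """True if `label` can be aligned to an output sequence of length T."""
    return min_ctc_timesteps(label) <= T


# "hello" has one repeated pair ("ll"), so it needs 5 + 1 = 6 time steps.
print(min_ctc_timesteps("hello"))
print(fits("hello", 26))      # fits into T = 26
print(fits("a" * 27, 26))     # 27 identical characters do not fit into T = 26
```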

@ghost ghost reopened this May 21, 2020