Stack Overflow Occurs When Calling arm_convolve_s8 Function #161

Open
Ijustakid opened this issue Jan 2, 2025 · 3 comments

@Ijustakid

I attempted to execute a convolutional neural network on a Cortex-M55, where the input size is 1x3x256x144 (NCHW), the kernel size is 12x3x3x3, padding is (1, 1), and stride is (1, 1). The output size is 1x12x256x144 (NCHW).

Compilation succeeds and the .bin file is generated.

However, during runtime, a stack overflow crash occurs.

When I reduce the input size to 1x1x16x16 while keeping the kernel size unchanged, the crash does not occur.

Could it be that the recursion depth of the arm_convolve_s8 function becomes too large for big feature maps, causing the stack overflow?

Do you have any relevant experience with this issue?

BRs,
Thanks!

ArmRyan self-assigned this on Jan 16, 2025
@ArmRyan
Collaborator

ArmRyan commented Jan 20, 2025

Hi @Ijustakid, I tried to reproduce this but could not. Could you provide more details or a sample model that shows the failure? Perhaps the model is simply too large for the device you are running on, or memory is being allocated incorrectly on your platform?

@Ijustakid
Author

> Hi @Ijustakid, I tried to reproduce this but could not. Could you provide more details or a sample model that shows the failure? Perhaps the model is simply too large for the device you are running on, or memory is being allocated incorrectly on your platform?

Yes, it is indeed my problem: my device's RAM does not have enough space for the variables I allocate. Thanks for your reply.
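
For anyone hitting the same crash, a rough size estimate makes the RAM shortfall visible. The sketch below assumes int8 (s8) tensors and the layer from the first post (3x3 kernel, stride 1, padding 1, so the output keeps the 256x144 spatial size). The helper names are hypothetical and not part of CMSIS-NN, and the estimate ignores weights, bias, and any scratch buffer the kernel needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one int8 activation tensor with the given dimensions. */
static size_t tensor_bytes_s8(size_t n, size_t h, size_t w, size_t c)
{
    return n * h * w * c * sizeof(int8_t);
}

/*
 * Activation memory for a single convolution whose input and output
 * buffers are both live and share the same spatial size
 * (hypothetical helper, assumes "same" padding and stride 1).
 */
static size_t conv_activation_bytes_s8(size_t n, size_t h, size_t w,
                                       size_t c_in, size_t c_out)
{
    return tensor_bytes_s8(n, h, w, c_in) + tensor_bytes_s8(n, h, w, c_out);
}
```

For the sizes in this issue, `conv_activation_bytes_s8(1, 256, 144, 3, 12)` gives 110592 + 442368 = 552960 bytes, about 540 KiB. If buffers of that size are declared as local variables, they land on the stack, which would be one plausible way to get a stack overflow without any recursion being involved.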

I have another question: is the data format supported by CMSIS-NN NCHW or NHWC?
For example, a convolution's inputs and weights are 2D-structured data; how should they be arranged into 1D arrays?

Thanks!

BRs

@ArmRyan
Collaborator

ArmRyan commented Jan 24, 2025

The data format should be NHWC, just like TFLite Micro's default format! We don't currently support any NCHW kernels.
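
To make the 1D arrangement concrete, here is a minimal sketch of the NHWC flat index and of a copy from NCHW into NHWC order. It assumes int8 data; `nhwc_index` and `nchw_to_nhwc_s8` are illustrative names, not CMSIS-NN functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Flat index of element (n, h, w, c) in an NHWC tensor of size N x H x W x C. */
static size_t nhwc_index(size_t n, size_t h, size_t w, size_t c,
                         size_t H, size_t W, size_t C)
{
    return ((n * H + h) * W + w) * C + c;
}

/* Copy an NCHW tensor into NHWC order; src and dst must not overlap. */
static void nchw_to_nhwc_s8(const int8_t *src, int8_t *dst,
                            size_t N, size_t C, size_t H, size_t W)
{
    for (size_t n = 0; n < N; n++) {
        for (size_t c = 0; c < C; c++) {
            for (size_t h = 0; h < H; h++) {
                for (size_t w = 0; w < W; w++) {
                    /* NCHW: width is the fastest-varying dimension. */
                    size_t src_idx = ((n * C + c) * H + h) * W + w;
                    /* NHWC: channel is the fastest-varying dimension. */
                    dst[nhwc_index(n, h, w, c, H, W, C)] = src[src_idx];
                }
            }
        }
    }
}
```

The same loop can reorder weights if the output-channel count is passed as `N`, so a 12x3x3x3 kernel stored as [C_OUT, C_IN, KH, KW] becomes [C_OUT, KH, KW, C_IN]. As far as I know that is the filter layout the CMSIS-NN convolution functions document, but please check it against the header for the version you build.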

So if it helps, when we create rows for a GEMM kernel in a convolution, we would do something similar to:
For H
    For W
        memcpy(length(channels))

Source: https://github.com/ARM-software/CMSIS-NN/blob/main/Source/ConvolutionFunctions/arm_convolve_s4.c#L113-L128
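
The loop above can be sketched in plain C as follows. This is a simplified illustration rather than the library's code: it assumes an NHWC int8 input with batch size 1, and the function name, parameter list, and `pad_value` argument are all hypothetical. Because channels are contiguous in NHWC, each kernel position needs only one `memcpy` of `channels` bytes.

```c
#include <stdint.h>
#include <string.h>

/*
 * Gather the input patch under output position (out_y, out_x) into `row`.
 * `row` must hold kernel_h * kernel_w * channels bytes.
 * Positions that fall outside the input are filled with pad_value.
 */
static void build_patch_row_s8(const int8_t *input, int8_t *row,
                               int in_h, int in_w, int channels,
                               int kernel_h, int kernel_w,
                               int stride_h, int stride_w,
                               int pad_h, int pad_w,
                               int out_y, int out_x, int8_t pad_value)
{
    const int base_y = out_y * stride_h - pad_h;
    const int base_x = out_x * stride_w - pad_w;

    for (int ky = 0; ky < kernel_h; ky++) {          /* "For H" */
        for (int kx = 0; kx < kernel_w; kx++) {      /* "For W" */
            const int y = base_y + ky;
            const int x = base_x + kx;
            if (y < 0 || y >= in_h || x < 0 || x >= in_w) {
                /* Outside the image: write padding for every channel. */
                memset(row, pad_value, (size_t)channels);
            } else {
                /* Inside the image: copy all channels of this pixel at once. */
                memcpy(row, input + ((size_t)y * in_w + x) * channels,
                       (size_t)channels);
            }
            row += channels;
        }
    }
}
```

With an NCHW input the channels of one pixel would be `H * W` elements apart, so this single-`memcpy` gather would not work, which is one way to see why the layout matters here.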

Is it safe to assume you are not using TFLite Micro if the data format is NCHW? CMSIS-NN is designed to be used alongside TFLM, so the easiest way to use the library is to build it through TFLite Micro. With that setup, insufficient memory should also surface as a build failure before the code ever reaches the point where it could overflow the stack.
