performance: sdn tracing 400+ EVCs (with INT) on PUT /traces, the 95th percentile response time is around 6 seconds #120

Open
viniarck opened this issue Jul 30, 2024 · 2 comments
Labels
future_release Planned for the next release

Comments

@viniarck
Member

  • The 95th percentile response time when tracing 400+ EVCs with INT (roughly 12k flows in total across all switches) was around 6 secs. This isn't an immediate major issue, but if we end up using sdntrace_cp in bulk for the consistency check, atomically holding a lock for that long wouldn't be great. It'll depend on if and how the check runs, but if it needs to be atomically thread safe for all EVCs being checked, it can lead to overall slowness.

Current evident bottlenecks (registering them here for future discussions):

  1. The vast majority of the time is spent on sdntrace_cp. The lookup algorithm time complexity is roughly:

O(switches) * O(switch[table_size]) * O(switch[goto_tables])

The biggest factor, which is linear, is usually O(switch[table_size]). If we didn't have to support matching with bitmasks, we could leverage an index keyed by flow match_id and perform the lookup in O(1) per switch, but we still need to support bitmasks since mef_eline uses them for VLAN ranges and other special tags. Unless sdntrace_cp also had an endpoint that doesn't trace anything with bitmasks involved.

  2. flow_manager stored_flows took 1.7 secs, and it looks like that was mostly latency in the API and not in the DB. It would still be worth looking into whether the response was too large and JSON serialization was slow. In the future, sdntrace_cp querying flow_manager directly in its FlowController should help too.
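
To make the match_id index idea from the first bottleneck more concrete, here is a minimal sketch of a hybrid lookup: exact-match flows go into a dict for O(1) access, and only flows with a bitmask are scanned linearly. Everything here is an assumption for illustration (the `Flow` and `FlowTable` classes, matching only on `in_port` and a VLAN value/mask, a plain integer `priority`); it is not how sdntrace_cp or flow_manager represent flows today.

```python
from dataclasses import dataclass

EXACT_VLAN_MASK = 0xFFF  # all 12 VLAN ID bits must match


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow matching on in_port and VLAN value/mask."""
    flow_id: str
    in_port: int
    vlan: int
    vlan_mask: int = EXACT_VLAN_MASK
    priority: int = 0

    def matches(self, in_port: int, vlan: int) -> bool:
        same_port = self.in_port == in_port
        return same_port and (vlan & self.vlan_mask) == (self.vlan & self.vlan_mask)


class FlowTable:
    """Hypothetical per-switch table: dict for exact matches, list for masked ones."""

    def __init__(self, flows):
        self.exact = {}   # (in_port, vlan) -> highest-priority exact flow
        self.masked = []  # flows that still need a linear scan
        for flow in flows:
            if flow.vlan_mask != EXACT_VLAN_MASK:
                self.masked.append(flow)
                continue
            key = (flow.in_port, flow.vlan)
            current = self.exact.get(key)
            if current is None or flow.priority > current.priority:
                self.exact[key] = flow

    def lookup(self, in_port: int, vlan: int):
        # Masked flows are always scanned so a higher-priority masked flow
        # can still win over an exact one; cost is O(len(masked)), not O(table).
        candidates = [f for f in self.masked if f.matches(in_port, vlan)]
        exact_hit = self.exact.get((in_port, vlan))
        if exact_hit is not None:
            candidates.append(exact_hit)
        return max(candidates, key=lambda f: f.priority, default=None)


table = FlowTable([
    Flow("evc_a", in_port=1, vlan=100, priority=10),
    Flow("evc_b", in_port=1, vlan=200, priority=10),
    # VLAN range 1024..2047 expressed as value 0x400 with mask 0xC00
    Flow("evc_range", in_port=1, vlan=0x400, vlan_mask=0xC00, priority=5),
])
```

With this layout the per-switch cost drops from the full table size to the number of masked flows, so the gain depends on how many EVCs use VLAN ranges or other masked tags.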

20240730_110744

@viniarck viniarck added the future_release Planned for the next release label Jul 30, 2024
@viniarck
Member Author

cc'ing @Ktmi for his information. Something to keep in mind when epic epic_mef_eline_consistency_v2 resumes.

@viniarck
Member Author

I've just caught another one, even worse: it took almost 9 secs:

20240730_113719
