[FEATURE] Enhance Flint query profiling and acceleration insights #997

dai-chen · 2024-12-19T21:13:46Z

Is your feature request related to a problem?

Flint users often struggle to understand how their SparkSQL queries benefit from acceleration indices. They lack clarity on whether their queries are being optimized by Flint’s indices or running as direct queries. Furthermore, users have limited visibility into which acceleration index, if any, is enhancing query performance. Troubleshooting performance issues typically requires opening the Spark dashboard UI to inspect job, stage, and task metrics, which is both time-consuming and inefficient.

What solution would you like?

Provide advanced APIs for understanding acceleration, such as "What-If" and "WhyNot" APIs, which help users assess potential performance improvements and understand why an existing acceleration cannot be applied to a given query.
Expose more Spark metrics to users in the query execution history, including the metrics currently missing from the Flint reader and writer.
Add more detailed logging to the Flint optimizer rules in Flint's query rewrite logic to support troubleshooting.
Provide a user-friendly interface to display query execution history and related metrics, making it easier for users to analyze performance and acceleration impacts. Ref: https://docs.databricks.com/en/sql/user/queries/query-history.html
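
To make the "WhyNot" idea concrete, here is a minimal sketch of what such a report might contain. Everything in it is hypothetical: `WhyNotReport` and `why_not` are illustrative names, not existing Flint APIs, and the only rule modeled is the assumption that a covering index can serve a query only when it contains every column the query references.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class WhyNotReport:
    """Hypothetical result shape for a WhyNot call (not a Flint API)."""
    index_name: str
    applicable: bool
    reasons: List[str] = field(default_factory=list)


def why_not(index_name: str, indexed_columns: Set[str], query_columns: Set[str]) -> WhyNotReport:
    """Explain whether a covering index could serve a query.

    Assumption: the index applies only if it contains every column
    the query references. Real rewrite rules may check more than this.
    """
    # Columns the query needs but the index does not store.
    missing = sorted(query_columns - indexed_columns)
    reasons = [f"column '{c}' is not in the index" for c in missing]
    return WhyNotReport(index_name, applicable=not missing, reasons=reasons)
```

For example, `why_not("idx", {"a", "b"}, {"a", "c"})` reports the index as not applicable and names `c` as the missing column. A real implementation would presumably also cover filter conditions, index state, and skipping indices.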

What alternatives have you considered?

Provide a link to the Spark dashboard so that users can check query execution details directly, analyze performance metrics independently, and identify bottlenecks.
Provide guidance for users to troubleshoot acceleration on their own as follows.

Do you have any additional context?

The following questions from users highlight the need for this feature:

How do I know if I am performing a direct query or if my query is benefiting from accelerated indices?
How do I determine which acceleration index helped boost my query performance?
How can I check if my query is performing faster before and after creating accelerations?
To see performance benefits, should I execute the query in OpenSearch SQL workbench or Dev Tools?
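
Until such tooling exists, the before/after question can be approximated with a small timing harness. This is a rough sketch, not part of Flint: `run_query` is a placeholder for however you submit the query (that part is an assumption and left to the caller), and the helper only measures client-side wall-clock time.

```python
import statistics
import time
from typing import Callable, List


def median_runtime(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock seconds over `repeats` runs of run_query."""
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # placeholder: submit the query and wait for its result
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio above 1 means the query ran faster after the acceleration was created."""
    return before_seconds / after_seconds
```

Call `median_runtime` once before creating the acceleration and once after it has been refreshed, then pass both values to `speedup`. The median is used so that a single slow run (cold start, caching) skews the comparison less, but the result is still only an estimate and does not show which index was applied.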

dai-chen added the enhancement, untriaged, and performance labels and removed the untriaged label on Dec 19, 2024
