-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pg: add returning_id option to parallel_execute #181
base: master
Are you sure you want to change the base?
pg: add returning_id option to parallel_execute #181
Conversation
66f85ef
to
5f07404
Compare
The default behavior is unchanged. This adds the possibility to parallelize modifying queries that have a `RETURNING id` clause. For those, return the resulting ids (in a defined order) instead of the affected row count. To avoid misuse add a warning to the docstring and try to detect queries other than the ones of the intended form. Raise an error if such are found.
5f07404
to
343aa7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's a draft, couldn't help peeking.
Could you further elaborate on the need to limit this to returning ids only?
@@ -154,15 +154,20 @@ def parallel_execute(cr, queries, logger=_logger): | |||
|
|||
:param list(str) queries: list of queries to execute concurrently | |||
:param `~logging.Logger` logger: logger used to report the progress | |||
:return: the sum of `cr.rowcount` for each query run | |||
:param bool returning_id: wether to return a tuple of affected ids (default: return affected row count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param bool returning_id: wether to return a tuple of affected ids (default: return affected row count) | |
:param bool returning_id: whether to return a tuple of affected ids (default: return affected row count) |
- As a side effect, the cursor will be committed. | ||
- It would not be generally safe to use this function for selecting queries. Because of this, | ||
`returning_id=True` is only accepted for `UPDATE/DELETE/INSERT/MERGE [...] RETURNING id` queries. Also, the | ||
caller cannot influnce the order of the returned result tuples, it is always sorted in ascending order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caller cannot influnce the order of the returned result tuples, it is always sorted in ascending order. | |
caller cannot influence the order of the returned result tuples, it is always sorted in ascending order. |
raise ValueError("The returning_id parameter can only be used with certain queries.") | ||
|
||
res = parallel_execute_impl(cr, queries, logger, returning_id) | ||
return tuple(sorted([id for (id,) in res])) if returning_id else res |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return tuple(sorted([id for (id,) in res])) if returning_id else res | |
return tuple(id for (id,) in res) if returning_id else res |
Why sorting these?
if returning_id: | ||
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+\S*\.?id\s*$") | ||
if not all((bool(returning_id_re.search(q)) for q in queries)): | ||
raise ValueError("The returning_id parameter can only be used with certain queries.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if returning_id: | |
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+\S*\.?id\s*$") | |
if not all((bool(returning_id_re.search(q)) for q in queries)): | |
raise ValueError("The returning_id parameter can only be used with certain queries.") | |
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+\S*\.?id\s*$") | |
returning_id = all((bool(returning_id_re.search(q)) for q in queries)) |
How about we automatically detect whether the queries at hand are returning something?
return parallel_execute_impl(cr, queries, logger=_logger) | ||
|
||
if returning_id: | ||
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+\S*\.?id\s*$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+\S*\.?id\s*$") | |
returning_id_re = re.compile(r"(?s)(?:UPDATE|DELETE|INSERT|MERGE).*RETURNING\s+(?:\w+\.)?id\s*$") |
Change \S*\.?
to (?:\w+\.)?
to avoid false positive like "UPDATE ... RETURNING name,id"
and "DELETE FROM ... RETURNING report_id"
.
For the record: I think this is not necessary. See more comments in the linked PR in upgrades. |
The default behavior is unchanged.
This adds the possibility to parallelize modifying queries that have a
RETURNING id
clause. For those, return the resulting ids (in a defined order) instead of the affected row count.To avoid misuse add a warning to the docstring and try to detect queries other than the ones of the intended form. Raise an error if such are found.