-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#957 Implement earliest
and latest
functions
#1018
Changes from 4 commits
b122f78
d7ce254
9945953
e718b51
282f3f6
4fef581
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -247,10 +247,6 @@ FIRST: 'FIRST'; | |
LAST: 'LAST'; | ||
LIST: 'LIST'; | ||
VALUES: 'VALUES'; | ||
EARLIEST: 'EARLIEST'; | ||
EARLIEST_TIME: 'EARLIEST_TIME'; | ||
LATEST: 'LATEST'; | ||
LATEST_TIME: 'LATEST_TIME'; | ||
Comment on lines
-250
to
-253
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These aggregation commands don't appear to have ever been implemented. They are not used, referenced, or documented anywhere outside the auto-generated Antlr modules, that I can find. Moreover, I think that their functionality (as best as I can guess from the name, since they aren't documented) can just be accomplished using the existing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. are there spark issues that refer to this syntax? if so should mention that they are no longer relevant or does sql has these implemented? If so might want to create future issues to implement this as we are striving for feature parity. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good idea. I searched and didn't find anything.
OpenSearch SQL does not appear to have these implemented either, and I similarly wasn't able to find any GitHub issues that made mentioned of it. @YANG-DB I assume it also makes sense to remove this syntax from OpenSearch SQL for parity? Should I raise an issue and create a (tiny) PR for this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Update: here is the pull request. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes I agree - thanks for creating a PR for parity in the SQL repo 🙏 |
||
PER_DAY: 'PER_DAY'; | ||
PER_HOUR: 'PER_HOUR'; | ||
PER_MINUTE: 'PER_MINUTE'; | ||
|
@@ -338,7 +334,6 @@ MONTHNAME: 'MONTHNAME'; | |
NOW: 'NOW'; | ||
PERIOD_ADD: 'PERIOD_ADD'; | ||
PERIOD_DIFF: 'PERIOD_DIFF'; | ||
RELATIVE_TIMESTAMP: 'RELATIVE_TIMESTAMP'; | ||
SEC_TO_TIME: 'SEC_TO_TIME'; | ||
STR_TO_DATE: 'STR_TO_DATE'; | ||
SUBDATE: 'SUBDATE'; | ||
|
@@ -360,6 +355,11 @@ UTC_TIMESTAMP: 'UTC_TIMESTAMP'; | |
WEEKDAY: 'WEEKDAY'; | ||
YEARWEEK: 'YEARWEEK'; | ||
|
||
// RELATIVE TIME FUNCTIONS | ||
RELATIVE_TIMESTAMP: 'RELATIVE_TIMESTAMP'; | ||
EARLIEST: 'EARLIEST'; | ||
LATEST: 'LATEST'; | ||
|
||
// TEXT FUNCTIONS | ||
SUBSTR: 'SUBSTR'; | ||
SUBSTRING: 'SUBSTRING'; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the case where earliest is interpreted as
timestamp >= now() - 1s
, depend on when the now() is resolved, it may produce insonsistent result?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe so. Relative timestamp are based on calls to
CurrentTimestamp
(seeorg/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
). If you look at the documentation forCurrentTimestamp
, it says that:So
now()
,relative_timestamp("now")
,earliest("now", timestamp)
,latest("now", timestamp)
andcurrent_timestamp
should all return consistent results within the same query. I tested this out manually as well to verify (i.e. repeated calls to these methods within the same query return the samenow
timestamp, down to the millisecond).