
Support for reading saved and cached query results #212

Merged
merged 14 commits from saved_cached_queries into vnext_minor on May 15, 2017

Conversation

baumatron
Contributor

This addresses #181.

  • Adds runAsyncSavedAnalysis:completionHandler to KeenClient (see the usage sketch after this list)
  • Works for single-analysis, multi-analysis, and funnel queries
  • Adds unit tests for the feature
  • Adds a .clang-format file that can be used to auto-format source. I'll make a pass on the rest of the SDK using it in a future PR.
  • Splits up the monolithic KeenClientTests class
  • Fixes several non-deterministic unit test issues and crashes
  • Fixes an issue with KIODBStore: openDB wasn't thread-safe but could be called from different threads under certain circumstances.
  • Removes remnants of KIOEventStore
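
For reference, a minimal usage sketch of the new method, assuming the completion handler mirrors the SDK's other async query callbacks. The saved query name, the use of sharedClient, and the exact handler signature are illustrative assumptions, not taken from this PR:

#import <KeenClient/KeenClient.h>

// Hypothetical saved query name; substitute a query defined in your Keen project.
NSString *savedQueryName = @"daily-active-users";

// Assumed completion handler shape, patterned after the SDK's other async query methods.
[[KeenClient sharedClient] runAsyncSavedAnalysis:savedQueryName
                                completionHandler:^(NSData *responseData, NSURLResponse *response, NSError *error) {
    if (nil != error) {
        NSLog(@"Saved query failed: %@", error);
        return;
    }
    // The saved/cached result comes back as JSON.
    NSError *parseError = nil;
    NSDictionary *result = [NSJSONSerialization JSONObjectWithData:responseData
                                                            options:0
                                                              error:&parseError];
    NSLog(@"Saved query result: %@", result);
}];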

baumatron added 14 commits May 4, 2017 16:13
Also some minor refactoring of tests with an eventual goal of breaking up the KeenClientTests.m monolith.

The saved query method lacks the 4xx client error throttling logic that exists in the regular query method. It will be added next, along with tests.
Make opening the database thread-safe, since opening can happen from the thread calling into the SDK or from the SDK's database operation queue.

Make deletion of event and query data synchronous so as to make subsequent database operations deterministic.

XCTestCases can run in parallel, and because the tests use a full SQLite store, deletion of that store needs to be synchronized; TestDatabaseRequirement does that (see the sketch after these commit messages).

Fix a few unit tests that weren't waiting for asynchronous work to complete before finishing.

Remove a few old references to KIOEventStore.
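
To make the synchronization above concrete, here is a minimal sketch of what a helper like TestDatabaseRequirement could look like, assuming parallelism within a single test process and a single shared on-disk store; the actual implementation in this PR may differ:

#import <Foundation/Foundation.h>

// Illustrative helper, assumed to serialize exclusive access to the test SQLite store.
@interface TestDatabaseRequirement : NSObject
+ (void)withExclusiveDatabaseAccess:(void (^)(void))block;
@end

@implementation TestDatabaseRequirement

// One process-wide lock so only one test case at a time deletes or
// recreates the on-disk store.
+ (NSLock *)databaseLock {
    static NSLock *lock;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        lock = [NSLock new];
    });
    return lock;
}

+ (void)withExclusiveDatabaseAccess:(void (^)(void))block {
    [[self databaseLock] lock];
    block(); // e.g. delete the store file, then run the test's setup
    [[self databaseLock] unlock];
}

@end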
@coveralls

Coverage Status

Coverage increased (+0.3%) to 67.132% when pulling 9b948a9 on saved_cached_queries into f607ea9 on vnext_minor.

@@ -50,17 +54,27 @@ @implementation KIODBStore {
 - (instancetype)init {
     self = [super init];

-    if(self) {
+    if (nil != self) {
Collaborator


Here and below for the openLock nil check: in general, is the guidance that an explicit nil comparison is necessary, or is it OK to rely on nil being falsy? PR #211 seemed to have examples of both:

Favoring falsy nil:

 -    if ([HTTPCodes httpCodeType:(responseCode)] == HTTPCode4XXClientError && query != nil) {
 +    if (query && [HTTPCodes httpCodeType:(responseCode)] == HTTPCode4XXClientError) {
          // log what happened

...and favoring an explicit nil check:

 -    if (![KIOQuery validateQueryType:queryType]) {
 +    if (queryType == nil || queryType.length <= 0) {
          return nil;
      }

Contributor Author


Seems that it's mostly a stylistic preference in ObjC land, and for whatever reason doesn't seem consistent in code I've seen. We can decide to formalize a style if you'd like.

The init code snippet from Xcode looks like this:

- (instancetype)init
{
    self = [super init];
    if (self) {
        <#statements#>
    }
    return self;
}

I generally like to do a direct comparison in C-like languages because I think it makes the intent clearer, but it seems that isn't the general convention in ObjC?

Regarding the actual logic, nil is just a preprocessor definition for ((void*)0), and 0 is always coerced to false in C, so there aren't any gotchas there.

Logically it's equivalent, so we're splitting hairs about readability here, and that's largely subjective. I'd vote for generally comparing to nil, but I know we aren't doing that consistently across the project, and other contributors seem to have disagreed with it. All that said, I don't plan to spend much time enforcing consistency on this except in cases where the intention definitely isn't clear.

Collaborator


Yeah, I thought it was the same. I too prefer explicit checks in C-ish languages, but some people who are used to JS or Python have a conniption seeing that. Whatever you choose is fine with me as long as we're consistent going forward.

// The database can be opened from the thread calling into the SDK,
// or by the database queue thread if it has been closed, so open
// needs to be protected with a lock.
[openLock lock];
Collaborator


Have we gotten to a point where we expect this to be happening infrequently? If not, should we consider a less resource-intensive alternative at some point? Or is NSLock already fairly nice about trying in user mode first before doing anything too drastic?

Contributor Author


NSLock uses a normal POSIX mutex under the hood. An alternative could be a spin lock if we decide or find there will be high contention, or an atomic compare-and-swap could guard the mutex lock.

Most calls to KIODBStore methods seem to make a single call to checkOpenDB before dispatching a block to the queue, while the queue sometimes calls a helper that will call checkOpenDB. I would expect that taking the lock wouldn't really be a huge bottleneck unless there were a pattern of contention or a massive volume of events, which doesn't seem likely either.

For example, calling addEvent: 1000 times will take and free the lock 1000 times, and the dispatched block doesn't take a lock. Apple's threading documentation says an atomic operation will be about 4x faster than taking a mutex. That documentation gives timing examples for pretty old hardware, but it's probably in the right ballpark for a modern phone. Taking a mutex takes 0.2 us on the hardware specified vs 0.05 us for an atomic swap. So, rounding the mutex cost up to 0.5 us to be conservative, locking overhead would be about 1000 * 0.5 us * 2 = 1000 us = 1 ms, possibly with longer waits if a few already-queued operations also take the lock and cause contention. In terms of perf, I'd be more concerned about, say, writing those events to disk.

Let me know if you think that's a reasonable take on the situation, or whether a spin lock or atomic swap is worth it (we could do an atomic check and only take the lock if the db isn't already open).
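
For concreteness, a sketch of that double-checked idea: an atomic flag is read first, and the mutex is only taken when the database actually needs opening. The class, ivar, and openDB helper names here are illustrative, not the SDK's actual code:

#import <Foundation/Foundation.h>
#include <stdatomic.h>

// Illustrative class standing in for KIODBStore; names and structure are assumptions.
@interface KIODBStoreSketch : NSObject
- (void)checkOpenDB;
- (BOOL)openDB;
@end

@implementation KIODBStoreSketch {
    atomic_bool dbIsOpen;   // set once the database has been opened
    NSLock *openLock;       // guards the open itself
}

- (instancetype)init {
    self = [super init];
    if (nil != self) {
        atomic_init(&dbIsOpen, false);
        openLock = [NSLock new];
    }
    return self;
}

- (void)checkOpenDB {
    // Fast path: an atomic read, no mutex when the DB is already open.
    if (atomic_load(&dbIsOpen)) {
        return;
    }
    // Slow path: take the lock and re-check, since another thread may have
    // opened the database while this one was waiting for the lock.
    [openLock lock];
    if (!atomic_load(&dbIsOpen)) {
        if ([self openDB]) {
            atomic_store(&dbIsOpen, true);
        }
    }
    [openLock unlock];
}

- (BOOL)openDB {
    // Stand-in; the real implementation would open the sqlite database.
    return YES;
}

@end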

In general though, KIODBStore is kind of a can of worms. If time allows, I would rip things up so that no lock is needed and we don't close the database unless it needs to be completely torched, which would remove the need to check whether it's open in order to re-open it.

There's a pile of stuff surrounding the data store that should be dealt with at some point, ideally in vnext_major: #78, #170, and #183 all go in that bucket. I'd say the work would include #78, cleaning up error handling in KIODBStore so it isn't closed unless the store is going to be erased, and ensuring opens only happen from a single thread, likely the dispatch queue that handles db operations.

Collaborator


Yeah, I thought we were eventually going to move away from this call happening so often. If that's the goal, and it seems realistic, then I wouldn't worry about optimizing here yet. If all these issues get cleared up and we're still taking this lock thousands of times during the normal course of operations, then let's revisit turning this into a user-mode operation when not under contention (whether via atomics, slimmer locks, or whatever).

Contributor Author


Sure, sounds good.

@masojus
Collaborator

masojus commented May 16, 2017

This looks pretty good to me. Thanks @baumatron!

@baumatron baumatron deleted the saved_cached_queries branch May 16, 2017 18:43