This intro-level workshop will help you get started with the ThirdEye anomaly detection platform on an existing Apache Pinot cluster. To complete this workshop, you should have a StarTree Cloud account.
In this workshop, you will learn how to:
- Access and navigate the ThirdEye portal
- Set up a data source and ingest data
- Create an alert
- Examine anomalies
- Create events
To be effective in this workshop, you should:
- Have some Apache Pinot experience or have finished the previous workshop
- Have a StarTree Cloud account (sign up if you don't have one yet)
In this workshop, we'll complete four challenges. Each challenge is designed to get you familiar with the ThirdEye platform and introduce you to its basic functionality.
Log in to the StarTree Cloud Portal by navigating to https://startree.cloud.
Note: If you are doing this workshop with a StarTree host, they will share the login URL with you. If you are signed up for StarTree Cloud, use that URL, and make sure you have ThirdEye enabled.
You should see the following:
Note: If you don't see the ThirdEye link active, you may need to contact StarTree support.
Click ThirdEye.
You should see something like this:
Now that we have access to the ThirdEye portal, we'll set up our data for ThirdEye.
Click Configuration in the menu.
Select the Create button, and then select the Onboard Dataset option.
For this challenge, we'll select the existing Apache Pinot source, called pinot.
Then, we'll select a dataset available to us. For this challenge, let's select websiteAnomalies.
Note: If you're using a shared environment, use one of the websiteAnomalies(x).
Now, select Submit.
That's it! You've configured a data source for anomaly detection with ThirdEye.
Next, we will create an alert. Let's start by selecting Alert from the left menu.
This will take you to a page where you can select the type of alert to create.
You can choose from the following alert types:
- Basic alert
- Multi-dimensional alert
Dimension exploration gives you the ability to create alerts for every value of a certain dimension or combination of dimensions.
Here's the visual:
For our purposes, we will choose the Multi-dimensional alert. Next, select the CoHort recommender.
At this point, you should be able to create an alert using the dataset you onboarded.
Here's what we will be using for each of the values:
- Dataset: The dataset you onboarded
- Metric: Let's use click
- Aggregation function: Let's use SUM
- Dimensions: Let's use country, browser and platform
- Date range: Let's narrow it down to 2021
- Query filter: Ignore for now
- Contribution percentage: Let's leave it at 5%.
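To make the contribution percentage setting more concrete, here is a minimal Python sketch of how a 5% threshold could narrow down dimension combinations. It only illustrates the idea and is not ThirdEye's actual recommender logic; the sample rows and the function name `combinations_above_threshold` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: (country, browser, platform, clicks)
ROWS = [
    ("US", "chrome", "desktop", 600),
    ("US", "safari", "mobile", 250),
    ("IN", "chrome", "mobile", 120),
    ("FR", "firefox", "desktop", 30),
]


def combinations_above_threshold(rows, threshold_pct=5.0):
    """Return dimension combinations whose share of total clicks meets the threshold."""
    totals = defaultdict(int)
    for country, browser, platform, clicks in rows:
        totals[(country, browser, platform)] += clicks

    grand_total = sum(totals.values())
    kept = {}
    for combo, clicks in totals.items():
        share = 100.0 * clicks / grand_total
        if share >= threshold_pct:  # assumption: the threshold is inclusive
            kept[combo] = round(share, 1)
    return kept


selected = combinations_above_threshold(ROWS)
for combo, share in selected.items():
    print(combo, f"{share}%")
```

In this made-up data, the FR/firefox/desktop combination contributes only 3% of all clicks, so it falls below the 5% cutoff and would not be offered as a dimension to monitor.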
Let's select Generate Dimensions to Monitor to see what your dimensions look like.
You should see something like this:
Now, let's select all of the dimensions, and click Create Multi-dimension alert.
You should now be able to select the granularity. The default is Daily, but you can change it. Load the chart to see what the pattern looks like. You can also change the date range to see what shows up.
Choose SUM as the aggregation function and Hourly as the granularity, reload the preview, and scroll down to the country area. You'll notice some spikes.
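As a rough mental model of what SUM at hourly granularity means, the Python sketch below buckets a few click events by hour and sums them. The timestamps and values are made up for illustration, and ThirdEye itself works against the Pinot dataset rather than in Python.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw events: (ISO timestamp, clicks)
EVENTS = [
    ("2021-03-01T10:05:00", 4),
    ("2021-03-01T10:40:00", 6),
    ("2021-03-01T11:15:00", 50),  # a spike in the 11:00 bucket
    ("2021-03-01T12:30:00", 5),
]


def sum_by_hour(events):
    """Truncate each timestamp to the hour and sum clicks per bucket."""
    buckets = Counter()
    for ts, clicks in events:
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[hour] += clicks
    return dict(sorted(buckets.items()))


hourly = sum_by_hour(EVENTS)
for hour, total in hourly.items():
    print(hour.isoformat(), total)
```

Switching the granularity to Daily would collapse all four events into a single bucket, which is why short spikes are easier to spot at the hourly level.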
For our alert, I am going to choose AVG with Daily granularity. Select Next to move to the algorithm selection.
I am going to select the StarTree-ETS Rule.
Select the Next button, which takes you to the Tune alert page. Here, you can adjust the sensitivity, seasonality, and lookback period. You can load the chart to see the alerts that show up based on your tuning.
For our scenario, we will pick high sensitivity and leave the rest at their default values.
Select Next to move to the anomaly filter page. Here, you can select the Filters and sensitivity button to adjust your anomaly detection. For example, you can add exceptions, such as not monitoring on weekends.
For this scenario, I am not going to set up any filter. Select Next, provide a meaningful name, and save.
Voila! You have created your first alert!
Now that we have created an alert, let's dive into how to look at your anomalies.
From the Alert menu, select the alert you just created.
You should see something like this:
You can click on the View Details link to see the anomalies for a particular set of dimensions:
You can click on individual anomalies, or select a section to view it in detail.
Let's click on an anomaly.
Select the Investigate Anomaly button to do some root cause analysis.
Here, you can look at the heatmap, the top contributors, and events. We will talk about events in the next exercise.
The heatmaps show you which contributing factors were related to the anomaly. Blue indicates higher than normal numbers, red indicates lower than normal numbers, and grey is the baseline.
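The color convention can be sketched in a few lines of Python. This is only an illustration of how to read the heatmap, not how ThirdEye renders it; the `tolerance` parameter and the sample numbers are hypothetical.

```python
def heatmap_color(current, baseline, tolerance=0.0):
    """Pick a color following the convention described above."""
    diff = current - baseline
    if diff > tolerance:
        return "blue"   # higher than normal
    if diff < -tolerance:
        return "red"    # lower than normal
    return "grey"       # at baseline


# Hypothetical clicks per country: (current, baseline)
SAMPLE = {"US": (900, 600), "IN": (100, 120), "FR": (30, 30)}

colors = {country: heatmap_color(cur, base) for country, (cur, base) in SAMPLE.items()}
print(colors)
```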
Top contributors shows you what caused the anomaly.
You also have the option to tag this as a real anomaly or a false positive.
From the Is this an anomaly drop-down list, choose an option that fits your scenario.
You can save the investigation, revisit later, or share it with someone on your team.
Now let's create an event.
Navigate to the Configuration menu, and select the Events tab.
Select Create, and then Create Event.
Enter a name, type (string), and start and end dates.
Optionally, you can add some metadata tags. These can be used for categorization purposes. Click Create event to save.
To use a created event, go to the Anomalies page, choose the Events tab, and associate the event with the anomaly.
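To picture what an event carries and why it is useful next to an anomaly, here is a minimal Python sketch. The `Event` class, its field names, and the `overlaps` helper are hypothetical stand-ins for the fields on the form (name, type, start and end dates, optional metadata), not ThirdEye's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    """Hypothetical stand-in for the fields on the Create Event form."""
    name: str
    type: str
    start: datetime
    end: datetime
    metadata: dict = field(default_factory=dict)  # optional tags

    def __post_init__(self):
        if self.end <= self.start:
            raise ValueError("end must be after start")

    def overlaps(self, window_start, window_end):
        """True if the event intersects the given time window."""
        return self.start < window_end and window_start < self.end


promo = Event(
    name="Spring sale",
    type="MARKETING",
    start=datetime(2021, 3, 1),
    end=datetime(2021, 3, 8),
    metadata={"country": ["US"]},
)

# Hypothetical anomaly window that falls inside the sale
print(promo.overlaps(datetime(2021, 3, 5), datetime(2021, 3, 6)))
```

An event whose time range overlaps an anomaly is a natural candidate explanation for it, which is the kind of context the Events tab is meant to surface during an investigation.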
Let's now create subscription groups and associate some alerts with them.
From the Configuration menu, select Subscription Groups. Select Create Subscription Group.
Enter the name and schedule. You can either use cron syntax or use the UI to set up the schedule.
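If you go the cron route, it helps to know what each field means. The Python sketch below labels the fields of a cron expression, assuming the Quartz-style format with a leading seconds field; confirm the expected format in the UI before relying on it. The helper `describe_cron` and the example schedule are hypothetical.

```python
# Field names for a Quartz-style cron expression (assumed format)
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
]


def describe_cron(expression):
    """Map each space-separated cron field to its Quartz-style name."""
    parts = expression.split()
    if not 6 <= len(parts) <= 7:
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(QUARTZ_FIELDS, parts))


# Hypothetical schedule: every day at 09:00:00
daily_9am = describe_cron("0 0 9 * * ?")
for name, value in daily_9am.items():
    print(f"{name:>12}: {value}")
```

The sketch only names the fields; it does not validate their values or compute the next run time.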
You can also add email addresses or Slack channels, or use a webhook, to deliver notifications from the subscription group.
Select the Next button to set up the alerts you want to be notified on. Click Save to save your subscription group.
When you create new alerts, you can add their notifications to an existing subscription group.
Voila! You should now receive alerts when anomalies happen!
From the Alerts menu, select and delete any alerts you have created. From the Configuration menu, select and delete:
- Subscription Groups
- Events
- Datasets