added places/categories
AdenForshaw committed Oct 31, 2024
1 parent 53f47a3 commit 044d137
Showing 10 changed files with 154 additions and 61 deletions.
7 changes: 4 additions & 3 deletions .env.sample
@@ -1,3 +1,4 @@
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
BIGQUERY_PROJECT_ID=landmarks-test
GCS_BUCKET_NAME=overture-query-cache
GOOGLE_APPLICATION_CREDENTIALS=gcp-credentials.json
BIGQUERY_PROJECT_ID=thatapiplatform
GCS_BUCKET_NAME=overture-maps-query-cache
AUTH_API_ACCESS_KEY=create-one-from-theauthapi.com
3 changes: 2 additions & 1 deletion .gitignore
@@ -56,4 +56,5 @@ pids
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

.env
credentials.json
credentials.json
gcp-credentials.json
48 changes: 31 additions & 17 deletions README.md
@@ -5,32 +5,46 @@
## Endpoints

- [./Places](https://docs.overturemaps.org/guides/places/) - The Overture places theme has one feature type, called place, and contains more than 53 million point representations of real-world entities: schools, businesses, hospitals, religious organizations, landmarks, mountain peaks, and much more.
- [./Addresses](https://docs.overturemaps.org/guides/addresses/) - An address is a feature type that represents a physical place through a series of attributes: street number, street name, unit, address_levels, postalcode and/or country. They also have a Point geometry, which provides an approximate location of the position most commonly associated with the feature.
- [./Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) - Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. It is used to match the brand.wikidata with the wikidata_id of the place.
- [./Places/Brands]
- [./Places/Categories]
- [./Places/Countries]

### Schemas & Design

### Schemas
- [API Design](./docs/api-design.md)
- [Place](https://docs.overturemaps.org/schema/reference/places/place/)
- [Address](https://docs.overturemaps.org/schema/reference/addresses/address/)


### BigQuery
- [Place](https://console.cloud.google.com/bigquery?project=bigquery-public-data&p=bigquery-public-data&d=overture_maps&t=place&page=table)

Example Query
```SQL
SELECT *
FROM `bigquery-public-data.overture_maps.place`
WHERE ST_DWithin(geometry, ST_GeogPoint(16.3738, 48.2082), 500)
```

### Extras

- [Overture Maps](https://overturemaps.org/)
- [Overture Maps API](https://docs.overturemaps.org/)


### Data patching

- Wikidata ID: not always available in the Overture Maps data. We can use the Wikidata API to look up the `wikidata_id` for a place by matching on name and country as a best guess. This can be disabled in the request parameters via `patch_wikidata=false`.

### Deployment
- [Google Cloud Platform](./docs/google-cloud-platform.md)
### Deployment & Datasets

- [Google Cloud Platform](./docs/google-cloud-platform.md)


### API Key management

You can either use the hardcoded API key in the code, or use the Auth API by creating an account at theAuthAPI.com. There you create an Access Key for the app and add it as the `AUTH_API_ACCESS_KEY` env var. You can then create any number of API Keys for secure access to the API, and rate-limit them for cost control.


### Running Locally

GCP: Download the service account `.json` key file, and set `GOOGLE_APPLICATION_CREDENTIALS` in your `.env` file to the path of that file.

```bash
npm install
npm run start
```

Test the API with curl using the DEMO-API-KEY, for example against `http://localhost:8080/places/brands`:

```bash
curl -H "x-api-key: DEMO-API-KEY" -X GET -G 'http://localhost:8080/places/brands' -d 'country=AU'
```
15 changes: 15 additions & 0 deletions docs/api-design.md
@@ -0,0 +1,15 @@
# Design Principles

## API Design

Response objects should be as close to the Overture Schema as possible, and use the `ext_` prefix for any additional fields. This allows us to easily map the response to the schema, and also allows us to add additional fields without breaking the schema.

Request parameters should use the Overture fields for reference, with underscore separators for filtering by nested fields. For example, `brand_wikidata` for filtering by `brand.wikidata`.
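
The two naming conventions above can be sketched as a pair of small helpers. This is only an illustration: `toFieldPath`, `withExtensions`, the `NESTED_ROOTS` list, and the `distance` field in the usage note are hypothetical names, not code from this repository, and the mapping handles a single level of nesting only.

```typescript
// Hypothetical helpers illustrating the naming rules; not part of the repository.

// Top-level Overture fields that contain nested values (assumed list, for illustration).
const NESTED_ROOTS = ["brand", "categories", "names"] as const;

// Maps a flat request parameter such as `brand_wikidata` to the nested
// Overture field path it filters on (`brand.wikidata`). Parameters that do not
// start with a known root are returned unchanged. One level of nesting only.
function toFieldPath(param: string): string {
  for (const root of NESTED_ROOTS) {
    const prefix = `${root}_`;
    if (param.startsWith(prefix)) {
      return `${root}.${param.slice(prefix.length)}`;
    }
  }
  return param;
}

// Copies a schema-shaped object and adds API-only fields under the `ext_`
// prefix, so they cannot collide with fields defined by the Overture schema.
function withExtensions<T extends object>(
  schemaObject: T,
  extras: Record<string, unknown>,
): T & Record<string, unknown> {
  const prefixed: Record<string, unknown> = {};
  for (const key of Object.keys(extras)) {
    prefixed[`ext_${key}`] = extras[key];
  }
  return { ...schemaObject, ...prefixed };
}
```

For example, `withExtensions({ id: "abc" }, { distance: 12 })` yields an object with `id` and `ext_distance`, leaving the schema fields untouched.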

## Security

OWASP Top 10 security risks should be considered when designing the API.
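
Injection is one of the OWASP Top 10 categories, and it is relevant wherever a request value such as `country` ends up in a SQL string. A minimal sketch of the usual mitigation follows, assuming the query is executed with the BigQuery Node.js client, whose query options accept a `params` object for named `@parameters`. `buildCategoriesQuery` and `ParameterizedQuery` are hypothetical names, and the SQL is only loosely modelled on the categories query in this repository.

```typescript
// Hypothetical sketch; not the repository's implementation.

// Query text plus values for the named parameters it references as @name.
// This mirrors the `{ query, params }` options of the BigQuery Node.js client.
interface ParameterizedQuery {
  query: string;
  params: Record<string, string | number>;
}

// Builds the categories query without ever concatenating user input into SQL.
function buildCategoriesQuery(country?: string): ParameterizedQuery {
  const params: Record<string, string | number> = {};
  let query =
    "SELECT categories.primary AS category_primary, COUNT(1) AS count_places " +
    "FROM `bigquery-public-data.overture_maps.place` " +
    "WHERE categories.primary IS NOT NULL";
  if (country) {
    // The value is sent separately from the SQL text, so it cannot alter the query structure.
    query += " AND addresses.list[OFFSET(0)].element.country = @country";
    params.country = country;
  }
  query += " GROUP BY category_primary ORDER BY count_places DESC";
  return { query, params };
}
```

The returned object would be passed to the client as query options. Whether a wrapper such as the service's own query helper forwards `params` is an assumption that needs checking against the actual code.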

## Cost control

Rate limiting and caching should be used to control costs. We can use the free tier for Cloud Run and Cloud Storage to cache the data for faster response times. In production you should consider using Redis instead of Cloud Storage for caching, and migrating the parts of the dataset you need to a private BigQuery dataset or a different database for speed and cost, especially for building shapes.
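
To make the rate-limiting idea concrete, here is a minimal in-process fixed-window limiter keyed by API key. It is a sketch under stated assumptions, not how this project enforces limits (the README delegates per-key rate limits to theAuthAPI); `FixedWindowRateLimiter` and its parameters are hypothetical, and state held in memory would not be shared across Cloud Run instances.

```typescript
// Illustrative fixed-window rate limiter; not the repository's implementation.
class FixedWindowRateLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  // Per-key window start time and request count.
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed, false when the key has used up
  // its quota for the current window. `nowMs` is injectable to keep tests deterministic.
  allow(apiKey: string, nowMs: number = Date.now()): boolean {
    const current = this.windows.get(apiKey);
    if (!current || nowMs - current.start >= this.windowMs) {
      // No window yet, or the previous one has expired: start a new one.
      this.windows.set(apiKey, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count < this.maxRequests) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

A fixed window is the simplest variant and allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.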
45 changes: 41 additions & 4 deletions docs/google-cloud-platform.md
@@ -2,15 +2,52 @@

Google hosts the Overture Maps dataset in its public BigQuery dataset. This allows us to deploy the API to Google Cloud Platform (GCP) and take advantage of the free tier for hosting and querying the data. The $400 free credit is more than enough to get started and test the API. We can also use the free tier for Cloud Storage to cache the data for faster response times.

We can use the Cloud Run service to deploy the API. This allows us to scale the API based on demand and only pay for the resources we use. We can also use the Cloud Build service to automate the deployment process.
## Architecture

In production you should consider using Redis instead of Cloud storage for caching, and migrating the parts of the dataset you need to a private BigQuery dataset or a different database for speed and cost.
We can use the Cloud Run service to deploy the API. This allows us to scale the API based on demand and only pay for the resources we use. We can also use the Cloud Build service to automate the deployment process.

In production you should consider using Redis instead of Cloud Storage for caching, and migrating the parts of the dataset you need to a private BigQuery dataset or a different database for speed and cost, especially for building shapes.

## Setup

In this guide we will cover the following steps:

- Create GCP Account with free credit
- Authenticate with GCP
- Create GCS bucket for cache
- Deploy to Cloud Run via Cloud Build
- Apply Env Variables
- Fork the GitHub repo
- Set up a Service Account with the right permissions
- Deploy to Cloud Run by connecting it to your GitHub repo, applying the env vars, and having it use the service account

## API Key management

You can either use the hardcoded API key in the code, or use the Auth API by creating an account at theAuthAPI.com. There you create an Access Key for the app and add it as the `AUTH_API_ACCESS_KEY` env var. You can then create any number of API Keys for secure access to the API, and rate-limit them for cost control.

## Datasets

### BigQuery

- [Place](https://console.cloud.google.com/bigquery?project=bigquery-public-data&p=bigquery-public-data&d=overture_maps&t=place&page=table)

Example Query

```SQL
SELECT *
FROM `bigquery-public-data.overture_maps.place`
WHERE ST_DWithin(geometry, ST_GeogPoint(16.3738, 48.2082), 500)
```

### Service Account roles

For a service account in GCP that a Cloud Run instance will use to access BigQuery and Google Cloud Storage (GCS), you’ll need to grant it specific roles to ensure it has permissions to create and run BigQuery jobs, as well as read and write files in a GCS bucket. Here are the recommended roles:

BigQuery Permissions:

- BigQuery User (roles/bigquery.user): Grants permissions to create and run jobs in BigQuery.
- BigQuery Data Viewer (roles/bigquery.dataViewer): Allows the service account to view datasets and tables, if it needs access to view data.
- (Optional) BigQuery Job User (roles/bigquery.jobUser): A narrower role that only allows running jobs in the project. It is not needed in addition to bigquery.user, which already includes permission to create jobs.

Google Cloud Storage (GCS) Permissions:

- Storage Object Viewer (roles/storage.objectViewer): Grants read access to objects in the bucket.
- Storage Object Creator (roles/storage.objectCreator): Grants permission to write files to the bucket, including creating new objects.
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "overture-maps-api",
"version": "0.0.1",
"version": "0.0.2",
"description": "",
"author": "",
"private": true,
54 changes: 39 additions & 15 deletions src/bigquery/bigquery.service.ts
@@ -23,7 +23,7 @@ export class BigQueryService {
});
}

private parseRow(row: any): Place {
private parsePlaceRow(row: any): Place {
return {
id: row.id,
geometry: {
@@ -101,12 +101,13 @@ export class BigQueryService {
longitude?: number,
radius: number = 1000,
categories?: string[],
minimum_places: number = 10,
minimum_places?: number,
require_wikidata: boolean = false

): Promise<{ brand: string; wikidata: string; counts:{ places:number} }[]> {
): Promise<{ names: {primary:string,common:string,rules:string}; wikidata: string; counts:{ places:number} }[]> {

let query = `-- Overture Maps API: Get brands nearby
SELECT DISTINCT brand.names.primary AS brand, brand.wikidata AS wikidata, count(id) as count_places
SELECT DISTINCT brand , count(id) as count_places
FROM \`bigquery-public-data.overture_maps.place\`
`;

@@ -125,22 +126,17 @@ export class BigQueryService {
if (require_wikidata) {
query += ` AND brand.wikidata IS NOT NULL`;
}
query += ` GROUP BY brand, wikidata`;
query += ` GROUP BY ALL`;
if (minimum_places) {
query += ` HAVING count_places >= ${minimum_places}`;
}
query += ` ORDER BY count_places DESC;`;

const options = {
query: query,
location: 'US', // Adjust the location if necessary
};

const [rows] = await this.bigQueryClient.query(options);
const {rows} = await this.runQuery(query);

return rows.map((row: any) => ({
brand: row.brand,
wikidata: row.wikidata,
names: row.brand.names,
wikidata: row.brand.wikidata,
counts:{
places: row.count_places
}
@@ -165,6 +161,34 @@ export class BigQueryService {
}
}));
}

async getCategories(country?:string): Promise<{ primary: string; counts:{ places:number; brands:number } }[]> {
let query = `-- Overture Maps API: Get categories
SELECT DISTINCT categories.primary AS category_primary,
count(1) as count_places,
count(distinct brand.names.primary) as count_brands
FROM \`bigquery-public-data.overture_maps.place\`
WHERE categories.primary IS NOT NULL
`;
if (country) {
query += ` AND addresses.list[OFFSET(0)].element.country = "${country}"`
}

query += ` GROUP BY category_primary
ORDER BY count_places DESC;
`;

const {rows} = await this.runQuery(query);

return rows.map((row: any) => ({
primary: row.category_primary,
counts:{
places: row.count_places,
brands: row.count_brands
}
}));
}

async getPlacesNearby(
latitude: number,
longitude: number,
@@ -231,12 +255,12 @@ export class BigQueryService {
queryParts.push(`LIMIT ${this.applyMaxLimit(limit)}`);
}

// Finalize the query stbilledAmountInnGBring
// Finalize the query
const query = queryParts.join(' ') + ';';
this.logger.debug(`Running query: ${query}`);

const { rows } = await this.runQuery(query);
return rows.map((row: any) => this.parseRow(row));
return rows.map((row: any) => this.parsePlaceRow(row));
}

applyMaxLimit(limit: number): number {
2 changes: 1 addition & 1 deletion src/middleware/auth-api.middleware.ts
@@ -15,7 +15,7 @@ export class AuthAPIMiddleware implements NestMiddleware {
private theAuthAPI: TheAuthAPI;

constructor() {
if(process.env.AUTH_API_ACCESS_KEY)this.theAuthAPI = new TheAuthAPI(process.env.AUTH_API_ACCESS_KEY);
if(process.env.AUTH_API_ACCESS_KEY && process.env.AUTH_API_ACCESS_KEY!="create-one-from-theauthapi.com")this.theAuthAPI = new TheAuthAPI(process.env.AUTH_API_ACCESS_KEY);
}

async use(req: Request, res: Response, next: () => void) {
14 changes: 0 additions & 14 deletions src/places/dto/get-categories.dto.ts
@@ -9,18 +9,4 @@ export class GetCategoriesDto {
@MinLength(2)
country?: string; // ISO 3166 country code

@ValidateIf(o => !o.country)
@IsNumber()
lat?: number;

@ValidateIf(o => !o.country)
@IsNumber()
lng?: number;

@ValidateIf(o => !o.country)
@IsOptional()
@IsNumber()
@Min(1)
radius?: number = 1000; // Default radius is 1000 meters if not provided

}
25 changes: 20 additions & 5 deletions src/places/places.controller.ts
@@ -6,6 +6,7 @@ import { GetPlacesDto } from './dto/get-places.dto';
import { PlaceResponseDto } from './dto/place-response.dto';
import { GetBrandsDto } from './dto/get-brands.dto';
import { IsAuthenticatedGuard } from '../guards/is-authenticated.guard';
import { GetCategoriesDto } from './dto/get-categories.dto';

@Controller('places')
@UseGuards(IsAuthenticatedGuard)
@@ -48,11 +49,11 @@ export class PlacesController {

const cacheKey = `get-places-brands-${JSON.stringify(query)}`;

// Check if cached results exist in GCS
const cachedResult = await this.gcsService.getJSON(cacheKey);
if (cachedResult) {
return cachedResult;
}
// Check if cached results exist in GCS
const cachedResult = await this.gcsService.getJSON(cacheKey);
if (cachedResult) {
return cachedResult;
}

const brands = await this.bigQueryService.getBrandsNearby(country, lat, lng, radius, categories);

@@ -67,6 +68,20 @@ export class PlacesController {
}

const brands = await this.bigQueryService.getPlaceCountsByCountry();

return brands;
}

@Get('categories')
async getCategories(@Query() query: GetCategoriesDto) {

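// Include the query in the key so results for different countries are cached separately
const cacheKey = `get-places-categories-${JSON.stringify(query)}`;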
const cachedResult = await this.gcsService.getJSON(cacheKey);
if (cachedResult) {
return cachedResult;
}

const categories = await this.bigQueryService.getCategories(query.country);

return categories;
}