PyFulcrum Web application is a two-component application. One component is webhook receiver that can be configured in Fulcrum to receive updates on specific resources, second is API endpoints from PyFulcrum library exposed through HTTP (only fragment of the whole API), also with media files available. Both parts are available as Flask Blueprints objects, so they can be incorporated into bigger application if needed.
- PyFulcrum lib and it's requirements
- Flask
- web server to serve statics and web application (nginx + uwsgi are preferred)
PyFulcrum web application requires working installation of PyFulcrum lib. Following steps assume deployment will be using nginx and uwsgi packages.
-
Pefrorm steps from PyFulcrum lib installation, using
/usr/local/pyfulcrum
directory as a work directory and/usr/local/pyfulcrum/htdocs/storage
as storage directory. We will reuse database and virtualenv from that installation. -
Install PyFulcrum web application requirements and application itself
venv/bin/pip install -r repo/pyfulcrum/web/requirements.txt venv/bin/pip install -e repo/pyfulcrum/web/
-
Install uwsgi and nginx packages:
Debian-derived:
apt-get install uwsgi-plugin-python3 nginx
Or RedHat/CentOS:
yum install nginx uwsgi-plugin-python3
-
Configure web application. Web application requires configuration file to be present and to contain location of storage, Fulcrum API key and database url. See Web application configuration for details.
-
Configure uwsgi. UWSGI requires app configuration file. Sample file is in
repo/pyfulcrum/web/files/pyfulcrum-web.ini
. You need to adjust paths, socket location and plugin name if needed. Target location depends on your operating system, but usually Debian-derived systems use/etc/uwsgi/apps-enabled/
directory to store application configuration, while RH-based use/etc/uwsgi.d/
for the same purpose. Note that Debian-derived systems usually enforce socket path for uwsgi applications. -
Configure nginx. Nginx requires vhost configuration file. Sample file is in
repo/pyfulcrum/nginx.site.conf
. You need to adjustserver_name
,root
and upstream sockets locations to match your deployment. Note that API application doesn't require any authentication, so you may want to enforce user authentication in web server for this purpose.Target location depends on your operating system, but usually Debian-derived systems use
/etc/nginx/sites-enabled/
directory to store vhost configuration, while RH-based use/etc/nginx/conf.d
for the same purpose. -
Restart uwsgi and nginx services:
service uwsgi restart service nginx restart
Application deployed in this way can be updated in following way:
-
Go to repository directory and update code base:
cd repo/pyfulcrum git pull
-
Run migrations if needed:
cd repo/pyfulcrum/lib ../../venv/bin/alembic -c local-db.ini upgrade head
-
Update web application
main.py
file to enforce reload of uwsgi, so it will use new code:touch repo/pyfulcrum/web/src/pyfulcrum/web/main.py
PyFulcrum webapp blueprints require configuration data, which provides database url, path to storage/storage base url and Fulcrum API key. Due to a fact that components are implemented as blueprints, they expect configuration will be provided by application that incorporates them as blueprints. In any case, each blueprint requires specific keys to be present in current_app.config
mapping.
By default (using pyfulcrum.web.main
module) Flask application expects configuration to be provided as a file with key=value format, which will be parsed on application load. Each blueprint is indenpendent, so configuration may seem to be repeated. This is however to provide more flexibility in deployment and application composition.
Sample configuration is available in web/files/config.sample.cfg
.
Webhook blueprint allows to handle multiple webhooks. Each webhook has a distinct name, and can point to different location. Name can be any set of chars, and for security reasons (Fulcrum application doesn't provide any means to authenticate webhook call from their side, other than using unique token in webhook URL), it's recommended to use random string. Configuration expects name to be present in webhook config key, so keys are constructed using following template (we'll use $NAME
as a placeholder for actual name):
WEBHOOK_$NAME_DB="postgresql://pyfulcrum:pyfulcrum@localhost/pyfulcrum"
WEBHOOK_$NAME_CLIENT="XXXxxxXXXxxx"
WEBHOOK_$NAME_STORAGE="/path/to/storage;http://server/storage/"
WEBHOOK_$NAME_DB
is database url, similar to one provided in PyFulcrum lib.WEBHOOK_$NAME_CLIENT
is Fulcrum API keyWEBHOOK_$NAME_STORAGE
is a string containing path to storage and optional base url which is used to serve storage files via HTTP server. Both values are separated with;
sign.
Internally, webhook receives $NAME
as call parameter, then it retrives per-webhook configuration using pseudo-namespace constructed with WEBHOOK_$NAME_
string. If there's no configuration for given namespace, webhook call will be considered invalid. If configuration exists, it will be processed.
Webhook configuration allows to provide list of objects (for example, forms) that should be only/should not be processed by webhook. This is configurable with two config variables, WEBHOOK_$NAME_INCLUDE_OBJECTS
and WEBHOOK_$NAME_EXCLUDE_OBJECTS
. Each value in variable should have a a form resource_type:id1,id2,..;resource_type:id1,id2
. For example, to handle updates on two specific forms only, you should set variable to
WEBHOOK_$NAME_INCLUDE_OBJECTS="form:$form1_id,$form2_id"
where $form1_id
and $form2_id
are identifiers of those forms. This will also work for related resources, like record or media files. They will be processed, only if form they're referencing is one of those forms.
To ignore specific forms, you should add similar configuration to WEBHOOK_$NAME_EXCLUDE_OBJECTS
.
If both variables are provided in configuration, resource will be checked for include list first (so whitelist check will be performed first). If whitelist is empty, blacklist mode is performed.
API blueprint requires similar configuration paris as webhook, although only one API instance is created, so only one configuration key-value set is needed.
API_DB="postgresql://pyfulcrum:pyfulcrum@localhost/pyfulcrum"
API_CLIENT="XXXxxxXXXxxx"
API_STORAGE="/path/to/storage;http://server/storage/"
Webhook receiver allows to synchronize data in your Fulcrum account with local copy. Mind that your Fulcrum plan must enable webhooks (see developer documentation for details.). Follow instructions from Fulcrum to set up webhook there.
As stated above, there can be several webhook receivers with different configuration. Base webhook url is /webhook/$name/
, where $name
is a varying part, and corresponds to $NAME
part in webhook configuration. So, for example, if you have following configuration:
WEBHOOK_MYWEBHOOK1_DB="postgresql://pyfulcrum:pyfulcrum@localhost/pyfulcrum"
WEBHOOK_MYWEBHOOK1_CLIENT="XXXxxxXXXxxx"
WEBHOOK_MYWEBHOOK1_STORAGE="/path/to/storage;http://my.web.server/storage/"
and web application is deployed under http://my.web.server
, then your webhook url will be http://my.web.server/webhook/mywebhook/
.
When webhook is received (see webhooks documentation for details on event types and sample payloads), it's payload is parsed, and two values are extracted: event type (from type
path) and item id (from data.id
path). Internally, webhook receiver will call PyFulcrum API client to fetch fresh payload for given item id. Due to limited security around webhook dispatching, payload from webhook is not considered as secure, and so the only reasonable way to validate payload is to fetch object pointed in payload directly from Fulcrum API.
API web application is fairly simple, it's exposes PyFulcrum's library resources list method (pyfulcrum.lib.api.BaseResource.list
). API will return ONLY CACHED values, no live requests are made to Fulcrum API. If configured properly, media files links should be served by web server as well.
Following endpoints are available:
/api/forms/
- list of forms/api/records/
- list of records/api/projects/
- list of projects/api/photos/
- list of photos/api/videos/
- list of videos/api/audio/
- list of audio/api/signatures/
- list of signatures
Each endpoint can be served in varions formats. Format can be controlled by setting format
query param. By default, json
format is used. Data can be retrived in several formats:
Format | Description | Returns only spatial-enabled items |
---|---|---|
raw |
Raw Fulcrum API payload for objects stored | No |
json |
PyFulcrum-flavor of JSON for objects stored. This will contain processed media links. | No |
csv |
Returns CSV with all objects for resource, and it doesn't support paging. | No |
geojson |
This will return GeoJSON FeatureCollection with records that have proper spatial location set. Note, this will work only for Records. |
Yes |
kml |
This will return KML format with records that have proper spatial location set. Note, this will work only for Records, and it doesn't support paging. |
Yes |
shp |
this will return ESRI Shapefile format with records that have proper spatial location set. Note, this will work only for Records, and it doesn't support paging. |
Yes |
Summary of supported formats per resource type
Resource type | URL | formats | Spatial-aware | allowed filtering args |
---|---|---|---|---|
Forms | /api/forms/ |
raw , json , csv |
No | form_id |
Records | /api/records/ |
raw , json , geojson , kml , shp |
Yes | form_id , record_id , created_since , created_before , updated_since , updated_before |
Projects | /api/projects/ |
raw , json , csv |
No | - |
Photos | /api/photos/ |
raw , json , csv |
No | record_id , form_id |
Audio | /api/audio/ |
raw , json , csv |
No | record_id , form_id |
Videos | /api/videos/ |
raw , json , csv |
No | record_id , form_id |
Signatures | /api/signatures/ |
raw , json , csv |
No | record_id , form_id |
Additionally, each endpoint supports (excluding various exceptions) paging with following query params:
page
- 0-based page number index (default:0
)per_page
- number of items per page. Default is50
.
Response, if any json is used, usually comes within following envelope:
{"page": 0,
"per_page": 50,
"total_pages": 123,
"total": 6150,
"items": []}
With exception for GeoJSON
, which will return items as features
key.
- retrive list of forms as Fulcrum API payload:
GET http://your.server/api/forms?format=raw
- retrive list of records as Fulcrum API payload for given form:
GET http://your.server/api/records?format=raw&form_id=xxxxXXXxxxx
- retrive list of records as Fulcrum API payload for given form, 10th page (indexed from 0):
GET http://your.server/api/records?format=raw&form_id=xxxxXXXxxxx&page=9
- retrive list of records as shapefile for given form. Because this list response is paged, this will produce shapefile with 50 features on it.:
GET http://your.server/api/records?format=shp&form_id=xxxxXXXxxxx
- retrive list of at most 500 records as shapefile for given form. You can generate another 500 records by modifying
page
parameter:
GET http://your.server/api/records?format=shp&form_id=xxxxXXXxxxx&per_page=500&page=0
Note that spatial formats, especially shapefile and kml, take significant time to create. If number of records is too big, server may return timeout (http 502 error) instead.
- Retrive specific record from records list:
GET http://your.server/api/records/?format=json&form_id=xxxxXXXxxxxx&record_id=yyyyYYYyyyyy
Webhook implementation is synchronous. At the moment, it’s fast enough to work in that mode. However, if in the future this solution would turn out to be underperformant, part of webhook handling logic can be moved to task queue. Flask has quite good integration with Celery and RQ. In order to integrate existing code, pyfulcrum.web.webhooks._handle_webhook()
should be decorated with proper task decorator, and called from pyfulcrum.web.webhooks.fulcrum_call
as asynchronous task. This requires handling task queue configuration in web application that is calling webhook blueprint, and webhook code should be updated as well. Also, Redis would be needed as additional deployment component.