Providing convenience APIs
Spatial datasets are often shared through convenience APIs, so that they can be downloaded in parts or visualised directly in common tools such as QGIS, OpenLayers and Leaflet. The data service standards of the Open Geospatial Consortium (OGC) are designed for this purpose. These APIs give direct access to subsets or map visualisations of a dataset.
This section introduces various standardised APIs and then presents an approach to publishing datasets that builds on the data management approach introduced in the previous sections.
Standardised data APIs
Standardised mapping APIs, such as Web Map Service (WMS), Web Feature Service (WFS) and Web Coverage Service (WCS), date from the beginning of this century. In recent years several challenges have been identified around these standards, which led to a series of Spatial Data on the Web Best Practices. OGC then initiated a new generation of standards based on these best practices.
An overview of both generations:
OWS | OGC-API | Description |
---|---|---|
Web Map Service (WMS) | Maps | Provides a visualisation of a subset of the data |
Web Feature Service (WFS) | Features | API to request a subset of the vector features |
Web Coverage Service (WCS) | Coverages | API to interact with grid sources |
Sensor Observation Service (SOS) | SensorThings | Retrieve subsets of sensor observations |
Web Processing Service (WPS) | Processes | Run processes on data |
Catalogue Service for the Web (CSW) | Records | Retrieve and filter catalogue records |
Note that most mapping software supports the standards of both generations. However, because the OGC APIs were introduced only recently, expect occasional problems in their implementations.
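To make the difference between the two generations concrete, the sketch below builds a first-generation WMS GetMap request and a second-generation OGC API Features request for the same area. It only constructs URLs and sends nothing. The base URLs follow the local MapServer setup used later in this section; the layer/collection name `soil_ph` and the `f=json` format parameter are assumptions for illustration, not names taken from an actual service.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL (first generation: key-value parameters).

    bbox is (min_lon, min_lat, max_lon, max_lat). WMS 1.3.0 with EPSG:4326
    expects latitude first, so the values are reordered here.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "crs": "EPSG:4326",
        "bbox": f"{min_lat},{min_lon},{max_lat},{max_lon}",
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


def ogcapi_items_url(base_url, collection, bbox, limit=10):
    """Build an OGC API Features items URL (second generation: resource paths).

    The bbox stays in longitude, latitude order. The "f" parameter is a common
    convention for choosing the output format, assumed here.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "bbox": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "limit": limit,
        "f": "json",
    }
    return f"{base_url}/collections/{collection}/items?{urlencode(params)}"


# Hypothetical layer name and bounding box, for illustration only.
area = (33, -5, 42, 5)
print(wms_getmap_url("http://localhost/data", "soil_ph", area))
print(ogcapi_items_url("http://localhost/data/ogcapi", "soil_ph", area))
```

The first URL packs everything into query parameters on a single endpoint, while the second addresses the collection as a resource path and uses query parameters only for filtering.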
Setting up an API
MapServer is server software that can expose datasets through various APIs. Similar software includes QGIS Server, ArcGIS Server, GeoServer and pygeoapi.
We’ve selected MapServer for this training because of its robustness and low resource consumption. MapServer is configured through a configuration file called the mapfile. The mapfile defines metadata for the dataset and how users interact with it, mainly the colour scheme (legend) used to draw a map of the dataset.
Various tools exist to write these configuration files, such as MapServer Studio, GeoStyler and QGIS Bridge, as well as a Visual Studio plugin for editing mapfiles.
The GeoDataCrawler, introduced in a previous paragraph, also has an option to generate mapfiles. A big advantage of this approach is the integration with existing metadata. GeoDataCrawler will, during mapfile generation, use the existing metadata, but also update the metadata so it includes a link to the mapserver service endpoint. This step enables a typical workflow of:
- Users find a dataset in a catalogue
- Then open the dataset via the linked service
The reverse also works: from a mapping application, users can access the metadata describing a dataset.
Mapfile creation exercise
- In a shell, navigate to a folder containing data files.
- Verify that MCF files are available for the data files; if not, create initial metadata with
crawl-metadata --mode=init --dir=.
- Add an index.yml file to the folder. The metadata in this file is used in the mapfile to identify the service.
mcf:
    version: 1.0
identification:
    title: My new mapservice
    abstract: A map service for data about ...
contact:
    pointOfContact:
        organization: ISRIC
        email: info@isric.org
        url: https://www.isric.org
- Set some environment variables
# Linux / macOS (bash)
export pgdc_md_url="https://kenya.lsc-hubs.org/collections/metadata:main/items/{0}"
export pgdc_ms_url="http://localhost"
export pgdc_webdav_url="https://example.com/data"
# Windows (PowerShell)
$env:pgdc_md_url="https://kenya.lsc-hubs.org/collections/metadata:main/items/{0}"
$env:pgdc_ms_url="http://localhost"
$env:pgdc_webdav_url="https://example.com/data"
- Generate the mapfile
# Local installation
crawl-maps --dir=.
# Docker, Linux / macOS (bash)
docker run -it --rm -v "$(pwd):/tmp" \
  org/metatraining crawl-maps --dir=/tmp
# Docker, Windows (PowerShell)
docker run -it --rm -v "${PWD}:/tmp" `
  org/metatraining crawl-maps --dir=/tmp
- Index.yml may include a “robot” property, to guide the crawler in how to process the folder. This section can be used to add specific crawling behaviour.
mcf:
    version: 1.0
robot:
    skip-subfolders: True # tells the crawler not to descend into subfolders
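The exercise steps above can also be scripted, which helps when mapfiles need to be regenerated regularly. The following is a minimal sketch, assuming `crawl-maps` is installed locally and reads the three `pgdc_*` environment variables shown earlier. The helper names `crawl_maps_env`, `crawl_maps_command` and `run_crawl_maps` are hypothetical and not part of GeoDataCrawler.

```python
import os
import shutil
import subprocess


def crawl_maps_env(md_url, ms_url, webdav_url, base_env=None):
    """Return a copy of the environment with the crawler variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            "pgdc_md_url": md_url,
            "pgdc_ms_url": ms_url,
            "pgdc_webdav_url": webdav_url,
        }
    )
    return env


def crawl_maps_command(data_dir="."):
    """Return the crawl-maps command line as an argument list."""
    return ["crawl-maps", f"--dir={data_dir}"]


def run_crawl_maps(data_dir, env):
    """Run crawl-maps if it is on the PATH; return False when it is missing."""
    cmd = crawl_maps_command(data_dir)
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, env=env, check=True)
    return True


env = crawl_maps_env(
    "https://kenya.lsc-hubs.org/collections/metadata:main/items/{0}",
    "http://localhost",
    "https://example.com/data",
)
print("Would run:", " ".join(crawl_maps_command(".")))
```

Calling `run_crawl_maps(".", env)` then generates the mapfile for the current folder, or returns `False` if the tool is not installed.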
You can test this mapfile locally if you have MapServer installed. On Windows, consider using conda or ms4w.
conda install -c conda-forge mapserver
MapServer includes a map2img utility, which renders a map image from any mapfile.
map2img -m ./mymap.map -o test.png
Set up MapServer via Docker
For this exercise we’re using a MapServer image available from Docker Hub.
docker pull camptocamp/mapserver:master
First create a config file, which we’ll mount as a volume into the container. In this config file we list all the mapfiles we aim to publish from our container. Download the default config file. Open the file, then uncomment and populate the MAPS section:
MAPS
"data" "/srv/data/data.map"
END
Also uncomment the OGCAPI templates section:
OGCAPI_HTML_TEMPLATE_DIRECTORY "/usr/local/share/mapserver/ogcapi/templates/html-bootstrap4/"
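When a folder holds several mapfiles, the MAPS section can be generated rather than typed by hand. The sketch below assumes the mapfiles are mounted under `/srv/data` in the container, as in this exercise, and uses each file name without its extension as the key; the key is the path segment under which the service is published (`data` gives `http://localhost/data`). The function name `maps_section` is hypothetical.

```python
from pathlib import PurePosixPath


def maps_section(mapfile_names, container_dir="/srv/data"):
    """Build the MAPS block of the MapServer config file.

    Each mapfile is registered under a key equal to its file name without
    the extension, pointing at its location inside the container.
    """
    lines = ["MAPS"]
    for name in sorted(mapfile_names):
        key = PurePosixPath(name).stem
        path = PurePosixPath(container_dir) / name
        lines.append(f'  "{key}" "{path}"')
    lines.append("END")
    return "\n".join(lines)


print(maps_section(["data.map"]))
```

For the single mapfile of this exercise the output matches the MAPS block shown above; the file names could equally come from listing `*.map` files in the data folder.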
The next command mounts the data folder, which includes the config file, and sets the port and the config file the container runs with:
# Linux / macOS (bash)
docker run -p 80:80 \
  -e MAPSERVER_CONFIG_FILE=/srv/data/mapserver.conf \
  -v "$(pwd):/srv/data" \
  camptocamp/mapserver:master
# Windows (PowerShell)
docker run -p 80:80 `
  -e MAPSERVER_CONFIG_FILE=/srv/data/mapserver.conf `
  -v "${PWD}:/srv/data" `
  camptocamp/mapserver:master
Open http://localhost/data/ogcapi in your browser. If everything is set up correctly, it shows the OGC API landing page of the service. If not, check the container logs for errors.
You can also try the URL in QGIS. Add a WMS layer using the service URL http://localhost/data?request=GetCapabilities&service=WMS.
Notice the links to metadata when you open GetCapabilities in a browser.
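The metadata links can also be read programmatically. The sketch below parses a WMS 1.3.0 capabilities document with the Python standard library and collects the `MetadataURL` links per named layer. The embedded sample document, the layer name `soil_ph` and the function name `metadata_links` are made up for illustration; a real response from MapServer contains many more elements.

```python
import xml.etree.ElementTree as ET

# Namespaces used by WMS 1.3.0 capabilities documents.
NS = {"wms": "http://www.opengis.net/wms"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Trimmed, made-up sample of a capabilities response.
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.3.0">
  <Capability>
    <Layer>
      <Title>My new mapservice</Title>
      <Layer>
        <Name>soil_ph</Name>
        <MetadataURL type="TC211">
          <Format>text/xml</Format>
          <OnlineResource xlink:type="simple"
              xlink:href="https://example.com/metadata/soil_ph"/>
        </MetadataURL>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def metadata_links(capabilities_xml):
    """Map each named layer to the list of its metadata URLs."""
    root = ET.fromstring(capabilities_xml)
    links = {}
    for layer in root.iterfind(".//wms:Layer", NS):
        name = layer.findtext("wms:Name", default=None, namespaces=NS)
        if name is None:
            continue  # grouping layers without a name cannot be requested
        resources = layer.iterfind("wms:MetadataURL/wms:OnlineResource", NS)
        hrefs = [res.get(XLINK_HREF) for res in resources]
        links[name] = [href for href in hrefs if href]
    return links


print(metadata_links(SAMPLE))
```

To run this against the local service, the bytes returned by `urllib.request.urlopen` for the GetCapabilities URL can be passed to `metadata_links` instead of `SAMPLE`.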
In recent years browsers have become stricter, to prevent abuse. It is therefore important to consider common connectivity aspects when setting up a new service. Websites served over https can only embed content from other https services, so the service itself should use https. Cross-Origin Resource Sharing (CORS) and Cross-Origin Read Blocking (CORB) can limit access to embedded resources from remote servers. Sending proper CORS headers and a correct Content-Type header helps prevent CORS and CORB errors.
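These checks can be expressed in a few lines. The sketch below is a simplified approximation of what a browser verifies, not a full implementation of the CORS or mixed-content rules: it inspects a dictionary of response headers and a pair of URLs. The function names and the example origin are hypothetical.

```python
def cors_allows(headers, origin):
    """Return True if the response headers allow requests from the origin.

    Simplified: only Access-Control-Allow-Origin is inspected.
    """
    normalised = {key.lower(): value for key, value in headers.items()}
    allowed = normalised.get("access-control-allow-origin")
    return allowed == "*" or allowed == origin


def has_content_type(headers):
    """Return True if the response declares a non-empty Content-Type."""
    normalised = {key.lower(): value for key, value in headers.items()}
    return bool(normalised.get("content-type", "").strip())


def is_mixed_content(page_url, resource_url):
    """Return True if an https page tries to embed a plain http resource."""
    return page_url.startswith("https://") and resource_url.startswith("http://")


response_headers = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "*",
}
print(cors_allows(response_headers, "https://viewer.example.com"))
print(has_content_type(response_headers))
print(is_mixed_content("https://viewer.example.com", "http://localhost/data"))
```

The last check shows why the `http://localhost` endpoint from this exercise would be blocked when embedded in a page served over https.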
GeoDataCrawler uses a default (grey) style for vector data and an average classification for grids. You can fine-tune the styling of layers through the robot section in index.yml.
Summary
This section introduced the standards of the Open Geospatial Consortium and showed how you can publish your data according to these standards using MapServer. In the next section we’ll look at measuring service quality.