Exercise 1 - Your first dataset
In this section you are going to publish a vector dataset.
For this exercise, we will use a CSV dataset of bathing waters in Estonia, kindly provided by the Estonian Health Board.
You can find this dataset in workshop/exercises/data/tartu/bathingwater-estonia.csv.
This exercise consists of adjusting workshop/exercises/pygeoapi/pygeoapi.config.yml to define this dataset as an OGC API - Features collection.
Verify the existing Docker Compose config
Before making any changes, we will make sure that the initial Docker Compose setup provided to you is actually working.
To test:
Test the workshop configuration
- In a terminal shell navigate to the workshop folder and type:
cd workshop/exercises
docker compose up
- Open http://localhost:5000 in your browser and verify that some collections are listed
- Stop the container by typing CTRL-C
Note
You may also run the Docker container in the background (detached) as follows:
docker compose up -d
docker ps # verify that the pygeoapi container is running
# visit http://localhost:5000 in your browser, verify some collections
docker logs --follow pygeoapi # view logs
docker compose down --remove-orphans
Publish first dataset
You are now ready to publish your first dataset.
Setting up the pygeoapi config file
- Open the file workshop/exercises/pygeoapi/pygeoapi.config.yml in your text editor
- Look for the commented config section starting with # START - EXERCISE 1 - Your First Collection
- Uncomment all lines until # END - EXERCISE 1 - Your First Collection
Make sure that the indentation aligns (hint: directly under # START ...).
The config section reads:
[Configuration listing not shown: lines 185 to 211 of pygeoapi.config.yml]
The most relevant part is the providers section. Here, we define a CSV provider and point its file path to the /data directory, which we will mount (see next) from the local directory into the Docker container. Because a CSV file is not a spatial format, we explicitly configure pygeoapi so that the longitude and latitude (x and y) are mapped from the columns lon and lat in the CSV file. Notice the storage_crs parameter, which indicates the coordinate reference system used in the source data.
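To make the column mapping concrete, the sketch below shows roughly what it means to turn a CSV row with lon and lat columns into a GeoJSON point feature. It is an illustration only, not pygeoapi's actual CSV provider code. The sample rows, the id and name columns, and the function name csv_to_features are assumptions made for this example.

```python
import csv
import io

# Hypothetical sample data; the real file's columns and values may differ.
SAMPLE_CSV = """id,name,lon,lat
1,Sample beach A,26.72,58.38
2,Sample beach B,24.75,59.44
"""


def csv_to_features(text, x_field="lon", y_field="lat", id_field="id"):
    """Convert CSV rows into GeoJSON-like point features."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        # GeoJSON orders coordinates as [x, y], i.e. [longitude, latitude].
        coordinates = [float(row[x_field]), float(row[y_field])]
        # Every column that is not geometry or identifier becomes a property.
        properties = {
            key: value
            for key, value in row.items()
            if key not in (x_field, y_field, id_field)
        }
        features.append({
            "type": "Feature",
            "id": row[id_field],
            "geometry": {"type": "Point", "coordinates": coordinates},
            "properties": properties,
        })
    return features


features = csv_to_features(SAMPLE_CSV)
print(features[0]["geometry"])
```

If the x and y columns were swapped in the configuration, the points would end up with latitude and longitude reversed, which is a common cause of features appearing in the wrong place on the map.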
Tip
To learn more about the pygeoapi configuration syntax and conventions see the relevant chapter in the documentation.
Tip
pygeoapi includes numerous data providers which enable access to a variety of data formats. Via the OGR/GDAL plugin the number of supported formats is almost limitless. Consult the data provider page to learn how to set up a connection to your dataset of choice. You can always copy a relevant example configuration and place it in the resources section of the pygeoapi configuration file for your future project.
Test
Start with updated configuration
- Start by typing
docker compose up
- Observe logging output
- If no errors: open http://localhost:5000
- Look for the Point of interest collection
- Browse through the items of the collection
- Check the JSON representation by adding ?f=json to the URL (or click 'json' in the top right)
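The same check can be scripted. The following is a minimal sketch, assuming the server runs at http://localhost:5000 and follows the usual OGC API - Features path layout (/collections/{id}/items). The collection id "my-collection" is a placeholder, and the helper names items_url and fetch_items are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"


def items_url(collection_id, limit=5):
    """Build the items URL of a collection, requesting JSON output."""
    query = urlencode({"f": "json", "limit": limit})
    return f"{BASE_URL}/collections/{collection_id}/items?{query}"


def fetch_items(collection_id, limit=5):
    """Fetch items as a GeoJSON FeatureCollection (needs a running server)."""
    with urlopen(items_url(collection_id, limit), timeout=10) as response:
        return json.load(response)


# "my-collection" is a placeholder; use the id shown at /collections.
print(items_url("my-collection"))
```

With the container running, calling fetch_items with the real collection id should return a dictionary whose "features" list holds the items you browsed in the HTML view.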
Debugging configuration errors
Occasionally you may run into errors. The most common causes are briefly discussed here:
- A file cannot be found, for example because of a typo in the configuration
- The format or structure of the spatial file is not fully supported
- The port (5000) is already taken. Is a previous pygeoapi still running? If you change the port, remember that you also have to update the pygeoapi config file
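For the last item, one quick way to find out whether something is already listening on the port is a small TCP connection test. This is a minimal sketch using only the Python standard library; the function name port_in_use is an assumption for this example, and it only tells you that the port is occupied, not which program occupies it.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if port_in_use(5000):
    print("Port 5000 is in use; is a previous pygeoapi still running?")
else:
    print("Port 5000 is free.")
```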
There are two parameters in the logging section of the configuration file which help to address these issues.
Set the logging level to DEBUG and indicate a path to a log file.
Tip
On Docker, set the path of the logfile to the mounted folder, so you can easily access it from your host system. You can also view the console logs from your Docker container as follows:
docker logs --follow pygeoapi
Tip
Errors related to file paths typically happen on initial setup. However, they may also happen at unexpected moments, resulting in a broken service. Products such as GeoHealthCheck aim to monitor service health and availability, detect problems, and send notifications. The OGC API - Features tests in GeoHealthCheck poll the availability of the service at intervals. Consult the GeoHealthCheck documentation for more information.