.. _developers_reference: Reference ========= Technical reference documentation that may be useful. Architecture ------------ The GPP Publicatiebank is set up as a rather simple RESTful JSON API, consisting of two major parts: * Metadata * Publications **Metadata** The metadata endpoints allow reading (and in some cases writing) of metadata used by/in publications, such as information categories (defined in legislation), available organisations and themes. Some of the metadata is defined by a national body and published in JSON+LD lists. These are automatically distributed and loaded in GPP Publicatiebank instances. Via the admin, custom items can be added and modified, but the official records may not be modified. **Publications** Publications are containers for documents - they hold metadata and one or more documents that have been (or are in the process of) been made publicly accessible. The metadata is stored in a relations database (see the schema below), while the actual binary content of the documents is persisted in a Documenten API (VNG standard). Typically, we leverage Open Zaak for this (in development). Publication and document resources have a particular publication status - at the time of writing these are ``concept``, ``published`` and ``revoked``. Published resources are pushed to the GPP-zoeken API to be included in an Elastic Search index to power the GPP Burgerportaal for search. When a resource is revoked, it is removed from the index again. Index operations (and by extension the interaction with GPP-zoeken) is done asynchronously using Celery tasks, after the database transaction is committed. Document uploads ---------------- Documents within a publication hold pointers to binary content, representing the actual downloadable document being published. There are two ways to provide documents to the publicatiebank. The publicatiebank persists these documents in a Documents API service, which is :ref:`configured in the global settings `. Direct uploads ~~~~~~~~~~~~~~ In its simplest form, frontends/users can upload a document from disk to the publicatiebank. In this scenario, the metadata of the document must be created first, after which the expected file parts must be uploaded. Multiple file parts can be uploaded in parallel. The publicatiebank ensures the file is stored in the underlying Documents API. While the document is being uploaded, requests to the download endpoint will respond with an HTTP 409 (Conflict) status. Remote document retrieval ~~~~~~~~~~~~~~~~~~~~~~~~~ Alternatively, a document can be retrieved from a(nother) Documents API. GPP-publicatiebank will then proceed to download it and save a copy in its own configured Documents API. The source system can then safely destroy the document as part of archiving requirements. Using this approach, the API client indicates that a document URL will be provided, and it must provide the document URL where the file can be downloaded. GPP-publicatiebank expects to receive binary content when retrieving this URL with a GET request, matching the behaviour of the `Documents API `_ standard. The download from the remote URL and upload to the own Documents API take place in a background job. While this job is pending, attempting to download the document will result in an HTTP 409 (Conflict) response. Clients can refresh/poll the ``Document`` detail endpoint and check the ``uploadComplete`` property. .. note:: Any services/endpoints from which documents may be downloaded *must* be registered in the :ref:`admin interface ` *before* attempting to use them, so that: * the API credentials have been sorted out * server-side request forgeries are mitigated Database schema --------------- You can right-mouse button click and then click "open in new tab" to see the image in full resolution. .. graphviz:: _assets/db_schema.dot