Aller au contenu

Setting up a recommended dataset with access control

Learning outcomes

By the end of this practical session, you will be able to:

  • create a new dataset with data policy 'recommended'
  • add an access token to the dataset
  • validate the dataset can not be accessed without the access token
  • add the access token to HTTP headers to access the dataset

Introduction

Datasets that are not considered 'core' dataset in WMO can optionally be configured with an access control policy. wis2box provides a mechanism to add an access token to a dataset which will prevent users from downloading data unless they supply the access token in the HTTP headers.

Preparation

Ensure you have SSH access to your student VM and that your wis2box instance is up and running.

Make sure you are connected to the MQTT broker of your wis2box instance using MQTT Explorer. You can use the public credentials everyone/everyone to connect to the broker.

Ensure you have a web browser open with the wis2box-webapp for your instance by going to http://<your-host>/wis2box-webapp.

Go to the 'dataset editor' page in the wis2box-webapp and create a new dataset. Use the same centre-id as in the previous practical sessions and use the template='surface-weather-observations/synop'.

You may get a pop-up message that there already is a dataset with the same metadata identifier:

provide-a-new-dataset-id

Click 'OK' to proceed.

In the dataset editor, set the data policy to 'recommended' (note that this will update the identifier and replace 'core' with 'reco') and fill all the required fields.

Submit the dataset and check that the new dataset is created in the wis2box-webapp.

Check MQTT-explorer to see that you receive the WIS2 Notification Message announcing the new Discovery Metadata record on the topic origin/a/wis2/<your-centre-id>/metadata.

Exercise 2: add an access token to the dataset

Login to the wis2box-management container,

cd ~/wis2box-1.0b8
python3 wis2box-ctl.py login

From command line inside the container you can secure a dataset using the wis2box auth add-token command, using the flag --metadata-id to specify the metadata-identifier of the dataset and the access token as an argument.

For example, to add the access token S3cr3tT0k3n to the dataset with metadata-identifier urn:wmo:md:not-my-centre:core.surface-based-observations.synop:

wis2box auth add-token --metadata-id urn:wmo:md:not-my-centre:reco.surface-based-observations.synop S3cr3tT0k3n

Exit the wis2box-management container:

exit

Exercise 3: publish some data to the dataset

Copy the file exercise-materials/access-control-exercises/aws-example2.csv to the directory defined by WIS2BOX_HOST_DATADIR in your wis2box.env:

cp ~/exercise-materials/access-control-exercises/aws-example2.csv ~/wis2box-data

Then use WinSCP or a command line editor to edit the file aws-example2.csv and update the WIGOS-station-identifiers in the input-data to match the stations you have in your wis2box instance.

Next, go to the station-editor in the wis2box-webapp. For each station you used in aws-example2.csv, update the 'topic' field to match the 'topic' of the dataset you created in the previous exercise.

This station will now be associated to 2 topics, one for the 'core' dataset and one for the 'recommended' dataset:

edit-stations-add-topics

You will need to use your token for collections/stations to save the updated station data.

Next, login to the wis2box-management container:

cd ~/wis2box-1.0b8
python3 wis2box-ctl.py login

From the wis2box command line we can ingest the sample data file aws-example2.csv into a specific dataset as follows:

wis2box data ingest -p /data/wis2box/aws-example2.csv --metadata-id urn:wmo:md:not-my-centre:reco.surface-based-observations.synop

Make sure to provide the correct metadata-identifier for your dataset and check that you receive WIS2 data-notifications in MQTT Explorer, on the topic origin/a/wis2/<your-centre-id>/data/recommended/surface-based-observations/synop.

Check the canonical link in the WIS2 Notification Message and copy/paste the link to the browser to try and download the data.

You should see a 403 Forbidden error.

Exercise 5: add the access token to HTTP headers to access the dataset

In order to demonstrate that the access token is required to access the dataset we will reproduce the error you saw in the browser using the command line function wget.

From the command line in your student VM, use the wget command with the canonical-link you copied from the WIS2 Notification Message.

wget <canonical-link>

You should see that the HTTP request returns with 401 Unauthorized and the data is not downloaded.

Now add the access token to the HTTP headers to access the dataset.

wget --header="Authorization: Bearer S3cr3tT0k3n" <canonical-link>

Now the data should be downloaded successfully.

Conclusion

Congratulations!

In this practical session, you learned how to:

  • create a new dataset with data policy 'recommended'
  • add an access token to the dataset
  • validate the dataset can not be accessed without the access token
  • add the access token to HTTP headers to access the dataset