Downloading data from WIS2 using wis2downloader
Learning outcomes!
By the end of this practical session, you will be able to:
- use the "wis2downloader" to subscribe to WIS2 data notifications and download data to your local system
- view the status of the downloads in the Grafana dashboard
- configure the wis2downloader to subscribe to a non-default broker
Introduction
In this session you will learn how to set up a subscription to a WIS2 Broker and automatically download data to your local system using the wis2downloader service included in wis2box.
About wis2downloader
The wis2downloader is also available as a standalone service that can be run on a different system from the one that is publishing the WIS2 notifications. See wis2downloader for more information on using the wis2downloader as a standalone service.
If you would like to develop your own service for subscribing to WIS2 notifications and downloading data, you can use the wis2downloader source code as a reference.
Preparation
Before starting, please log in to your student VM and ensure your wis2box instance is up and running.
wis2downloader basics
The wis2downloader is included as a separate container in wis2box as defined in the Docker Compose files. The Prometheus container in wis2box is configured to scrape metrics from the wis2downloader container and these metrics can be visualized by a dashboard in Grafana.
Viewing the wis2downloader dashboard in Grafana
Open a web browser and navigate to the Grafana dashboard for your wis2box instance by going to http://YOUR-HOST:3000.
Click on dashboards in the left-hand menu, and then select the wis2downloader dashboard.
You should see the following dashboard:
This dashboard is based on metrics published by the wis2downloader service and will show you the status of the downloads that are currently in progress.
In the top-left corner you can see the subscriptions that are currently active.
Keep this dashboard open as you will use it to monitor the download progress in the next exercise.
Reviewing the wis2downloader configuration
The wis2downloader service in wis2box can be configured using the environment variables defined in your wis2box.env file.
The following environment variables are used by the wis2downloader:
- DOWNLOAD_BROKER_HOST: The hostname of the MQTT broker to connect to. Defaults to globalbroker.meteo.fr
- DOWNLOAD_BROKER_PORT: The port of the MQTT broker to connect to. Defaults to 443 (HTTPS for websockets)
- DOWNLOAD_BROKER_USERNAME: The username to use to connect to the MQTT broker. Defaults to everyone
- DOWNLOAD_BROKER_PASSWORD: The password to use to connect to the MQTT broker. Defaults to everyone
- DOWNLOAD_BROKER_TRANSPORT: The transport mechanism to use to connect to the MQTT broker, either websockets or tcp. Defaults to websockets.
- DOWNLOAD_RETENTION_PERIOD_HOURS: The retention period in hours for the downloaded data. Defaults to 24
- DOWNLOAD_WORKERS: The number of download workers to use. Defaults to 8. Determines the number of parallel downloads.
- DOWNLOAD_MIN_FREE_SPACE_GB: The minimum free space in GB to keep on the volume hosting the downloads. Defaults to 1.
To review the current configuration of the wis2downloader, you can use the following command:
cat ~/wis2box/wis2box.env | grep DOWNLOAD
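As a rough illustration of what the command above shows, the following Python sketch takes the text of an env file and returns the effective DOWNLOAD_ settings, falling back to the defaults listed in this section. The function name effective_download_settings is hypothetical, and the sketch assumes simple KEY=value lines; it is not how the wis2downloader itself reads its configuration.

```python
# Defaults as listed in this section of the training material.
DEFAULTS = {
    "DOWNLOAD_BROKER_HOST": "globalbroker.meteo.fr",
    "DOWNLOAD_BROKER_PORT": "443",
    "DOWNLOAD_BROKER_USERNAME": "everyone",
    "DOWNLOAD_BROKER_PASSWORD": "everyone",
    "DOWNLOAD_BROKER_TRANSPORT": "websockets",
    "DOWNLOAD_RETENTION_PERIOD_HOURS": "24",
    "DOWNLOAD_WORKERS": "8",
    "DOWNLOAD_MIN_FREE_SPACE_GB": "1",
}


def effective_download_settings(env_text):
    """Return DOWNLOAD_* settings from env-file text, with defaults applied.

    Hypothetical helper for illustration only.
    """
    settings = dict(DEFAULTS)
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("DOWNLOAD_"):
            settings[key] = value.strip()
    return settings
```

To try it on your own configuration, read the contents of your wis2box.env file into a string and pass that string to the function.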
Review the configuration of the wis2downloader
What is the default MQTT broker that the wis2downloader connects to?
What is the default retention period for the downloaded data?
Click to reveal answer
The default MQTT broker that the wis2downloader connects to is globalbroker.meteo.fr.
The default retention period for the downloaded data is 24 hours.
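To make the retention period concrete, here is a minimal Python sketch that lists the files under a downloads directory whose modification time is older than a given number of hours. This section does not describe how the wis2downloader actually prunes old files, so the function expired_files and its use of file modification times are assumptions made purely for illustration.

```python
import time
from pathlib import Path


def expired_files(download_root, retention_hours=24, now=None):
    """List files older than the retention period (hypothetical helper).

    Assumption: age is judged by file modification time.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    return sorted(
        path
        for path in Path(download_root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```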
Updating the configuration of the wis2downloader
To update the configuration of the wis2downloader, you can edit the wis2box.env file. To apply the changes you can re-run the start command for the wis2box-stack:
python3 wis2box-ctl.py start
You will then see the wis2downloader service restart with the new configuration.
You can keep the default configuration for the next exercise.
wis2downloader command line interface
To access the command line interface of the wis2downloader within wis2box, you can log in to the wis2downloader container using the following command:
python3 wis2box-ctl.py login wis2downloader
Use the following command to list the subscriptions that are currently active:
wis2downloader list-subscriptions
This command returns an empty list, as no subscriptions have been set up yet.
Download GTS-data using a WIS2 Global Broker
If you kept the default configuration of the wis2downloader, it is currently connected to the WIS2 Global Broker hosted by Météo-France.
Set up the subscription
Use the following command to subscribe to the topic cache/a/wis2/de-dwd-gts-to-wis2/# and receive the data published by the GTS-to-WIS2 gateway hosted by DWD, made available through the Global Caches:
wis2downloader add-subscription --topic cache/a/wis2/de-dwd-gts-to-wis2/#
Then exit the wis2downloader container by typing exit:
exit
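The trailing # in the topic you subscribed to is the MQTT multi-level wildcard: it matches every topic below cache/a/wis2/de-dwd-gts-to-wis2/. The short Python sketch below shows how such a filter is evaluated against a concrete topic, also handling the single-level wildcard +. The function topic_matches is a simplified illustration written for this exercise (it ignores special cases such as topics starting with $) and is not part of the wis2downloader.

```python
def topic_matches(subscription, topic):
    """Return True if an MQTT topic filter matches a concrete topic.

    Simplified illustration: '+' matches exactly one level,
    '#' matches all remaining levels (including none).
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(sub_levels):
        if level == "#":
            return True
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    # Without a '#', both must have the same number of levels.
    return len(sub_levels) == len(topic_levels)
```

For example, topic_matches("cache/a/wis2/de-dwd-gts-to-wis2/#", topic) is true for any topic published under the DWD gateway's centre-id and false for topics of other centres.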
Check the downloaded data
Check the wis2downloader dashboard in Grafana to see the new subscription added. Wait a few minutes and you should see the first downloads starting. Go to the next exercise once you have confirmed that the downloads are starting.
The wis2downloader service in wis2box stores the downloaded data in the 'downloads' directory inside the directory you defined as WIS2BOX_HOST_DATADIR in your wis2box.env file. To view the contents of the downloads directory, use the following command:
ls -R ~/wis2box-data/downloads
Note that the downloaded data is stored in directories named after the topic the WIS2 Notification was published on.
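Because the directory layout mirrors the topic hierarchy, the recursive listing can get long. As an optional alternative, the following Python sketch summarises a downloads directory by counting the files found under each topic directory. The function name count_files_per_topic is hypothetical, and the path you pass in is whatever downloads directory you are using.

```python
from collections import Counter
from pathlib import Path


def count_files_per_topic(download_root):
    """Count downloaded files per directory, relative to the downloads root.

    Hypothetical helper: each key is a directory path that, per the
    layout described above, corresponds to a notification topic.
    """
    root = Path(download_root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            counts[path.parent.relative_to(root).as_posix()] += 1
    return dict(counts)
```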
Viewing the downloaded data
What directories do you see in the downloads directory?
Can you see any downloaded files in these directories?
Click to reveal answer
You should see a directory structure starting with cache/a/wis2/de-dwd-gts-to-wis2/ underneath which you will see more directories named after the GTS-bulletin headers of the downloaded data.
Depending on when you started the subscription, you may or may not see any downloaded files in this directory yet. If you do not see any files yet, wait a few more minutes and check again.
Let's clean up the subscription and the downloaded data before moving to the next exercise.
Log back in to the wis2downloader container:
python3 wis2box-ctl.py login wis2downloader
Then remove the subscription you created, using the following command:
wis2downloader remove-subscription --topic cache/a/wis2/de-dwd-gts-to-wis2/#
Remove the downloaded data using the following command:
rm -rf /wis2box-data/downloads/cache/*
Then exit the wis2downloader container by typing exit:
exit
Check the wis2downloader dashboard in Grafana to see the subscription removed. You should see the downloads stopping.
About GTS-to-WIS2 gateways
There are currently two GTS-to-WIS2 gateways publishing data through the WIS2 Global Broker and the Global Caches:
- DWD (Germany): centre-id=de-dwd-gts-to-wis2
- JMA (Japan): centre-id=jp-jma-gts-to-wis2
If you replace de-dwd-gts-to-wis2 with jp-jma-gts-to-wis2 in the previous exercise, you will receive the notifications and data published by the JMA GTS-to-WIS2 gateway.
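The centre-id is the fourth level of the topics used in this session (for example cache/a/wis2/de-dwd-gts-to-wis2/...). The Python sketch below builds the wildcard topic for a given gateway and extracts the centre-id from a topic. Both function names are hypothetical and only illustrate the topic layout seen in the examples on this page.

```python
def gateway_topic(centre_id, channel="cache"):
    """Build the wildcard topic for all data from one centre-id."""
    return f"{channel}/a/wis2/{centre_id}/#"


def centre_id_from_topic(topic):
    """Return the centre-id, i.e. the fourth level of a WIS2 topic."""
    levels = topic.split("/")
    if len(levels) < 4 or levels[2] != "wis2":
        raise ValueError(f"not a WIS2 topic: {topic!r}")
    return levels[3]
```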
Origin vs cache topics
When subscribing to a topic starting with origin/, you will receive notifications with a canonical URL that points to a data server provided by the WIS Centre publishing the data.
When subscribing to a topic starting with cache/, you will receive multiple notifications for the same data, one for each Global Cache. Each notification will contain a canonical URL that points to the data server of the respective Global Cache. The wis2downloader will download the data from the first canonical URL it can reach.
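The idea of downloading each item only once, from the first cache that responds, can be sketched in Python as follows. This is not the wis2downloader implementation. It assumes each notification is a dictionary with a properties.data_id field and a links list containing an entry with rel set to "canonical", and it takes a caller-supplied fetch function (hypothetical) that returns the data or raises OSError when the server cannot be reached.

```python
def canonical_url(notification):
    """Return the href of the first link with rel 'canonical', if any."""
    for link in notification.get("links", []):
        if link.get("rel") == "canonical":
            return link.get("href")
    return None


def download_once(notifications, fetch):
    """Download each data_id once, from the first reachable canonical URL.

    'fetch' is a hypothetical callable: fetch(url) -> bytes, raising
    OSError when the URL cannot be reached.
    """
    downloaded = {}
    for notification in notifications:
        data_id = notification["properties"]["data_id"]
        if data_id in downloaded:
            continue  # already fetched via another Global Cache
        url = canonical_url(notification)
        if url is None:
            continue
        try:
            downloaded[data_id] = fetch(url)
        except OSError:
            continue  # unreachable: a later notification may succeed
    return downloaded
```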
Download example data from the WIS2 Training Broker
In this exercise, you will subscribe to the WIS2 Training Broker which is publishing example data for training purposes.
Change the wis2downloader configuration
This demonstrates how to subscribe to a broker that is not the default broker and will allow you to download some data published from the WIS2 Training Broker.
Edit the wis2box.env file and change DOWNLOAD_BROKER_HOST to wis2training-broker.wis2dev.io, DOWNLOAD_BROKER_PORT to 1883 and DOWNLOAD_BROKER_TRANSPORT to tcp:
# downloader settings
DOWNLOAD_BROKER_HOST=wis2training-broker.wis2dev.io
DOWNLOAD_BROKER_PORT=1883
DOWNLOAD_BROKER_USERNAME=everyone
DOWNLOAD_BROKER_PASSWORD=everyone
# download transport mechanism (tcp or websockets)
DOWNLOAD_BROKER_TRANSPORT=tcp
Then run the 'start' command again to apply the changes:
python3 wis2box-ctl.py start
Check the logs of the wis2downloader to see if the connection to the new broker was successful:
docker logs wis2downloader
You should see the following log message:
...
INFO - Connecting...
INFO - Host: wis2training-broker.wis2dev.io, port: 1883
INFO - Connected successfully
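If the log output is long, a small script can check it for you. The Python sketch below scans log text for a "Host: ..., port: ..." line matching the expected broker, followed by "Connected successfully". It relies only on the message wording shown above, and the function name connected_to is hypothetical; if the log format differs in your version, adjust the strings accordingly.

```python
def connected_to(log_text, host, port):
    """Return True if the log shows a successful connection to host:port.

    Assumes the log wording shown in this section.
    """
    target = f"Host: {host}, port: {port}"
    seen_target = False
    for line in log_text.splitlines():
        if "Host:" in line:
            # Track whether the most recent connection attempt is ours.
            seen_target = target in line
        elif seen_target and "Connected successfully" in line:
            return True
    return False
```

You could feed it the text produced by the docker logs command used above.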
Set up new subscriptions
Now you will set up a new subscription to a topic to download cyclone track data from the WIS2 Training Broker.
Log in to the wis2downloader container:
python3 wis2box-ctl.py login wis2downloader
And execute the following command (copy-paste this to avoid typos):
wis2downloader add-subscription --topic origin/a/wis2/int-wis2-training/data/core/weather/prediction/forecast/medium-range/probabilistic/trajectory
Exit the wis2downloader container by typing exit.
Check the downloaded data
Wait until you see the downloads starting in the wis2downloader dashboard in Grafana.
Check that the data was downloaded by checking the wis2downloader logs again with:
docker logs wis2downloader
You should see a log message similar to the following:
[...] INFO - Message received under topic origin/a/wis2/int-wis2-training/data/core/weather/prediction/forecast/medium-range/probabilistic/trajectory
[...] INFO - Downloaded A_JSXX05ECEP020000_C_ECMP_...
Check the contents of the downloads directory again:
ls -R ~/wis2box-data/downloads
You should see a new directory named origin/a/wis2/int-wis2-training/data/core/weather/prediction/forecast/medium-range/probabilistic/trajectory containing the downloaded data.
Review the downloaded data
What is the file format of the downloaded data?
Click to reveal answer
The downloaded data is in BUFR format as indicated by the .bufr file extension.
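A file extension is only a hint. If you want to check the content itself, a BUFR message (edition 2 and later) starts with the four characters BUFR, followed by a 3-byte total length and a 1-byte edition number, and ends with the characters 7777. The Python sketch below looks for that signature in a byte string. The function bufr_summary is a hypothetical helper for illustration, and it inspects only the first message it finds.

```python
def bufr_summary(data):
    """Describe the first BUFR message found in 'data', or return None.

    Illustrative only: checks the 'BUFR' start marker, the length and
    edition fields of section 0, and the '7777' end marker.
    """
    start = data.find(b"BUFR")
    if start == -1 or len(data) < start + 8:
        return None
    total_length = int.from_bytes(data[start + 4:start + 7], "big")
    edition = data[start + 7]
    end = start + total_length
    # Smallest plausible message: 8-byte section 0 plus 4-byte end marker.
    complete = total_length >= 12 and data[end - 4:end] == b"7777"
    return {
        "offset": start,
        "length": total_length,
        "edition": edition,
        "complete": complete,
    }
```

To apply it to a downloaded file, read the file in binary mode and pass the bytes to the function.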
Next try to add another two subscriptions to download monthly surface temperature anomalies and global ensemble forecast data from the following topics:
origin/a/wis2/int-wis2-training/data/core/weather/prediction/forecast/medium-range/probabilistic/global
origin/a/wis2/int-wis2-training/data/core/climate/experimental/anomalies/monthly/surface-temperature
Wait until you see the downloads starting in the wis2downloader dashboard in Grafana.
Check the contents of the downloads directory again:
ls -R ~/wis2box-data/downloads
You should see the new directories corresponding to the topics you subscribed to, containing the downloaded data.
Conclusion
Congratulations!
In this practical session, you learned how to:
- use the 'wis2downloader' to subscribe to a WIS2 Broker and download data to your local system
- view the status of the downloads in the Grafana dashboard
- change the default configuration of the wis2downloader to subscribe to a different broker
- view the downloaded data on your local system