WGET and CURL - Direct command line tools

wget and curl are command-line tools that send a request directly from a terminal or a script.

 

WGET

To make a TAP request with wget:

  1. Take the HTTPS form of the request. Data requests can be used as they are; metadata requests must be URL encoded (see the metadata page).

  2. Put it in double quotes (make sure 'smart quotes' are not enabled).

  3. Add one of the following to the beginning:

wget --content-disposition

or

wget -O myfilename.tgz (or myfilename.csv, to match the format requested)

The --content-disposition option names the downloaded file with the name and extension supplied by the server. The -O option (capital letter O, as in Oscar) names the file exactly as you specify, so take care that the extension matches the content returned.

For example, a data request will return a .tgz file:

wget -O myDataRequest.tgz "https://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&DATASET_ID=C1_CP_FGM_SPIN&START_DATE=2003-03-03T00:00:00Z&END_DATE=2003-03-05T00:00:00Z"

While a metadata request will return the format requested:

wget -O myMetadataRequest.csv "https://csa.esac.esa.int/csa-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=CSV&QUERY=SELECT+dataset_id,measurement_types+FROM+csa.v_dataset+WHERE+measurement_types+like+'%25Electric_Field%25'"
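The URL encoding used in the metadata request above can be scripted. The following is a minimal sketch, not an official tool: encode_adql is a hypothetical helper name, and it only handles the characters most likely to appear in a single-line ADQL query (%, +, &, # and spaces). Other special characters would need their own rules.

```shell
# Hypothetical helper: URL encode a single-line ADQL query.
# '%' must be replaced first so that later replacements are not encoded twice.
encode_adql() {
    printf '%s' "$1" | sed -e 's/%/%25/g' \
                           -e 's/+/%2B/g' \
                           -e 's/&/%26/g' \
                           -e 's/#/%23/g' \
                           -e 's/ /+/g'
}

QUERY="SELECT dataset_id,measurement_types FROM csa.v_dataset WHERE measurement_types like '%Electric_Field%'"
ENCODED_QUERY=$(encode_adql "$QUERY")

# Assemble the synchronous request URL and print it for inspection.
REQUEST_URL="https://csa.esac.esa.int/csa-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=CSV&QUERY=${ENCODED_QUERY}"
echo "$REQUEST_URL"
```

The printed URL can then be passed to wget in double quotes, for example wget -O myMetadataRequest.csv "$REQUEST_URL".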

Note that wget has a default read timeout of 900 seconds (15 minutes), so a complex request may time out. In this case, it is safer to use an asynchronous request, as detailed below.

 

Logging in for asynchronous request

Asynchronous requests require a login (register if you don't have a login) and for wget this means obtaining a cookie file. Use the following syntax replacing COOKIEFILE with your preferred path and filename, and YOURUSERID and YOURPASSWORD appropriately:

wget --keep-session-cookies --save-cookies COOKIEFILE --post-data 'username=YOURUSERID&password=YOURPASSWORD' "https://csa.esac.esa.int/csa-sl-tap/login" 

Note that if your password contains special characters, you might need to replace those characters with the URL encoded equivalent, e.g., replace & with %26
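One way to avoid deciding which characters need replacing is to percent-encode every byte of the password. The sketch below assumes a POSIX shell with od, tr and sed available; encode_all is a hypothetical helper name and 'a&b' is a made-up stand-in for a real password.

```shell
# Hypothetical helper: percent-encode every byte of a string.
# od prints each byte as two hex digits, tr strips the whitespace,
# and sed puts a '%' in front of every pair of digits.
encode_all() {
    printf '%s' "$1" | od -v -An -tx1 | tr -d ' \t\n' | sed 's/\(..\)/%\1/g'
}

ENCODED_PASSWORD=$(encode_all 'a&b')
echo "$ENCODED_PASSWORD"

# The encoded value would then replace YOURPASSWORD in the login command:
#   --post-data "username=YOURUSERID&password=${ENCODED_PASSWORD}"
```

Encoding every byte makes the string longer than necessary, but it is still valid form data and avoids maintaining a list of special characters.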

Once you have the cookie file (saved as cookies.txt in the examples below), you can make your asynchronous request by loading it and adding RETRIEVAL_ACCESS=DEFERRED, for example:

wget --load-cookies cookies.txt "https://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&RETRIEVAL_ACCESS=DEFERRED&DATASET_ID=C1_CE_WBD_WAVEFORM_BM2_CDF&START_DATE=2020-09-02T04:20:00Z&END_DATE=2020-09-02T06:10:00Z&delivery_interval=TenMin&delivery_format=cdf" -O request_response.xml

Once the request has been made, an email will be sent to the registered address with a link to the data itself.

 

For more information on using scripted access to this data link, see the Asynchronous Data Requests page.

To check the status of the job, request the URL in the <uws:parameter id="email_base_url"> field, for example:

wget --load-cookies cookies.txt "https://csa.esac.esa.int/csa-sl-tap/tap/async/1651077797541OPE" -O request_response.xml

Once the phase is COMPLETED, the data can be downloaded from the URL in the <uws:result> field, for example:

wget -O myDataRequest.tgz --load-cookies cookies.txt "https://csa.esac.esa.int/csa-sl-tap/tap/async/1651077797541OPE/results/hmiddlet1651077797557"
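The status check above can be repeated in a script until the job finishes. This is a rough sketch under several assumptions: that the phase appears in the response as a <uws:phase> element on a single line (as in the standard UWS job format), that ERROR and ABORTED are the failure phases, and that a 60-second pause between checks is acceptable. The names job_phase and wait_for_job, and the sample file, are hypothetical.

```shell
# Hypothetical helper: print the job phase found in a saved XML response.
job_phase() {
    sed -n 's|.*<uws:phase>\([A-Z_]*\)</uws:phase>.*|\1|p' "$1" | head -n 1
}

# Hypothetical helper: poll the job URL until the phase is COMPLETED.
#   $1 = job URL (from the email_base_url field), $2 = cookie file
wait_for_job() {
    while :; do
        wget -q --load-cookies "$2" -O request_response.xml "$1" || return 1
        phase=$(job_phase request_response.xml)
        echo "phase: $phase"
        case "$phase" in
            COMPLETED) return 0 ;;
            ERROR|ABORTED) return 1 ;;   # assumed failure phases
        esac
        sleep 60                         # assumed polling interval
    done
}

# Made-up, trimmed-down response used only to exercise job_phase offline.
SAMPLE="${TMPDIR:-/tmp}/csa_sample_response.xml"
cat > "$SAMPLE" <<'EOF'
<uws:job>
  <uws:jobId>1651077797541OPE</uws:jobId>
  <uws:phase>EXECUTING</uws:phase>
</uws:job>
EOF

job_phase "$SAMPLE"
```

A real run would call wait_for_job with the job URL and cookies.txt, then download the data from the URL in the <uws:result> field as shown above.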

 

CURL

curl is an alternative to wget. It can download files from HTTP requests, and it can also print the results of a metadata request to the screen, which is handy for quick queries. Unlike wget, it does not have a default timeout.

To print metadata results to screen, simply put the request in double quotes and put curl on the front:

curl "https://csa.esac.esa.int/csa-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+dataset_id+FROM+csa.v_dataset+WHERE+experiments='PEACE'"

To write that metadata to a file, add --output <filename> to the command, taking care to match up the filename extension to the file format requested, i.e., in this case, .json.

curl --output metadata.json "https://csa.esac.esa.int/csa-sl-tap/tap/sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+dataset_id+FROM+csa.v_dataset+WHERE+experiments='PEACE'"
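curl can also do the URL encoding itself, so the ADQL query can be written with ordinary spaces. The sketch below uses curl's -G and --data-urlencode options, which append the encoded parameters to the URL as a query string. The function name csa_adql_query is hypothetical, and the function is only defined here, not called, because a call needs network access.

```shell
# Hypothetical helper: run a synchronous ADQL query and print the result.
#   $1 = ADQL query in plain text
#   $2 = output format (defaults to json)
csa_adql_query() {
    curl -G "https://csa.esac.esa.int/csa-sl-tap/tap/sync" \
        --data-urlencode "REQUEST=doQuery" \
        --data-urlencode "LANG=ADQL" \
        --data-urlencode "FORMAT=${2:-json}" \
        --data-urlencode "QUERY=$1"
}

# Example calls (need network access):
#   csa_adql_query "SELECT dataset_id FROM csa.v_dataset WHERE experiments='PEACE'"
#   csa_adql_query "SELECT dataset_id FROM csa.v_dataset WHERE experiments='PEACE'" > metadata.json
```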

The same method saves the result of a data request to a .tgz file:

curl --output myRequest.tgz "https://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&DATASET_ID=C1_CP_FGM_SPIN&START_DATE=2003-03-03T00:00:00Z&END_DATE=2003-03-05T00:00:00Z"

 

Logging in for asynchronous request

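The curl version of the authentication syntax is shown below (replace COOKIEFILE, YOURUSERID and YOURPASSWORD). Note that the -k option tells curl to skip TLS certificate verification, so it can be omitted if the connection works without it: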

curl -k -c COOKIEFILE -X POST -d username=YOURUSERID -d password=YOURPASSWORD -L https://csa.esac.esa.int/csa-sl-tap/login

The asynchronous request is then made with the cookie file and RETRIEVAL_ACCESS=DEFERRED:

curl -b COOKIEFILE -L "https://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&RETRIEVAL_ACCESS=DEFERRED&DATASET_ID=C1_CE_WBD_WAVEFORM_BM2_CDF&START_DATE=2020-09-02T04:20:00Z&END_DATE=2020-09-02T06:10:00Z&delivery_interval=TenMin&delivery_format=cdf" -o response.xml

The returned file is the XML response discussed on the Asynchronous Data Requests page. 

To refresh the XML file and check whether the job is complete, request the URL in the <uws:parameter id="email_base_url"> field (cookies.txt here is the cookie file saved at login), for example:

curl -k -b cookies.txt -L https://csa.esac.esa.int/csa-sl-tap/tap/async/1651076536218OPE -o response.xml

Once the phase is COMPLETED, use the URL in the <uws:result> field to retrieve the data, for example:

curl -k -b cookies.txt -L --output myRequest.tgz "https://csa.esac.esa.int/csa-sl-tap/tap/async/1651076536218OPE/results/hmiddlet1651076536234"
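For the curl login step, the manual replacement of special characters in the password can be avoided by letting curl encode the form fields. The sketch below keeps the options of the authentication command above and only swaps -d for --data-urlencode. The function name csa_login is hypothetical, and the function is only defined here, not called, because a call needs network access and a registered account.

```shell
# Hypothetical helper: log in and save the session cookie.
#   $1 = user id, $2 = password in plain text, $3 = cookie file to write
csa_login() {
    # -k, -X POST and -L are kept from the authentication command above;
    # --data-urlencode encodes each value before it is sent.
    curl -k -c "$3" -X POST \
        --data-urlencode "username=$1" \
        --data-urlencode "password=$2" \
        -L "https://csa.esac.esa.int/csa-sl-tap/login"
}

# Example call:
#   csa_login YOURUSERID 'YOURPASSWORD' cookies.txt
```

Single quotes around the password stop the shell from interpreting characters such as & or $ before curl sees them.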