In this tutorial we'll collect Ordnance Survey data from the Downloads API with an automated download and extract process.
We will walk through how to fetch the OS Terrain50 Digital Elevation Model (DEM) dataset from the OS Downloads API using NodeJS and the command line, then unzip it and extract all of the .asc files into a single folder.
We will be using:
- NodeJS and npm, with the axios and extract-zip npm packages and the fs and path standard modules.
- Command line
Working with large datasets can be a challenge. With a few programming tools, however, we can radically improve the efficiency of collecting and manipulating these datasets.
We'll be using NodeJS to download and extract Ordnance Survey's Terrain50 Digital Elevation Model dataset from the OS Downloads API.
- Download zipped directory with the data.
- Unzip all contained zipped directories.
- Copy all .asc files into one folder, ready to be loaded into QGIS for our Shaded Relief Map tutorial.
This tutorial was created using NodeJS v14.1.0. If you don't already have Node installed, here is a great tutorial on installing Node and npm on Windows and Mac.
This tutorial will walk you through creating this code from scratch. If you would rather download, install and run it all, there are instructions for that at the bottom of this page.
First, we'll download the Terrain 50 dataset from the OS Downloads API. In your terminal, make a directory to store your project. Inside it we'll install the axios package from npm, which is used for HTTP requests, and extract-zip, which we'll use a bit later on for extracting zipped folders.
mkdir os-downloads-tutorial
cd os-downloads-tutorial
npm install axios extract-zip
If you're not familiar with Node, this npm install {package names} command will create a node_modules folder and download the packages there.
We'll be breaking our Node program into modules to keep it organised. For our first step - downloading the dataset - create a file called downloadFile.js in the os-downloads-tutorial directory.
We'll be using the fs module to write data to the disk as it downloads via an axios HTTP request.
const fs = require("fs");
const path = require("path");
const axios = require("axios");
const extract = require("extract-zip");
/* ============================================================
Function: Uses Axios to download file as stream using Promise
/downloadFile.js
============================================================ */
const download_file = (url, filename) =>
  axios({
    url,
    responseType: "stream"
  }).then(
    (response) =>
      new Promise((resolve, reject) => {
        response.data
          .on("error", (e) => reject(e)) // errors while reading the response
          .pipe(fs.createWriteStream(filename))
          .on("finish", () => resolve())
          .on("error", (e) => reject(e)); // errors while writing to disk
      })
  );
This function will fetch the resource at the url passed in, and write the file returned to disk as filename. We will call it from the function we'll export, adding in some error handling and visual feedback so the user knows the file is downloading. We'll also unzip the downloaded file into the targetdir.
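As an aside, Node's built-in stream.pipeline can express the same "pipe a stream to disk and wait for it to finish" idea. The sketch below is not part of the tutorial's code: saveStream is a hypothetical helper name, and an in-memory stream stands in for the axios response so that the example runs without a network connection.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { Readable, pipeline } = require("stream");
const { promisify } = require("util");

// promisify(pipeline) resolves once the write has finished and rejects
// if either the source or the destination stream reports an error
const pipelineAsync = promisify(pipeline);

// Hypothetical helper: write any readable stream to `filename`
const saveStream = (readable, filename) =>
  pipelineAsync(readable, fs.createWriteStream(filename));

// Demo: an in-memory stream stands in for `response.data`
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "stream-demo-"));
const demoFile = path.join(demoDir, "demo.txt");
const demo = saveStream(
  Readable.from(["first chunk\n", "second chunk\n"]),
  demoFile
).then(() => fs.readFileSync(demoFile, "utf8"));

demo.then((text) => console.log(text));
```

With axios, the readable passed to a helper like this would be response.data from a request made with responseType set to "stream".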
In the same file, below the download_file declaration, create the following function:
/* ============================================================
Download File
/downloadFile.js
============================================================ */
async function downloadFile(url, targetdir) {
  // Giving user ongoing feedback in the terminal:
  console.log("Download starting ...");
  let interval = setInterval(() => console.log("..."), 5000);
  try {
    let targetfile = path.resolve(targetdir + ".zip");
    // Wait until the file is fully downloaded
    await download_file(url, targetfile);
    // Now make the target directory, extract the zipped file into it, and delete the downloaded zipfile.
    // The fs *Sync methods block until they are done, so they need no await.
    fs.mkdirSync(targetdir, { recursive: true }); // <- no error if the directory already exists
    await extract(targetfile, { dir: path.resolve(targetdir) });
    fs.unlinkSync(targetfile);
    // Complete!
    console.log("Completed downloading files");
  } catch (error) {
    console.error(error);
  } finally {
    // Stop the progress messages whether or not the download succeeded
    clearInterval(interval);
  }
}
module.exports = downloadFile; // <- so we can import the function into another script
Before we import and call the code above, we'll also define a script that will unzip all the zipped folders in the downloaded zipfile.
Create a file called unzipAll.js. The function in it will accept one parameter - a path to a directory (zipped or not). We'll then collect the paths of all contained zipfiles, extract the contents, and delete the zipfiles. Since zipped directories can contain zipped sub-directories and files, we need to repeat this until all files ending in .zip are extracted and deleted.
// Import dependencies, including the extract-zip module we installed with npm
const fs = require("fs");
const path = require("path");
const extract = require("extract-zip");
const getFilePaths = require("./getFilePaths.js");
async function unzipAll(dir) {
  dir = path.resolve(dir);
  const isZip = (filepath) => path.extname(filepath) === ".zip";
  // Zipfiles that fail to extract are remembered and skipped, so the loop below cannot run forever
  const failed = new Set();
  // Create an array of all .zip file paths in the directory:
  let filepaths = getFilePaths(dir).filter(isZip);
  // Loop through filepaths, extracting and deleting each:
  while (filepaths.length > 0) {
    for (let file of filepaths) {
      let parsedpath = path.parse(file);
      let targetdir = path.resolve(parsedpath.dir);
      try {
        // Unzip the compressed file into the directory that contains it
        await extract(file, { dir: targetdir });
        fs.unlinkSync(file); // <- delete the .zip file
        console.log("Unzipped", parsedpath.base + ", deleted .zip file.");
      } catch (err) {
        console.error(err);
        failed.add(file);
      }
    }
    // If the extracted folders contained zipfiles, they'd be included here:
    filepaths = getFilePaths(dir).filter(
      (filepath) => isZip(filepath) && !failed.has(filepath)
    );
  }
}
// Since we'll be importing it into our main JS file, we export the function:
module.exports = unzipAll;
You'll notice that unzipAll.js requires another module of our own - getFilePaths.js. Create a file with that name in the same directory.
This function accepts a directory path string, then loops through the directory, building an array of all the paths to directories and files contained.
A note: this is a recursive function - one that calls itself. In this way it is able to move through the directory structure no matter its depth.
const fs = require("fs");
const path = require("path");
/** Retrieve file paths from a given folder and its subfolders. */
/* Big thanks to @darioblanco on https://gist.github.com/kethinov/6658166
for sharing this code!! */
const getFilePaths = (folderPath) => {
  // Edge case: if a .zip file is passed in, we'll return that ready to unzip
  if (
    fs.statSync(folderPath).isFile() &&
    path.parse(folderPath).ext === ".zip"
  ) {
    return [folderPath];
  }
  // An array of path strings for the contents of the directory
  const entryPaths = fs
    .readdirSync(folderPath)
    .map((entry) => path.join(folderPath, entry));
  // An array of all the files in the directory
  const filePaths = entryPaths.filter((entryPath) =>
    fs.statSync(entryPath).isFile()
  );
  // An array of all the directories in the directory
  const dirPaths = entryPaths.filter(
    (entryPath) => !filePaths.includes(entryPath)
  );
  // Recursively travel down the directory tree, concatenating the contents of each subdirectory
  // dirFiles contains all the files with a directory 'depth' greater than one.
  const dirFiles = dirPaths.reduce(
    (prev, curr) => prev.concat(getFilePaths(curr)),
    []
  );
  // Combine all the paths into a single array with the ES6 spread operator
  return [...filePaths, ...dirFiles];
};
module.exports = getFilePaths;
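To see the recursion at work without downloading anything, the sketch below builds a small throwaway directory tree and walks it. It uses a simplified variant of the idea rather than the module above: walk is a hypothetical name, and it relies on the withFileTypes option of fs.readdirSync instead of calling fs.statSync on every entry.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Simplified recursive walk (hypothetical name `walk`): returns every
// file path below `dir`, however deeply nested
const walk = (dir) =>
  fs.readdirSync(dir, { withFileTypes: true }).reduce((found, entry) => {
    const entryPath = path.join(dir, entry.name);
    return entry.isDirectory()
      ? found.concat(walk(entryPath)) // the function calls itself here
      : found.concat(entryPath);
  }, []);

// Build a throwaway tree: root/a.asc and root/sub/deeper/b.asc
const root = fs.mkdtempSync(path.join(os.tmpdir(), "walk-demo-"));
fs.mkdirSync(path.join(root, "sub", "deeper"), { recursive: true });
fs.writeFileSync(path.join(root, "a.asc"), "");
fs.writeFileSync(path.join(root, "sub", "deeper", "b.asc"), "");

// Paths relative to the root, sorted so the order is predictable
const relative = walk(root)
  .map((filepath) => path.relative(root, filepath))
  .sort();
console.log(relative);
```

Both files are found, including the one two levels down, because each subdirectory triggers another call to the same function.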
We now have the code to download the DEM data from the OS Downloads API and unzip all the contained files. After one more step, the data will be ready to bring into QGIS.
Let's put this all together in a new file, /app.js. We'll import the modules we need - Node's fs and path, along with the custom modules we wrote above.
We don't want to have to click through every single folder to select the .asc files we need to load into QGIS - and QGIS won't accept a directory containing loads of subdirectories and files of different types.
Fortunately, with Node - and the code we've already written - we can easily copy all the .asc files from their locations in the directory structure into one folder. (This could be done with many other programming languages like Python, bash, Ruby, C++ - the power of code.)
We will place all of this code inside the body of an asynchronous immediately-invoked function expression (IIFE), which you can see in /app.js.
By wrapping the function expression in parentheses, we can immediately invoke it (with or without arguments): (function (param) { /* body ...*/})(arg)
This pattern lets us use async / await syntax, which makes it clean and easy to write code that pauses until a long-running process, like downloading a large file, has finished.
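A minimal sketch of the pattern, separate from the tutorial's app.js: the wait helper and the steps array are hypothetical names used only to show the order in which things happen.

```javascript
// A promise that resolves after `ms` milliseconds, standing in for a slow task
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const steps = [];

// The async IIFE: defined and invoked in a single expression
const finished = (async () => {
  steps.push("start");
  await wait(50); // this function pauses here until the promise resolves
  steps.push("after wait");
})();

// Code outside the IIFE keeps running while the function above is paused
steps.push("outside");

finished.then(() => console.log(steps));
```

The logged order is start, outside, after wait: only the code inside the async function waits, which is why everything that depends on the download needs to live inside it.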
// app.js
const fs = require("fs").promises;
const path = require("path");
const getFilePaths = require("./getFilePaths.js");
const downloadFile = require("./downloadFile.js");
const unzipAll = require("./unzipAll.js");
// We use an IIFE because `await` can only work inside an `async` function.
(async () => {
  const terrain50url =
    "https://osdatahubapi.os.uk/downloads/v1/products/Terrain50/downloads?area=GB&format=ASCII+Grid+and+GML+%28Grid%29&redirect";
  const targetDir = "./working_data";
  // Await download and unzip:
  await downloadFile(terrain50url, targetDir);
  await unzipAll(targetDir);
  // Now we have a directory with several subdirectories containing, among other files, .asc grids representing elevations of 50m raster cells.
  // Let's extract an array of all paths then filter .asc files in the NG grid square.
  // Splitting on path.sep keeps the "ng" folder check working on Windows as well as Mac:
  let allPaths = getFilePaths(targetDir);
  let ascPaths = allPaths.filter(
    (filepath) =>
      path.parse(filepath).ext === ".asc" &&
      filepath.split(path.sep).includes("ng")
  );
  // We'll create a directory to hold our .asc files (no error if it already exists)
  let ascTarget = path.resolve(targetDir, "asc_skye/");
  await fs.mkdir(ascTarget, { recursive: true });
  // Then loop through and copy each file into this asc_skye folder
  for (let i = 0; i < ascPaths.length; i++) {
    let parsedpath = path.parse(ascPaths[i]);
    let target = path.resolve(ascTarget, parsedpath.base);
    await fs.copyFile(ascPaths[i], target);
    console.log("Copied", parsedpath.base);
  }
  console.log("Completed copying .asc files!");
})(); // <- Invoke the function immediately!
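The filter step in app.js can also be pulled out into a small helper and tried on a few sample paths. This is only a sketch: isAscInSquare is a hypothetical name, the sample paths are made up for illustration rather than taken from the dataset, and the helper accepts both / and \ separators so the same check behaves identically on Windows and Mac.

```javascript
const path = require("path");

// Hypothetical helper: true when `filepath` ends in .asc and one of its
// folders is named `square` (for example "ng"), ignoring letter case
const isAscInSquare = (filepath, square) => {
  const parts = filepath.split(/[\\/]/); // accept both / and \ separators
  const folders = parts.slice(0, -1).map((part) => part.toLowerCase());
  return (
    path.extname(filepath).toLowerCase() === ".asc" &&
    folders.includes(square.toLowerCase())
  );
};

// Illustrative paths only, not real file names from the dataset
const samples = [
  "working_data/data/ng/tile_a.asc",
  "working_data\\data\\ng\\tile_b.asc",
  "working_data/data/nh/tile_c.asc",
  "working_data/data/ng/tile_a.prj",
];
const matches = samples.filter((filepath) => isAscInSquare(filepath, "ng"));
console.log(matches);
```

Only the first two sample paths match: the third sits in a different grid square folder and the fourth has the wrong extension. Swapping "ng" for another folder name would select a different grid square.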
We now have a complete Node program and modules, which we can execute by running node app.js on the command line in our os-downloads-tutorial directory.
This may take a little time as the OS Terrain 50 dataset is 161MB. Once this is completed, there should be a new folder, working_data/asc_skye, with the .asc files, ready to work with in QGIS. If you want to complete the tutorial and create a shaded relief map with the DEM data downloaded, find the tutorial here.
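Before opening QGIS, it can be reassuring to check what actually landed in the output folder. The sketch below counts the files in a folder by extension. countByExtension is a hypothetical helper, and the demo runs against a throwaway folder with made-up file names; after a real run you would point it at working_data/asc_skye instead.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: count the files directly inside `dir`, grouped by extension
const countByExtension = (dir) =>
  fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .reduce((counts, entry) => {
      const ext = path.extname(entry.name).toLowerCase() || "(none)";
      counts[ext] = (counts[ext] || 0) + 1;
      return counts;
    }, {});

// Demo on a throwaway folder containing two .asc files and one .txt file
const checkDir = fs.mkdtempSync(path.join(os.tmpdir(), "asc-check-"));
["a.asc", "b.asc", "notes.txt"].forEach((name) =>
  fs.writeFileSync(path.join(checkDir, name), "")
);
const counts = countByExtension(checkDir);
console.log(counts);
```

If the copy step worked as intended, the only extension reported for the asc_skye folder should be .asc.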
If you'd rather just download and run the code we described in the tutorial, you can clone the repository, install the npm packages and run it using the following commands:
git clone https://github.com/OrdnanceSurvey/os-data-hub-tutorials.git
cd os-data-hub-tutorials/web-development/automated-open-data-download/code
npm install
node app.js
If all goes well you'll see the console printing statements as the download and extract process starts!
Let us know if you automate a process fetching, manipulating, analysing or storing OS data - we'd love to know. Tweet at @OrdnanceSurvey and tag #OSDeveloper.