Note: this app is work in progress as of 28 Nov 2023
This app has one very specific purpose:
If you are storing backup files generated by Microsoft SQL Server in Azure storage containers (blob storage only), you can use this utility to enforce a fairly sophisticated backup retention policy. In other words, this utility will DELETE backup files from your Azure Sorage account automatically, whenever executed, according to the rules you'll define in the app settings. How to schedule its execution is up to end-users, in our case, this utility is (/was originally) deployed on an "always on" Azure Virtual Machine and executed as a scheduled task.
- As this utility is dedicated to Microsoft SQL Server backup files, it only considers files with ".bak" and ".trn" extensions. All other files are ignored.
- It allows/requires to define an individual retention policy for each individual database you are backing up to the Azure cloud. (See warning about filenames below!)
This utility implements 3 paramaters for backup retention:
RetainAllInDays
. This parameter defines a period of time from the execution time backwards, for which all SQL backup files will kept. This included ".bak" and ".trn" files, while longer retention periods will not retain (will delete) ".trn" files.WeeksRetention
. This parameter defines a period of time in number of weeks, starting from the end of the "retain all" period.MonthsRetention
. This parameter defines a period of time in number of months, starting from the end of the "retain all" period.
In broad terms, the utility will thus:
- Keep all files (".bak" and ".trn") for the "RetainAll" period.
- And also delete all ".trn" files which are older than that.
- Keep the newest file that exists for each "single week" period, starting after the "RetainAll" period, up to the number of weeks specified.
- Thus, if you specifiy 5 days for "RetainAll", and then 3 for the "WeeksRetention" at any given time, the oldest "weekly file" you may be storing will be from between (7*2) + 5 days and (7*3) + 5 days ago.
- Keep the newest file that exists for each "single month" period, starting after the "RetainAll" period, up to the number of weeks specified.
- Thus, if you specifiy 5 days for "RetainAll", and then 2 for the "MonthsRetention" at any given time, the oldest "monthly file" you may be storing will be from between one month + 5 days and two months + 5 days ago.
- If both
WeeksRetention
andMonthsRetention
are bigger than zero, the two "ranges" will overlap, meaning that the same file(s) retained because of the "weekly" setting will be retained also because of the "monthly" setting.
All parameters used by this utility are defined via the appsettings.json
file, included in this project, with sample data. Thus, in order to use this utility, you need to edit the appsettings.json
, to supply the correct parameters (including the SAS string mentioned below, preferably "url encoded").
This utility will need to enumerate the files found in your Azure Blob Storage, and then delete some of such files. Therefore, it has to obtain the correct authorisations over said storage acoount. To this aim, you need to obtain a Shared Access Signature (SAS) with (at least) Read
, List
and Delete
permissions. This SAS needs to be produced at the level of the whole Azure Storage Account, not at the level of individual containers.
You can obtain a SAS via the Azure portal, the Microsoft Azure Storage Explorer, and/or Powershell, Azure CLI, etc... Our recommendation is to use a (relatively) "long lasting" SAS with only
Read
,List
andDelete
permissions, and to guard its value with extreme care.
Moreover, this utility is designed to work for "Blob Containers" and has not been tested against any other kind of Azure storage container.
To function properly, the utility needs to write to a log file, thus you will need to ensure that the account under which the app will run has write access to the folder [app root]\LogFiles
.
This app deletes files, that's its main purpose. Deleting files is dangerous, what if it deletes the wrong ones? For this reason, this app comes with a "WhatIf" option. Set it to "true" and the app will not delete anything, just report about what it would have deleted. So, before running this utility "for real", it is highly recommended to use it in "WhatIf/simulation" mode, so to double check that it will behave as intended.
Please also note that we offer ZERO assurances that this software is bugs free! If using this app will result in unwanted consequences, the EPPI Centre and UCL accept no liabilities.
These are divided in two parts, General
and BackedUpDatabases
.
StorageURI
[Required]: this contains the https address of your blob storage account, as inhttps://[storage_account_name].blob.core.windows.net/
(please note the ending forward slash, which is required)SAS
[Required]: the URL-encoded Shared Access Signature (SAS) that provides access to said storage account.VerboseLogging
Default value istrue
: If true, the app will log to file (and console) every single thing it does, and then some. Otherwise it will log far less details.DoNotUseBlobTimestamps
Default value istrue
: See below, this is a risky option (if set tofalse
) which allows to use the timestamp of the blob files to decide what to keep and what to delete. Might produce unexpected results, as the blob timestamps refer to the blob files themselves, but the backup files might be older (might have been generated elsewhere, and uploaded to the cloud later).AsIf
Default value istrue
: enables simulation mode. If set totrue
the app will not make any changes and just report back what files would be deleted/retained. It is highly recommended to start by running this utility in this "testing mode" with VerboseLogging, so to check what it will do exactly. If you'll find that it does plan to do what you wanted, then it's OK to schedule its execution withAsIf
set tofalse
.
It's possible to ask this app to send emails automatically. This relies on having access to a suitable SMTP server which allows to authenticate via simple Username/Password authentication (which is becoming rare!). App will check if email settings look correct upon starting.
Automated emails will generally read the current log file and send its contents as the body of the email - please see also the VerboseLogging
setting, which controls how many details are logged. If errors are encountered in the early initialisation phases (before the log file is created, or upon creating the log file), then the utility stops and, if email settings are provided/enabled, it will send the error message.
The configuration values are set in the General
section, if EmailMode
[Optional] is provided and requires to send emails, then all other values are required:
EmailMode
[Optional]: allowed values are "OnError", "Always" or "Never". Putting anything else in this field will make the app behave as if "Never" was used.SMTP
: this is used to indicate the Fully Qualified Domain Name of the SMTP server to use.SMTPUser
: is the username to provide for SMTP authentication.SMTPpassword
: the relative password.mailFrom
: the value to use in the from field. It's likely you will need to use the email address of theSMTPUser
indicated above.mailTo
: who to send emails to? This can be a single email address, or a list of email addresses, separated by,
(commas).
Warning: setting up automatic emails might make you feel "safe", but it can't guarantee that you'll always be alerted if something goes wrong. For example, the password might expire, so emailing would fail, and thus, you'd get no warning about failures at all! But many other things might prevent automated emails from reaching you, so you should consider this functionality as a useful shortcut, but not as a system that allows to stop checking backups altogether.
This is a JSON array of objects, each defining the retention policy for a single database. Elements for each database are:
BackupsContainer
[Required]: the name of the blob container where your backups are stored.DatabaseName
[Required, case-insensitive]: the name of the database covered by this "policy".IsStriped
[Boolean, defaults tofalse
]: if set totrue
the app will expect that 1 backup set is comprised by multiple numbered files, where their "number" is the last number appearing in the filename, like in:DBName_backup_2023_11_24_stripe1.bak
,DBName_backup_2023_11_24_stripe2.bak
, [...].- App can work for up to 99 stripes, as it will recognise filenames that use both
stripe01
andstripe1
convention. Note that the only requirment is that the numbers which change need to be the last numbers that change (from within a single backup set) the filename doesn't need to contain the word "stripe". - On the other hand, if the filenames contain a very detailed timestamp, to the point where
[long name with date-time]file1.bak
might have a different "date-time" to the last stripe in the set[long name with later date-time]file25.bak
, then this utility will fail to identify such files as belonging to the same backup-set. This is because this utility does not read the file headers, and considers only filenames for identifying stripes.
- App can work for up to 99 stripes, as it will recognise filenames that use both
RetainAllInDays
,WeeksRetention
andMonthsRetention
[All required, all integers equal or greater than zero], please see above, these numbers define the actual retention policy for the database indicated.
- For a given "database setting", only files with a filename that begin with the database name (case-insensitive), followed by an underscore (
_
) will be processed. This means that it's safe to store backup files from multiple databases in the same container, even if the retention policy for each database is different, but only if (Warning!) your database names don't start in similar ways. For example, if you are backing up a databases called "Sales" and another one called "Sales_Archive", keeping the backups in the same container will not work. Files belonging to "Sales_Archive" will be treated twice, once for their own "policy", but also once for the policy defined for "Sales", leading to very undesirable results! - The SAS needs to be produced at the level of the whole Azure Storage Account and will NOT work if it is a SAS providing access to a single container. If you rotate the access keys, or if the SAS expires (All SASs have an expiry date), this utility will stop working.
- App will always look for a date embedded in the filename, in the format of "yyyy_MM_dd" (like
_2023_11_24_
), if that attempt fails, it will consider theDoNotUseBlobTimestamps
flag:- if
true
, files for which a date wasn't found in the filename will be always kept (by placing them in the keep-all set, these files will appear with the spoof date of "now + 1 day") - otherwise, if the
DoNotUseBlobTimestamps
flag is set to false, then theCreatedOn
Date will be used for the Blob File (which may or may not be the creation date of the backup file itself(!!), depending on how the file was produced and uploaded to blob storage).
- if
- No other date-formats in the filename are supported!
- "*.trn" files are only kept for the "retain all" period - if such files are found and their date is older than the "keep all" period, they will be deleted.
IsStriped
only applies to "*.bak" files! If you are saving transaction logs in stripes, then this utility cannot work for you (not without changing the source code).- When multiple "*.bak" files (or file-sets) are present in a weekly/monthly range, the newest file/set is kept, older files are deleted. This ensures the app can operate robustly, as it doesn't rely on backups being taken on a particular day of the week or month.
- If the App fails to create a log file upon starting, it will stop working and might also fail to send automated emails!