diff --git a/docs/reports/2024-01-03-mongodb-migration-off1.md b/docs/reports/2024-01-03-mongodb-migration-off1.md
index 8db214cf..0b8f1466 100644
--- a/docs/reports/2024-01-03-mongodb-migration-off1.md
+++ b/docs/reports/2024-01-03-mongodb-migration-off1.md
@@ -114,7 +114,7 @@ rsync 10.0.0.3:"/etc/logrotate.d/mongo*" /zfs-hdd/pve/subvol-102-disk-0/opt/open
 ```

-Linked the config:
+Linked the config:
 ```bash
 mv /etc/mongod.conf{,.dist}
 ln -s /opt/openfoodfacts-infrastructure/confs/mongodb/mongod.conf /etc/mongod.conf
@@ -128,13 +128,96 @@ And check `systemctl start mongod`

 ### setting up cron jobs

+On off3 we launched some mongodb scripts (namely `refresh_products_tags.js`), but they have now disappeared from the Product Opener repository as we don't need them any more thanks to the openfoodfacts-query project (see [openfoodfacts-server commit 90180247f](https://github.com/openfoodfacts/openfoodfacts-server/commit/90180247fe23cedcdcc32249fdb9d7b25bf6051d)).
+
+Still, we need the product_tags collections for obf, opf and opff.
+
+**TODO:**
+The right solution is to add mongo shell on their respective containers and call `refresh_products_tags.js` with `gen_feeds_daily_*.sh`.

 ## Migrating data for a test

-We use rsync to get data from off3.
+### trying with a simple rsync (does not work)
+
+We stop mongodb in the mongodb container.
+
+We use rsync from off2 to get data from off3:
+
+```bash
+# beware we are on zfs-nvme dataset
+# we also map off and mongodb username and groupname to 100108 and 100116
+# (corresponding to mongodb user/group in mongodb container)
+time rsync -a --delete-delay --usermap=mongodb:100108,off:100108 --groupmap=mongodb:100116,off:100116 10.0.0.3:/mongo/db/ /zfs-nvme/pve/subvol-102-disk-0/db/
+```
+(it took 15 min).
+
+
+Then I restarted mongodb.
+
+Got an error `variable MONGODB_CONFIG_OVERRIDE_NOFORK == 1, overriding \"processManagement.fork\" to false`.
+
+This [thread is interesting](https://stackoverflow.com/a/76293801/2886726).
+With `systemctl cat mongod` I can see that `MONGODB_CONFIG_OVERRIDE_NOFORK` is set on the new container and not on the old VM.
+
+See also the [MongoDB doc stating](https://www.mongodb.com/docs/manual/reference/configuration-options/#file-format):
+
+> The Linux package init scripts included in the official MongoDB packages depend on specific values for systemLog.path, storage.dbPath, and processManagement.fork. If you modify these settings in the default configuration file, mongod may not start.
+
+The MONGODB_CONFIG_OVERRIDE_NOFORK was introduced by https://jira.mongodb.org/browse/SERVER-74845
+
+Strangely, in mongod.conf we keep the default value, which should be false...
+
+Finally I [saw in the code](https://github.com/mongodb/mongo/blob/r4.4.27/src/mongo/db/server_options_server_helpers.cpp#L132) that the message was just a log / warning, not an error…
+
+But in `/var/log/mongodb/mongod.log` I found:
+```
+"msg":"ERROR: Cannot write pid file to {path_string}: {errAndStr_second}","attr":{"path_string":"/var/run/mongodb/mongod.pid","errAndStr_second":"No such file or directory"}
+```
+
+I removed the specific directive for pid file in mongod.conf and restarted mongodb.
+
+Now I can see:
+```json
+{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"W", "c":"STORAGE", "id":22271, "ctx":"initandlisten","msg":"Detected unclean shutdown - Lock file is not empty","attr":{"lockFile":"/mongo/db/mongod.lock"}}
+{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"I", "c":"STORAGE", "id":22270, "ctx":"initandlisten","msg":"Storage engine to use detected by data files","attr":{"dbpath":"/mongo/db","storageEngine":"wiredTiger"}}
+{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"W", "c":"STORAGE", "id":22302, "ctx":"initandlisten","msg":"Recovering data from the last clean checkpoint."}
+...
+file:WiredTigerHS.wt, hs_access: __wt_block_read_off, 286: WiredTigerHS.wt: potential hardware corruption, read checksum error for 4096B block at offset 36864: block header checksum of 0xe91e3800 doesn't match expected checksum of 0xd10d51e"}}
+...
+lot of errors
+...
+```
+so this is a problem with the copy.
+
+Indeed it is [stated by the documentation](https://www.mongodb.com/docs/manual/core/backups/#back-up-with-cp-or-rsync):
+> you can copy the files directly using cp, rsync, or a similar tool. Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the files.
+
+
+I'm not able to stop mongodb for just a test, so I will first set up other things like stunnel.
+
+## Setting up stunnel
+
+see [2024-01-04 Setting up stunnel](./2024-01-04-setting-up-stunnel.md)
+
+## Cron job for tags collection generation on opf / opff / obf
+
+Before, we directly had a script in the crontab of the mongodb server to generate the tags collections.
+
+But it disappeared from the off main version.
+
+We will now launch those tasks from the different servers, which is more logical and flexible.
+
+On each container: off, opf and opff

-Then I
+* install mongosh: `sudo apt install mongodb-mongosh`
diff --git a/docs/reports/2024-01-04-setting-up-stunnel.md b/docs/reports/2024-01-04-setting-up-stunnel.md
new file mode 100644
index 00000000..31b375bd
--- /dev/null
+++ b/docs/reports/2024-01-04-setting-up-stunnel.md
@@ -0,0 +1,88 @@
+
+## 2024-01-04 Setting up stunnel
+
+Robotoff and openfoodfacts-query need access to mongodb to get data. But we want to secure access to it.
+
+## Setting up stunnel on off1 proxy
+
+On the reverse proxy container:
+* installed stunnel package (which is a proxy to stunnel4)
+* I had to override the systemd unit file to add RuntimeDirectory, Group and RuntimeDirectoryMode so that the pid file could be created correctly by users of group stunnel4
+* created `/etc/stunnel/off.conf` - we will only have one instance for many services, no need of specific services
+  * run in foreground, so that systemd handles the process
+  * specify user and group stunnel4
+  * specify pid file according to systemd unit RuntimeDirectory
+  * added a mongodb service, for first tests.
+* created `/etc/stunnel/psk/mongodb-psk.txt`
+  * and made it private `chmod -R go-rwx /etc/stunnel/psk/`
+  * To create a password I used `pwgen 32` on my laptop
+
+* enable and start service:
+  ```bash
+  systemctl enable stunnel@off.service
+  systemctl start stunnel@off.service
+  ```
+
+All (but the psk files which are not to be committed) is part of [commit d797e7c73](https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/d797e7c7329c3c789ff21dc63ebbf1753aa4a376)
+
+
+
+Note: `dpkg-query -L stunnel4` helped me locate `/usr/share/doc/stunnel4/README.Debian`, which I read to better understand how the systemd integration works. Also `/usr/share/doc/stunnel4/examples/stunnel.conf-sample` is a good read for the global section, while a configuration example with PSK is available here: https://www.stunnel.org/auth.html
+
+## Setting up stunnel on ovh1 proxy
+
+On the reverse proxy container:
+* installed stunnel package (which is a proxy to stunnel4)
+
+* This is an older version of the package than on the other proxy, so systemd is not integrated; I added the systemd units myself
+
+* created `/etc/stunnel/off.conf` - we will only have one instance for many services, no need of specific services
+  added a mongodb service, for first tests.
+* created `/etc/stunnel/psk/mongodb-psk.txt`
+  * and made it private `chmod -R go-rwx /etc/stunnel/psk/`
+  * with the user / password created on off1 proxy
+* enable and start service:
+  ```bash
+  systemctl enable stunnel@off.service
+  systemctl start stunnel@off.service
+  ```
+
+All (but the psk files which are not to be committed) is part of [commit 086439230](https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/086439230a41f4d94755276610cbab838ad96f4a)
+
+
+## Testing stunnel for mongodb
+
+On each server, I can use `journalctl -f -u stunnel@off` to monitor activity.
+
+On off staging VM:
+```
+cd /home/off/mongo-dev
+sudo -u off docker-compose exec mongodb bash
+mongo 10.1.0.101
+> db.hostInfo()["system"]["hostname"]
+mongodb
+```
+
+and (after a lot of tribulations…) it worked !!!
+
+## Note: problem reaching 10.0.0.3 from off2 proxy (not useful right now)
+
+At a certain point, by mistake, I used the 10.0.0.3 server as the mongodb target.
+
+But from the proxy this is unreachable… this is because there is no route to this host.
+
+To add the route we can do
+```bash
+ip route add 10.0.0.0/24 dev eth0 proto kernel scope link src 10.1.0.101
+```
+To make it permanent we can add an executable `/etc/network/if-up.d/off-servers-route.sh` file, with:
+```bash
+#!/bin/bash
+
+if [[ $IFACE == "eth0" ]]; then
+    # we want to access off1 and off2 from this machine
+    ip route add 10.0.0.0/24 dev $IFACE proto kernel scope link src 10.1.0.101
+fi
+```
+
+But as right now this is not needed (new mongo is in 10.1.0.102 which is reachable), **I didn't do it**.
diff --git a/docs/stunnel.md b/docs/stunnel.md
new file mode 100644
index 00000000..222e4cd6
--- /dev/null
+++ b/docs/stunnel.md
@@ -0,0 +1,32 @@
+# Stunnel
+
+Stunnel enables us to secure (TCP) connections between distant servers.
+
+It encrypts traffic with openssl.
+
+It is installed on the reverse proxy of off2 and ovh1.
+
+Illustration:
+```mermaid
+sequenceDiagram
+    box Network 1
+    Participant Cli as Client
+    Participant Scli as Stunnel Server 1 (client)
+    End
+    box Network 2
+    Participant Sserv as Stunnel Server 2 (server)
+    Participant Mongo as MongoDB Server
+    End
+    Note over Scli,Sserv: The wild web
+    Cli->>Scli: Query MongoDB
+    Scli->>Sserv: Encrypted
+    Sserv->>Mongo: Query MongoDB
+    Mongo->>Sserv: MongoDB response
+    Sserv->>Scli: Encrypted
+    Scli->>Cli: MongoDB response
+```
+
+
+## Configuration
+
+**FIXME**
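+
+Until this section is written, here is a minimal sketch of what a PSK-based setup could look like. It follows the points listed in the setup report (run in foreground, user and group `stunnel4`, pid file in the systemd runtime directory, one `mongodb` service, secrets in `/etc/stunnel/psk/mongodb-psk.txt`) and the PSK example at https://www.stunnel.org/auth.html. The pid path, the host names and addresses of the two stunnel instances, and the 27018 port are assumptions made for illustration; they are not the values of the committed `off.conf`. Only 10.1.0.102 (the new mongodb) and 27017 (MongoDB's default port) come from the reports or from MongoDB itself.
+
+```ini
+; --- global section, same idea on both sides (sketch, not the committed file) ---
+; systemd supervises the process, so stay in foreground
+foreground = yes
+setuid = stunnel4
+setgid = stunnel4
+; hypothetical path, must match the RuntimeDirectory of the systemd unit
+pid = /run/stunnel-off/stunnel.pid
+
+; --- client side ("Stunnel Server 1 (client)" in the diagram) ---
+[mongodb]
+client = yes
+; clear-text listener for local clients (address is an assumption)
+accept = 0.0.0.0:27017
+; encrypted link to the other proxy (hypothetical host and port)
+connect = stunnel-server.example.org:27018
+ciphers = PSK
+PSKsecrets = /etc/stunnel/psk/mongodb-psk.txt
+
+; --- server side ("Stunnel Server 2 (server)"), in its own off.conf ---
+; [mongodb]
+; accept = 0.0.0.0:27018
+; connect = 10.1.0.102:27017
+; ciphers = PSK
+; PSKsecrets = /etc/stunnel/psk/mongodb-psk.txt
+```
+
+The PSK file holds one `identity:key` pair per line and must be identical on both sides, which is why the same user / password is reused on the second proxy.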