diff --git a/docs/discourse.md b/docs/discourse.md index 76484655..a8d6f1d6 100644 --- a/docs/discourse.md +++ b/docs/discourse.md @@ -24,7 +24,7 @@ Mail is very important as a lot of notifications are sent by the forum.
 Mail can be tested at https://forum.openfoodfacts.org/admin/email
-We use [Promox mail gateway](./mail.md).
+We use [Proxmox mail gateway](./mail.md).
 **⚠ Warning:** the sender email [have to be on main domain](./mail.md#only-domain), NOT forum.openfoodfacts.org.
diff --git a/docs/proxmox.md b/docs/proxmox.md index 582e4e74..ac8c9c95 100644 --- a/docs/proxmox.md +++ b/docs/proxmox.md @@ -79,6 +79,15 @@ We use proxmox firewall on host. **FIXME** to be completed.
 We have a masquerading rule for 10.1.0.1/24.
+## Users and groups
+
+We have a minimal set of users and groups.
+
+The *admin* group is for Proxmox admins (*Administrator* role). *ro_users* gives read-only access to the interface (*PVEAuditor* role).
+
+We put users (see users in the Proxmox interface) in groups (see groups in the Proxmox interface),
+and give roles to groups (see permissions in the Proxmox interface).
+
 ## Some Proxmox post-install thing
 Remove enterprise repository and add the no-subscription one
@@ -245,6 +254,7 @@ Using the web interface:
 * Target: ovh3
 * Schedule: */5 if you want every 5 minutes (takes less than 10 seconds, thanks to ZFS)
+Also think about [configuring email](./mail.md#postfix-configuration) in the container.
 ## Logging in to a container or VM
diff --git a/docs/reports/2022-02-proxmox-mail-gateway-install.md b/docs/reports/2022-02-proxmox-mail-gateway-install.md index bcd6cac8..40a599d3 100644 --- a/docs/reports/2022-02-proxmox-mail-gateway-install.md +++ b/docs/reports/2022-02-proxmox-mail-gateway-install.md @@ -62,7 +62,7 @@ Prepared:
 ```
 root@proxy:/etc/nginx/conf.d# cat pmg.openfoodfacts.org.conf
- # PMG stands for Promox Mail Gateway
+ # PMG stands for Proxmox Mail Gateway
 # We need to redirect port 80, for letsencrypt's certificate management
 server {
diff --git a/docs/reports/2023-02-17-off2-upgrade.md b/docs/reports/2023-02-17-off2-upgrade.md index 705e2f7e..16316ad9 100644 --- a/docs/reports/2023-02-17-off2-upgrade.md +++ b/docs/reports/2023-02-17-off2-upgrade.md @@ -281,7 +281,7 @@ We set some properties and rename it from off-zfs to zfs-nvme and create the zfs
 **EDIT:** on 2023-06-13, I re-created the zpool (it was lost in between, until we changed nvme disks).
 ```bash
 $ zpool destroy testnvme
-$ zpool create -o ashift=12 testnvme mirror nvme1n1 nvme0n1
+$ zpool create -o ashift=12 zfs-nvme mirror nvme1n1 nvme0n1
 $ zpool add zfs-nvme log nvme2n1
 zpool status zfs-nvme
  pool: zfs-nvme
@@ -308,7 +308,7 @@ we also receive the data from rpool2 back here:
 ### zfs-hdd pool
-First we create partitions for those rpool.
+First we create partitions for this new pool.
 For each sda/sdb/sdc/sdd:
 ```bash
 parted /dev/sdX mkpart zfs-hdd zfs 70g 100%
diff --git a/docs/reports/2023-03-14-off2-opff-reinstall.md b/docs/reports/2023-03-14-off2-opff-reinstall.md index 54135f4f..c996951c 100644 --- a/docs/reports/2023-03-14-off2-opff-reinstall.md +++ b/docs/reports/2023-03-14-off2-opff-reinstall.md @@ -330,6 +330,7 @@ cd sanoid
 # checkout latest stable release or stay on master for bleeding edge stuff (but expect bugs!)
 git checkout $(git tag | grep "^v" | tail -n 1)
 ln -s packages/debian .
+apt install debhelper libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer build-essential git
 dpkg-buildpackage -uc -us
 sudo apt install ../sanoid_*_all.deb
 ```
diff --git a/docs/reports/2023-12-08-off1-upgrade.md b/docs/reports/2023-12-08-off1-upgrade.md new file mode 100644 index 00000000..9ed79f2d --- /dev/null +++ b/docs/reports/2023-12-08-off1-upgrade.md @@ -0,0 +1,592 @@
+# 2023-12-18 off1 upgrade
+
+This is the same operation as [2023-02-17 Off2 Upgrade](./2023-02-17-off2-upgrade.md)
+but for off1.
+
+We will:
+* add four 14T disks
+* add an adapter card for SSDs
+* add two 2T NVMe disks and one 14G Optane, while keeping the existing NVMe
+* completely reinstall the system with Proxmox 7.4.1
+  * rpool in mirror-0 for the system
+    * using a 70G part on all hdd disks
+  * zfs-hdd in raidz1-0 for the data
+    * using a 14T-70G part on all hdd disks
+  * zfs-nvme in raidz1-0 for data that needs to be fast
+    * using the two 2T NVMe
+    * using an 8G part on the Optane for logs
+
+## 2023-12-18 server physical upgrade at Free datacenter
+
+We arrived in the morning and went to the server room, thanks to our hosts.
+
+### Physical changes
+
+We disconnected off1's cables, after taking a photo to be sure to put the ethernet cables back in the right positions when plugging them back in.
+
+As off2 was above off1, we swapped their positions by moving off1 on top.
+
+We unplugged the server.
+
+We removed off1's cover by pulling on the handle above the case.
+
+We installed the memory modules, arranging them symmetrically with the existing ones.
+
+We put the HDDs in place at the front.
+
+We removed the card at the back and replaced it with a 4-slot card holding our new SSDs and Optane.
+
+We plugged a monitor and a keyboard into off1 (at the rear).
+
+For more details, see also [Hardware in 2023-02-17 Off2 Upgrade](./2023-02-17-off2-upgrade.md#hardware)
+
+### Configuring BIOS
+
+We then closed the machine and rebooted into the BIOS.
+
+#### Slot bifurcation
+
+This is for the SSD card.
+
+We go to:
+* System BIOS
+* Integrated devices
+* Slot bifurcation
+* we choose: Auto discovery of bifurcation
+
+
+#### IDRAC / IPMI
+
+We go to Network settings to configure iDRAC / IPMI (which has its own ethernet port):
+* disable DHCP and auto-discovery
+* Static IP address: 213.36.253.207
+* Gateway: 213.36.253.222
+* Subnet Mask: 255.255.255.224
+* Static Preferred DNS: 213.36.253.10
+* Static Alternate DNS: 213.36.252.131
+
+
+#### PERC adapter
+
+We reboot and go into the BIOS again to configure the PERC adapter (PowerEdge RAID Controller), and change its configuration to HBA mode for the disks.
+
+
+We had some problems setting up the disks, because they were part of a RAID.
+
+On first boot we saw only 3 disks, but after a reboot we saw 4.
+
+In the main menu, under physical disk management, we used "Convert to non RAID Disk" for both disks.
+
+![Convert to non RAID Disk](media/2023-12-18-off1-bios-disk-remove-raid.jpg "Bios screen showing convert to non RAID Disks")
+
+As we started we still had an error:
+![Error on startup](media/2023-12-18-off1-bios-UEFI-error.jpg "Screen showing UEFI Error at startup")
+
+Indeed, at boot we see there is one virtual drive and two non-RAID disks (we should have 4).
+![Virtual disks on startup screen](media/2023-12-18-off1-bios-virtual-disk-on-startup-screen.jpg)
+
+We had to delete this virtual disk in the BIOS, in virtual disks management.
+
+![Delete virtual disk](media/2023-12-18-off1-bios-remove-virtual-disk.jpg)
+
+We then put the disks in HBA mode (in controller management).
+
+### Firmware updates first try
+
+At this point we were stuck with the same error as before. It suggested updating the PCIe firmware, so we decided to try this.
+
+F10 (Lifecycle Controller) brings us to a screen for that.
+
+After a long consultation of the Dell website, we got to the lifecycle wizard.
+
+We went to Firmware updates in the Lifecycle Manager and launched the firmware update.
+
+* At step 3 out of 5 we had to configure the network. We made the mistake of not using the right address several times; in the end, the idea is to use the public IP of the server (neither the private one nor the IPMI one)
+* we chose FTP Server but it was not the right one, and the HTTPS server was not responding either
+
+We also tried to use a program from a USB stick, but it did not work.
+
+![Firmware update through USB stick failure](media/2023-12-18-off1-firmware-update-usb-stick.jpg)
+
+We saw that by disabling the SSD card we were able to boot.
+
+We decided to disable the SSD card and continue the installation.
+
+### Installing Proxmox
+
+We booted on the USB key that we had brought with the Proxmox install image on it.
+
+For hard disk options, we chose ZFS - RAID1, ashift 12, compress and checksum on, disk size 64G.
+
+![ZFS RAID1 basic setup for system partition](media/2023-12-18-off1-proxmox-raid1-disks-setup.jpg)
+
+![ZFS RAID1 advanced setup for system](media/2023-12-18-off1-proxmox-raid1-setup.jpg)
+
+We chose a good password for root.
+
+We configured the eno1 network interface.
+
+![Network setup](media/2023-12-18-off1-proxmox-network-setup.jpg)
+
+![Install summary screen](media/2023-12-18-off1-proxmox-install-summary.jpg)
+
+### Firmware updates again
+
+After the install and adding openssh-server, we were able to log in as root.
+
+We found a USB boot disk for Dell servers. We mounted it on the server.
+
+We tried to use `suu --import-public-keys`; it did something.
+
+We were able to upgrade the Lifecycle Controller, but not the BIOS.
+
+After a lot of trying, with the update addresses provided on the Dell website not working, Stéphane finally found a forum post that gave us the right URL to put in the Lifecycle Manager to get updates.
+
+We used 143.166.28.19 in place of downloads.dell.com and it worked!
+
+![Bios is upgrading (tears of nervous joy)](media/2023-12-18-off1-bios-upgrade.jpg)
+
+We finally got BIOS 2.19.1.
+
+### SSDs again
+
+Even after upgrading the BIOS, the SSD setup was still not working. By removing the SSDs one by one, we found the culprit.
+So we mounted back all the SSDs but that one.
+
+When we restarted the server, we got an error, because there was a conflict with the existing old rpool (which was on the SSD).
+
+We resolved this by force-importing the right pool using its ID (which we got with `zpool status`) and destroying the other (or something like that!)…
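+
+For the record, the recovery was probably something along these lines (a sketch only; the pool name and the numeric ID below are placeholders, not the real values):
+
+```bash
+# list pools that are visible on disk but not yet imported, with their numeric IDs
+zpool import
+
+# force-import the stale pool by its numeric ID under a temporary name, then destroy it,
+# so that only the freshly installed rpool keeps the name
+zpool import -f 1234567890123456789 old-rpool
+zpool destroy old-rpool
+```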
+
## 2023-12-21 continuing server install

### First ssh connection

Using root: `ssh root@off1 -o PubkeyAuthentication=no`

### Update and add some base packages

Replace `/etc/apt/sources.list.d/pve-enterprise.list` with `/etc/apt/sources.list.d/pve-install-repo.list` (containing the no-subscription repository), then:

```bash
apt update
apt upgrade
apt install munin-node sudo vim parted tree etckeeper rsync screen fail2ban git curl htop lsb-release bsd-mailx
```

### Configure locale

```bash
vim /etc/locale.gen
# uncomment fr_FR.UTF-8, and exit
locale-gen
```

### Network configuration

In `/etc/network/interfaces` we added the `vmbr1` bridge interface (on eno2):
```conf
auto vmbr1
iface vmbr1 inet static
    address 10.0.0.1/8
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
    post-up echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up iptables -t nat -A POSTROUTING -s '10.1.0.0/16' -o vmbr0 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '10.1.0.0/16' -o vmbr0 -j MASQUERADE
```

Then `systemctl restart networking`

And verify with `ip address list` and `ip route list`
```
default via 213.36.253.222 dev vmbr0 proto kernel onlink
10.0.0.0/8 dev vmbr1 proto kernel scope link src 10.0.0.1
213.36.253.192/27 dev vmbr0 proto kernel scope link src 213.36.253.206
```

### Creating users


First we created the off user (to ensure it has uid 1000):

```bash
adduser --shell /usr/sbin/nologin off
```

Then add the other sudo users, like:
```bash
adduser alex
...
adduser alex sudo
```
and copy ssh keys.

I also copied the password hashes from off2 to off1 (users can then decide to change their password afterwards).

### Creating ZFS pools

We already have rpool where the distribution is installed.


#### Partition disks


First we need to create partitions on our four HDDs to be part of `zfs-hdd`.
They already have a partition for the system (participating in `rpool`).

Running:
```bash
for name in a b c d; do parted /dev/sd$name print; done
```
shows us the same pattern on all disks:
```
Model: ATA TOSHIBA MG07ACA1 (scsi)
Disk /dev/sdd: 14,0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name       Flags
 1      17,4kB  1049kB  1031kB               bios_grub
 2      1049kB  1075MB  1074MB  fat32                   boot, esp
 3      1075MB  69,8GB  68,7GB               zfs
```

We can print sectors to know more precisely where to start the next partition:

```bash
for name in a b c d; do parted /dev/sd$name 'unit s print'; done

...
Number  Start     End         Size        File system  Name  Flags
...
 3      2099200s  136314880s  134215681s               zfs

```

We add 2048 to 136314880 to get a better-aligned start sector (136316928).

So now we create the partitions on the remaining space:
```bash
for name in a b c d; do parted /dev/sd$name mkpart zfs-hdd zfs 136316928s 100%; done
```

We also want to partition the SSDs and the Optane.

Listing them to know which is which:
```bash
for device in /dev/nvme?;do echo $device "--------";smartctl -a $device|grep -P '(Model|Size/Capacity)';done

/dev/nvme0 --------
Model Number:                       WD_BLACK SN770 2TB
Namespace 1 Size/Capacity:          2 000 398 934 016 [2,00 TB]
/dev/nvme1 --------
Model Number:                       INTEL MEMPEK1J016GA
Namespace 1 Size/Capacity:          14 403 239 936 [14,4 GB]
/dev/nvme2 --------
Model Number:                       Samsung SSD 970 EVO Plus 1TB
Namespace 1 Size/Capacity:          1 000 204 886 016 [1,00 TB]
```
(at the time of install, one 2TB SSD is missing because it was impossible to boot with it).
+
So:
* /dev/nvme0 is the 2TB SSD
* /dev/nvme1 is the Optane (14.4 GB)
* /dev/nvme2 is the (old) 1TB SSD

I followed the partitioning of off2:

```bash
# 2TB SSD is devoted entirely to zfs-nvme
parted /dev/nvme0n1 mklabel gpt
parted /dev/nvme0n1 mkpart zfs-nvme zfs 2048s 100%

# Optane is divided as log for zfs-hdd and zfs-nvme
parted /dev/nvme1n1 mklabel gpt
parted /dev/nvme1n1 \
  mkpart log-zfs-nvme zfs 2048s 50% \
  mkpart log-zfs-hdd zfs 50% 100%

# 1TB SSD will be cache for zfs-hdd
parted /dev/nvme2n1 mklabel gpt
# we need to have an xfs partition, I don't know exactly why!
# but without it, the zfs partition is changed to an xfs one…
parted /dev/nvme2n1 \
  mkpart xfs 2048s 64G \
  mkpart zfs-hdd-cache zfs 64G 100%
```

We can see all our partitions on the disks:
```bash
ls /dev/sd?? /dev/nvme?n1p?

lsblk
```

#### Creating zfs pools

We create a zfs-hdd pool with the partitions (sda4, sdb4, sdc4 and sdd4) in raidz1, a partition on the Optane disk as log, and some properties.

```bash
zpool create zfs-hdd -o ashift=12 raidz1 sda4 sdb4 sdc4 sdd4
zfs set compress=on xattr=sa atime=off zfs-hdd
zpool add zfs-hdd log nvme1n1p2
zpool add zfs-hdd cache nvme2n1p2
```

Note:
Doing the latter I got: `/dev/nvme1n1p2 is part of potentially active pool 'zfs-hdd'`
This is because the Optane was used in a previous install, on off2.
I just did: `zpool labelclear -f nvme1n1`, `zpool labelclear -f nvme1n1p2` and `zpool labelclear -f nvme2n1p2`
(in truth, I tried to clear the label on every not-yet-used partition).

We create a zfs-nvme pool, but with only nvme0n1p1 (as it's the only SSD), a partition on the Optane disk as log, and some properties. It can't be a mirror yet because there is only one device… we will have to back up, destroy and re-create it with the new NVMe to be able to have a mirror.

```bash
zpool create zfs-nvme -o ashift=12 nvme0n1p1
zfs set compress=on xattr=sa atime=off zfs-nvme
zpool add zfs-nvme log nvme1n1p1
```

## Joining PVE cluster

It's time to join the cluster. We will join it using the internal IP!


### preparing /etc/hosts

I edited /etc/hosts on off1 to have:

```conf
127.0.0.1 localhost.localdomain localhost
# 213.36.253.206 off1.openfoodfacts.org off1
10.0.0.1 off1.openfoodfacts.org off1 pve-localhost
10.0.0.2 off2.openfoodfacts.org off2
...
```

And on off2:
```conf
127.0.0.1 localhost.localdomain localhost
10.0.0.2 off2.openfoodfacts.org off2 pve-localhost
#213.36.253.208 off2.openfoodfacts.org off2
10.0.0.1 off1.openfoodfacts.org off1
...
```

### creating the cluster on off2

See the [official docs](https://pve.proxmox.com/pve-docs-6/chapter-pvecm.html)

On off2:
```bash
pvecm create off-free
pvecm status
```

### joining the cluster from off1

On off1:
```bash
pvecm add 10.0.0.2 --fingerprint "43:B6:2A:DC:BF:17:C8:70:8F:3C:A4:A8:2D:D5:F8:24:18:6B:78:6D:24:8A:65:DA:71:04:A3:FE:E0:45:DE:B6"
```

Note: the first time I did it without the `--fingerprint` option.
I verified the fingerprint by looking at the certificate of the Proxmox manager in Firefox.

### Using systemd-timesyncd

As [proposed by the Proxmox guide](https://pve.proxmox.com/pve-docs-6/pve-admin-guide.html#_time_synchronization),
I installed `systemd-timesyncd` on off1 and off2.


### adding the storages

We create pve pools:

```bash
zfs create zfs-hdd/pve
zfs create zfs-nvme/pve
```
They are immediately available in Proxmox!
+
```
pvesm status
Name             Type     Status           Total            Used       Available        %
backups           dir     active     39396965504             256     39396965248    0.00%
local             dir     active        64475008        11079168        53395840   17.18%
zfs-hdd       zfspool     active     39396965446             139     39396965307    0.00%
zfs-nvme      zfspool     active      1885863288              96      1885863192    0.00%
```

Also the backups dir automatically ended up on zfs-hdd, but I don't really know why!
`cat /etc/pve/storage.cfg` helps to see that.

I still have to create the dataset:
```bash
zfs create zfs-hdd/backups
```

## Adding ro_users group to proxmox

We have a `ro_users` group for users who should have read-only access to the Proxmox interface.

I created the group by going to the cluster interface in Proxmox and using `create`.

I then went to `permissions` and gave the PVEAuditor role to the group on `/` with *propagate*.

![off2 proxmox group roles](./media/2024-01-proxmox-group-roles-off2.png "off2 proxmox group roles")

## Getting container templates

See the [proxmox docs on container images](https://pve.proxmox.com/wiki/Linux_Container#pct_container_images)

```bash
pveam update
pveam available|grep 'debian-.*-standard'
pveam download local debian-11-standard_11.7-1_amd64.tar.zst
pveam download local debian-12-standard_12.2-1_amd64.tar.zst
```

## Adding openfoodfacts-infrastructure repository

Added the root ssh public key (`cat /root/.ssh/id_rsa.pub`) as a [deploy key to the github infrastructure repository](https://github.com/openfoodfacts/openfoodfacts-infrastructure/settings/keys)



```bash
cd /opt
git clone git@github.com:openfoodfacts/openfoodfacts-infrastructure.git
```

## Adding Munin monitoring

I simply followed [our Munin doc on how to configure a server](../munin.md#how-to-configure-a-server)

## Configuring snapshots and syncoid

I first installed sanoid following the [install instructions](../sanoid.md#building-sanoid-deb)

We want to pull snapshots from off2 and to let ovh3 pull our snapshots.


### Enabling sanoid

```bash
for unit in email-failures@.service sanoid_check.service sanoid_check.timer sanoid.service.d; \
  do ln -s /opt/openfoodfacts-infrastructure/confs/off1/systemd/system/$unit /etc/systemd/system ; \
done
systemctl daemon-reload
systemctl enable --now sanoid_check.timer
systemctl enable --now sanoid.service
```

### sync from off2 to off1

#### creating off1operator on off2

Similar to off2operator on off1:

```bash
adduser off1operator
... add ssh pub key ...

zfs allow off1operator hold,send zfs-hdd
zfs allow off1operator hold,send zfs-nvme
zfs allow off1operator hold,send rpool
```

On off1, test the ssh connection:
```bash
ssh off1operator@10.0.0.2
```

#### Doing first sync

Create the conf for syncoid on off1:

```bash
ln -s /opt/openfoodfacts-infrastructure/confs/off1/sanoid/syncoid-args.conf /etc/sanoid/
```

Do the first sync by hand in a screen (because it will take a very long time):

```bash
set -x; \
grep -v "^#" /etc/sanoid/syncoid-args.conf | \
while read -a sync_args; \
do syncoid "${sync_args[@]}"; \
done
```

The Linux package init scripts included in the official MongoDB packages depend on specific values for systemLog.path, storage.dbPath, and processManagement.fork. If you modify these settings in the default configuration file, mongod may not start.

The MONGODB_CONFIG_OVERRIDE_NOFORK environment variable was introduced by https://jira.mongodb.org/browse/SERVER-74845

Strangely, in mongod.conf we keep the default value, which should be false…
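
For reference, these are the three settings that warning is about. The values below are only a sketch of the stock Debian package defaults (our container actually uses `/mongo/db` as dbPath), not a copy of our real mongod.conf:

```conf
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log   # packaged default log path
storage:
  dbPath: /var/lib/mongodb            # packaged default; ours is /mongo/db
processManagement:
  fork: false                         # the default: mongod stays in the foreground under systemd
```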
+
Finally I [saw in the code](https://github.com/mongodb/mongo/blob/r4.4.27/src/mongo/db/server_options_server_helpers.cpp#L132) that the message was just a log warning, not an error…

But in `/var/log/mongodb/mongod.log` I found:
```
"msg":"ERROR: Cannot write pid file to {path_string}: {errAndStr_second}","attr":{"path_string":"/var/run/mongodb/mongod.pid","errAndStr_second":"No such file or directory"}
```

I removed the specific pid file directive in mongod.conf and restarted MongoDB.

Now I can see:
```json
{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"W", "c":"STORAGE", "id":22271, "ctx":"initandlisten","msg":"Detected unclean shutdown - Lock file is not empty","attr":{"lockFile":"/mongo/db/mongod.lock"}}
{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"I", "c":"STORAGE", "id":22270, "ctx":"initandlisten","msg":"Storage engine to use detected by data files","attr":{"dbpath":"/mongo/db","storageEngine":"wiredTiger"}}
{"t":{"$date":"2024-01-04T16:33:33.069+00:00"},"s":"W", "c":"STORAGE", "id":22302, "ctx":"initandlisten","msg":"Recovering data from the last clean checkpoint."}
...
file:WiredTigerHS.wt, hs_access: __wt_block_read_off, 286: WiredTigerHS.wt: potential hardware corruption, read checksum error for 4096B block at offset 36864: block header checksum of 0xe91e3800 doesn't match expected checksum of 0xd10d51e"}}
...
lot of errors
...
```
So this is a problem with the copy.

Indeed it is [stated by the documentation](https://www.mongodb.com/docs/manual/core/backups/#back-up-with-cp-or-rsync):
> you can copy the files directly using cp, rsync, or a similar tool. Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the files.


I'm not able to stop MongoDB just for a test, so I will first set up other things like stunnel.

## Setting up stunnel

see [2024-01-04 Setting up stunnel](./2024-01-04-setting-up-stunnel.md)

## Cron job for tags collection generation on opf / opff / obf

Previously we had a script directly in the crontab of the MongoDB server to generate the tags collections.

But it disappeared from off's main version.

We will now launch those tasks from the different servers; it is more logical and flexible.

On each container: off, opf and opff

* install mongosh: `sudo apt install mongodb-mongosh`

## Migration procedure

* stop mongodb on the mongodb container: `systemctl stop mongodb`
* rsync MongoDB data from off3 to the mongodb container.
On off1, as root: + ```bash + time rsync -a --delete-delay --usermap=mongodb:100108,off:100108 --groupmap=mongodb:100116,off:100116 10.0.0.3:/mongo/db/ /zfs-nvme/pve/subvol-102-disk-0/db/ + ``` + (took 15 min 56s) +* warn slack users +* stop mongodb on off3: `systemctl stop mongod` +* rsync again (same command as above) (took 6 min 42s) +* (during sync) + * change configuration for product opener + * on off, off-pro + * edit /srv/$HOSTNAME/lib/ProductOpener/Config2.pm + * `sudo systemctl restart apache2.service cloud_vision_ocr@$HOSTNAME.service minion@HOSTNAME.service` + * on opf, opff, obf: + * edit /srv/$PROJECT/lib/ProductOpener/Config2.pm + * `sudo systemctl restart apache2.service` + * change configuration for off-query-org and robotoff-org to point to 10.1.0.101:27017 (ovh1 proxy): + * going to corresponding directory + * editing .env + * `sudo -u off docker-compose down && sudo -u off docker-compose up -d` +* start mongodb on mongodb container (keep off3 down) +* verify it's working on off +* disable mongod on off3 +* merge PR to change mongodb configuration of off-query and robotoff \ No newline at end of file diff --git a/docs/reports/2024-01-04-setting-up-stunnel.md b/docs/reports/2024-01-04-setting-up-stunnel.md new file mode 100644 index 00000000..c182ad26 --- /dev/null +++ b/docs/reports/2024-01-04-setting-up-stunnel.md @@ -0,0 +1,147 @@ + +## 2024-01-04 Setting up stunnel + +Robotoff and openfoodfacts-query needs access to mongodb to get data. But we want to secure access to it. + +## Setting up stunnel on off1 proxy + +On the reverse proxy container: +* installed stunnel package (which is a proxy to stunnel4) +* I had to override the systemd unit file to add RuntimeDirectory, Group and RuntimeDirectoryMode so that pid file could be added correctly by users of group stunnel4 +* created `/etc/stunnel/off.conf` - we will only have one instance for many services, no need of specific services + * run in foreground, so that systemd handles the process + * specify user and group stunnel4 + * specify pid file according to systemd unit RuntimeDirectory + * added a mongodb service, for first tests. +* created `/etc/stunnel/psk/mongodb-psk.txt` + * and made it private `chmod -R go-rwx /etc/stunnel/psk/` + * To create a password I used `pwgen 32` on my laptop + +* enable and start service: + ```bash + systemctl enable stunnel@off.service + systemctl start stunnel@off.service + ``` + +All (but the psk files which are not to be committed) is part of [commit d797e7c73](https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/d797e7c7329c3c789ff21dc63ebbf1753aa4a376) + + + +Note: `dpkg-query -L stunnel4` helps me locate `/usr/share/doc/stunnel4/README.Debian` that I read to better understand the systemd working. Also `/usr/share/doc/stunnel4/examples/stunnel.conf-sample` is a good read for the global section, while configuration example with PSK is available here: https://www.stunnel.org/auth.html + +## Setting up stunnel on ovh1 proxy + +On the reverse proxy container: +* installed stunnel package (which is a proxy to stunnel4) + +* This is a older version of the package than on so systemd is not integrated, so I added systemd units myself + +* created `/etc/stunnel/off.conf` - we will only have one instance for many services, no need of specific services + added a mongodb service, for first tests. 
+* created `/etc/stunnel/psk/mongodb-psk.txt`
+  * and made it private: `chmod -R go-rwx /etc/stunnel/psk/`
+  * with the user / password created on the off1 proxy
+* enable and start the service:
+  ```bash
+  systemctl enable stunnel@off.service
+  systemctl start stunnel@off.service
+  ```
+
+All of this (except the psk files, which are not to be committed) is part of [commit 086439230](https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/086439230a41f4d94755276610cbab838ad96f4a)
+
+
+## Testing stunnel for mongodb
+
+On each server, I can use `journalctl -f -u stunnel@off` to monitor activity.
+
+On the off staging VM:
+```
+cd /home/off/mongo-dev
+sudo -u off docker-compose exec mongodb bash
+mongo 10.1.0.101
+> db.hostInfo()["system"]["hostname"]
+mongodb
+```
+
+and (after a lot of tribulations…) it worked!!!
+
+## Note: problem reaching 10.0.0.3 from off2 proxy (not useful right now)
+
+At a certain point, by mistake, I used the 10.0.0.3 server as the mongodb target.
+
+But from the proxy this is unreachable… this is because there is no route to this host.
+
+To add the route we can do:
+```bash
+ip route add 10.0.0.0/24 dev eth0 proto kernel scope link src 10.1.0.101
+```
+To make it permanent we can add an executable `/etc/network/if-up.d/off-servers-route.sh` file, with:
+```bash
+#!/bin/bash
+
+if [[ $IFACE == "eth0" ]]; then
+    # we want to access off1 and off2 from this machine
+    ip route add 10.0.0.0/24 dev $IFACE proto kernel scope link src 10.1.0.101
+fi
+```
+
+But as right now this is not needed (the new mongo is at 10.1.0.102, which is reachable), **I didn't do it**.
+
+## 2024-01-08 MongoDB got hacked!
+
+I did change the configuration so that the stunnel entrance would not be exposed on the public IP, but it seems it was not taken into account (maybe I did not restart the stunnel service correctly)… and thus our MongoDB stunnel access was exposed to the wild web… where some hacker immediately took our database and dropped it, asking for money to get it back…
+
+Luckily Gala noticed it rapidly and Stéphane identified that mongo was exposed through our proxy1 IP address.
+
+We have the data in the sto files, so it's not the end of the world, but it's still very annoying.
+Unfortunately I had not yet set up auto snapshotting (because I was seeing MongoDB data as transient).
+
+I rsynced the data from off3 again (dating from 3h before); we lose the MongoDB updates of those 3 hours, but MongoDB was up again quickly.
+
+But I took the decision:
+
+* to move the stunnel client to a separate container with no risk of exposure
+* to snapshot MongoDB data, because restoring from the sto files would take long, so it's a big annoyance
+
+
+## Creating stunnel client container
+
+We followed the usual procedure to [create a proxmox container](../proxmox.md#how-to-create-a-new-container):
+* id 113
+* chose a Debian 11 template
+* default storage on zfs-hdd (for the system), 6GB, noatime
+* 2 cores
+* memory 512MB, no swap
+
+I also [configured email](../mail.md#postfix-configuration) in the container.
+
+
+## Setting up stunnel on ovh1 stunnel-client
+
+Did the same as above to [set up stunnel on the off1 proxy](./#setting-up-stunnel-on-off1-proxy).
+
+I created a key with `ssh-keygen -t ed25519 -C "off@stunnel-client.ovh.openfoodfacts.org"`,
+added it as a deploy key to this project,
+and cloned the project in `/opt` so that I can use git for modified configuration files.
+
+I created my configs and symlinked them.
+Then: +```bash +systemctl daemon-reload +systemctl start stunnel@off +systemctl enable stunnel@off +``` + +I tested it from staging mongo container (see [Testing stunnel for mongodb](#testing-stunnel-for-mongodb)) + + +## Changing services config + +On VM docker-prod (200), I changed the .env for off-query-org and robotoff-org. +Then for both services I did a "docker-compose down && docker-compose up -d". + +I also pushed a [commit to robotoff](https://github.com/openfoodfacts/robotoff/commit/ade67c21bab152afe64c33b9f540bf91b212efb0) and a [PR to off-query](https://github.com/openfoodfacts/openfoodfacts-query/pull/32) to change the configuration. + +## Removing stunnel client on ovh reverse proxy + +On the reverse proxy I kept stunnel but I removed the config for MongoDB. diff --git a/docs/reports/media/2023-12-18-off1-bios-UEFI-error.jpg b/docs/reports/media/2023-12-18-off1-bios-UEFI-error.jpg new file mode 100644 index 00000000..fbbd74bc Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-bios-UEFI-error.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-bios-disk-remove-raid.jpg b/docs/reports/media/2023-12-18-off1-bios-disk-remove-raid.jpg new file mode 100644 index 00000000..1f5b5b15 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-bios-disk-remove-raid.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-bios-remove-virtual-disk.jpg b/docs/reports/media/2023-12-18-off1-bios-remove-virtual-disk.jpg new file mode 100644 index 00000000..d326654d Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-bios-remove-virtual-disk.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-bios-upgrade.jpg b/docs/reports/media/2023-12-18-off1-bios-upgrade.jpg new file mode 100644 index 00000000..0a36ac0c Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-bios-upgrade.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-bios-virtual-disk-on-startup-screen.jpg b/docs/reports/media/2023-12-18-off1-bios-virtual-disk-on-startup-screen.jpg new file mode 100644 index 00000000..f1161a97 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-bios-virtual-disk-on-startup-screen.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-firmware-update-usb-stick.jpg b/docs/reports/media/2023-12-18-off1-firmware-update-usb-stick.jpg new file mode 100644 index 00000000..1bedab51 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-firmware-update-usb-stick.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-proxmox-install-summary.jpg b/docs/reports/media/2023-12-18-off1-proxmox-install-summary.jpg new file mode 100644 index 00000000..9b928c37 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-proxmox-install-summary.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-proxmox-network-setup.jpg b/docs/reports/media/2023-12-18-off1-proxmox-network-setup.jpg new file mode 100644 index 00000000..bd188e53 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-proxmox-network-setup.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-proxmox-raid1-disks-setup.jpg b/docs/reports/media/2023-12-18-off1-proxmox-raid1-disks-setup.jpg new file mode 100644 index 00000000..b5529906 Binary files /dev/null and b/docs/reports/media/2023-12-18-off1-proxmox-raid1-disks-setup.jpg differ diff --git a/docs/reports/media/2023-12-18-off1-proxmox-raid1-setup.jpg b/docs/reports/media/2023-12-18-off1-proxmox-raid1-setup.jpg new file mode 100644 index 00000000..6c826d50 Binary files /dev/null and 
b/docs/reports/media/2023-12-18-off1-proxmox-raid1-setup.jpg differ diff --git a/docs/reports/media/2024-01-proxmox-group-roles-off2.png b/docs/reports/media/2024-01-proxmox-group-roles-off2.png new file mode 100644 index 00000000..6ac41901 Binary files /dev/null and b/docs/reports/media/2024-01-proxmox-group-roles-off2.png differ diff --git a/docs/sanoid.md b/docs/sanoid.md index bc5194cf..1bbfe318 100644 --- a/docs/sanoid.md +++ b/docs/sanoid.md @@ -2,39 +2,63 @@ We use [Sanoid](https://github.com/jimsalterjrs/sanoid/) to: - automatically take regular snapshots of ZFS Datasets -- automatically clean snapshots thanks to a retention policy +- automatically clean snapshots according to a retention policy - sync datasets between servers thanks to the `syncoid` command -## snapshot configuration +## sanoid snapshot configuration `/etc/sanoid/sanoid.conf` contains the configuration for sanoid snapshots. That is how frequently you want to do them, and the retention policy (how much to keep snapshots). See [reference documentation](https://github.com/jimsalterjrs/sanoid/wiki/Sanoid) -There are generally two templates: +There are generally two kind of templates: - one for datasets that are synced from a different server. In this case we don't want to create snapshots as we already receive the one from source. We only want to purge old snapshots. - one for datasets where the source is this server. In this case we want to regularly create snapshots and purge old ones. +We then have different retention strategies based on the type of data. + +## sanoid checks + +We have a timer/service sanoid_check that checks that we have recent snapshots for datasets. +This is useful to verify sanoid is running, or syncoid is doing it's job. + +The default is to check every ZFS datasets, but the one you list with `no_sanoid_checks:` +in the comments of your `sanoid.conf` file. +You can put more than one dataset per line, by separating them with ":". + +For example: +```conf +# no_sanoid_checks:rpool/logs-nginx: +# no_sanoid_checks:rpool/obf-old:rpool/opf-old: +``` + ## syncoid service and configuration -Sanoid does not come with a systemd service, so we created one, see: `confs/off2/systemd/system/syncoid.service` +Sanoid does not come with a systemd service for syncoid, +so we created one, see: `confs/common/systemd/system/syncoid.service` The syncoid service can synchronize *to* or *from* a server. +But it is always preferred to be in pull mode. +The idea is to avoid having elevated privileges on the distant server. So if an attacker gains privilege access on one server, it can't gain access to the other server (and eg. remove or encrypt all data, including backups). +* We use a user named operator (eg. off2operator) on the remote server we want to pull from +* We use [zfs allow command](https://openzfs.github.io/openzfs-docs/man/8/zfs-allow.8.html) to give the `hold,send` permissions to this user + The service simply use each line of `/etc/sanoid/syncoid-args.conf` as arguments to `syncoid` command. + ## getting status You can use : `systemctl status sanoid.service` and `systemctl status syncoid.service` to see the logs of last synchronization. 
Also you can list snapshot on source / destination ZFS datasets to see if there are recent ones:

-`sudo zfs list -t snap /`
+`/usr/sbin/zfs list -t snap /`

 ## Install

@@ -44,3 +68,109 @@ It provides a sanoid systemd service and a timer unit that just have to be enabl
 For syncoid to be launched by systemd, we created a service ([see syncoid service and configuration](#syncoid-service-and-configuration)).

 This service is declared as a dependency of the sanoid so that it runs just after it.
+
+### How to build and install sanoid deb
+
+See the [install documentation](https://github.com/jimsalterjrs/sanoid/blob/master/INSTALL.md#debianubuntu).
+I followed the instructions exactly.
+
+```bash
+cd /opt
+git clone https://github.com/jimsalterjrs/sanoid.git
+cd sanoid
+# checkout latest stable release (or stay on master for bleeding edge stuff, but expect bugs!)
+git checkout $(git tag | grep "^v" | tail -n 1)
+ln -s packages/debian .
+apt install debhelper libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer build-essential git
+dpkg-buildpackage -uc -us
+sudo apt install ../sanoid_*_all.deb
+```
+
+Then [enable the sanoid service](#how-to-enable-sanoid-service).
+
+### How to enable sanoid service
+
+Create the conf for sanoid and link it:
+
+```bash
+ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/sanoid/sanoid.conf /etc/sanoid/
+```
+
+
+```bash
+for unit in email-failures@.service sanoid_check.service sanoid_check.timer sanoid.service.d; \
+  do ln -s /opt/openfoodfacts-infrastructure/confs/off1/systemd/system/$unit /etc/systemd/system ; \
+done
+systemctl daemon-reload
+systemctl enable --now sanoid_check.timer
+systemctl enable --now sanoid.service
+```


+### How to enable syncoid service
+
+Create the conf for syncoid and link it:
+
+```bash
+ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/sanoid/syncoid-args.conf /etc/sanoid/
+```
+
+Enable the syncoid service:
+```bash
+ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/systemd/system/syncoid.service /etc/systemd/system
+systemctl daemon-reload
+systemctl enable --now syncoid.service
+```
+
+### How to set up synchronization without using root
+
+Say we want to pull data from zfs-hdd, zfs-nvme and rpool on PROD_SERVER to BACKUP_SERVER.
+
+#### creating operator on PROD_SERVER
+
+```bash
+OPERATOR=${BACKUP_SERVER}operator
+adduser $OPERATOR
+# choose a random password (pwgen 16 16) and discard it
+
+# copy public key
+mkdir /home/$OPERATOR/.ssh
+vim /home/$OPERATOR/.ssh/authorized_keys
+# copy BACKUP_SERVER root public key
+
+chown -R $OPERATOR:$OPERATOR /home/$OPERATOR
+chmod go-rwx -R /home/$OPERATOR/.ssh
+```
+
+Adding the needed permissions to pull zfs syncs:
+```bash
+zfs allow $OPERATOR hold,send zfs-hdd
+zfs allow $OPERATOR hold,send zfs-nvme
+zfs allow $OPERATOR hold,send rpool
+
+```
+#### test connection on BACKUP_SERVER
+
+On BACKUP_SERVER, test the ssh connection:
+
+```bash
+OPERATOR=${BACKUP_SERVER}operator
+ssh $OPERATOR@$PROD_SERVER
+```
+
+#### config syncoid
+
+You have sanoid running on $PROD_SERVER, creating snapshots for the datasets you want to back up remotely.
+
+You have sanoid and syncoid already configured on BACKUP_SERVER.
+
+We can now add lines to `syncoid-args.conf` on BACKUP_SERVER.
+They must use the `--no-privilege-elevation` and `--no-sync-snap` options
+(if you want to create a sync snap,
+you will also have to grant snapshot creation to the $OPERATOR user on $PROD_SERVER).
+
+Use `--recursive` to also back up sub-datasets; see the example line below.
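+
+As an illustration, a pull line in `/etc/sanoid/syncoid-args.conf` on BACKUP_SERVER could look like the following (the operator name, IP and dataset names are made-up placeholders, not our real values):
+
+```conf
+# each non-comment line is passed as-is to syncoid: [options...] SOURCE TARGET
+--no-privilege-elevation --no-sync-snap --recursive backup1operator@10.0.0.42:zfs-hdd/pve zfs-hdd/backups/pve
+```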
+
+Don't forget to create a sane retention policy (with `autosnap=no`) in sanoid on $BACKUP_SERVER to remove old data.
+
+**Note:** because of the 6h timeout, if you have big datasets, you may want to do the first synchronization before enabling the service.
\ No newline at end of file
diff --git a/docs/stunnel.md b/docs/stunnel.md index e69de29b..24a08c06 100644 --- a/docs/stunnel.md +++ b/docs/stunnel.md @@ -0,0 +1,90 @@
+# Stunnel
+
+Stunnel enables us to secure (TCP) connections between distant servers.
+
+It encrypts traffic with OpenSSL.
+
+It is installed on the reverse proxies of off2 and ovh1.
+
+Illustration:
+```mermaid
+sequenceDiagram
+    box Private Network 1
+    Participant Cli as Client
+    Participant Scli as Stunnel Server 1 (client)
+    End
+    box Private Network 2
+    Participant Sserv as Stunnel Server 2 (server)
+    Participant Mongo as MongoDB Server
+    End
+    Note over Scli,Sserv: The wild web
+    Cli->>Scli: Query MongoDB
+    Scli->>Sserv: Encrypted
+    Sserv->>Mongo: Query MongoDB
+    Mongo->>Sserv: MongoDB response
+    Sserv->>Scli: Encrypted
+    Scli->>Cli: MongoDB response
+```
+## Client vs Server
+
+When we configure stunnel, there are two sides:
+* the side where the connection originates (*Stunnel Server 1* on the diagram) is called the **client side**
+* the other side accepts connections only from other stunnel instances and forwards them to the exposed service; it is the **server side**
+
+**VERY IMPORTANT**
+
+Whereas the **server side** port needs to be exposed publicly on the web, we want to avoid the **client side** being exposed on a public IP, even inadvertently.
+
+That's why we:
+* use the **reverse proxy** for stunnel **server** services, and only those
+* use a **specific internal container** for stunnel **client** entrypoints, with only a private IP exposed.
+
+
+## Configuration
+
+### Location and server/client
+
+Configuration is in `/etc/stunnel/off.conf`; this is the `off` instance (and meant to be the only one for now).
+We can add as many entries to serve different services as we want.
+
+When you configure stunnel, on the side where the connection originates (*Stunnel Server 1* on the diagram) you have to specify `client=yes`. On the other side it's `client=no` (server side); it accepts connections only from other stunnel instances and forwards them to the exposed service.
+
+**IMPORTANT:** always double check that you only use client=yes on the internal container (on the private network), never on the proxy.
+
+### PSK
+
+We use PSK (Pre-Shared Key) for security (easier and more performant than setting up certificates).
+
+The psk file must be in /etc/stunnel/psk/ and must not be committed to this repository; it must remain on the servers only, and be private (`chmod -R go-rwx /etc/stunnel/psk/`).
+
+To generate a psk, use `pwgen 32`.
+
+Keep each username unique for each server, but also for each service (otherwise it will conflict!).
+
+
+### Test configuration before restarting service
+
+BEWARE that stunnel is used for other services!
+
+So don't restart the service without first testing your configuration.
+
+One way to test is to try to start another instance of stunnel.
It will fail because it's unable to bind to the already occupied ports, but you will be able to see whether the configuration was parsed correctly: `stunnel /etc/stunnel/off.conf`
+
+When you are ready, you can use: `systemctl restart stunnel@off`
+
+
+## Systemd service
+
+The stunnel@off service is the one that corresponds to `/etc/stunnel/off.conf`.
+
+We had to override the systemd service a bit for the pid file to work:
+* stunnel needs to be launched as `root` to get some privileges (and then change user)
+* but the pid file is created as the running user
+* so we use the stunnel4 group in the service definition and add group write permission to the runtime directory, to ensure stunnel can create the pid file
+
+Also note that it is launched in the foreground so that systemd handles the process.
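+
+A sketch of what such a drop-in override could look like; the file name and exact values here are illustrative, the real override is the one committed in this repository:
+
+```conf
+# /etc/systemd/system/stunnel@.service.d/override.conf (illustrative)
+[Service]
+# members of the stunnel4 group may write the pid file
+Group=stunnel4
+# systemd creates /run/stunnel4 with the mode below, so the group can write into it
+RuntimeDirectory=stunnel4
+RuntimeDirectoryMode=0770
+```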