Infrastructure: Difference between revisions

From Open Food Facts wiki
Latest revision as of 07:38, 7 August 2024


Welcome :-)
This page describes the hardware/software infrastructure for the Open Food Facts, Open Beauty Facts, Open Pet Food Facts and Open Products Facts projects, as well as all the complementary deployments.
This page is no longer updated, as we have moved to GitHub for planning, postmortems…
Please check our coordination repo: https://github.com/openfoodfacts/openfoodfacts-infrastructure
We also have an #infrastructure channel on Slack and a monthly video call, and we can arrange ad-hoc calls if you'd like to get started faster.

Get in touch

Slack channel: #infrastructure (https://openfoodfacts.slack.com/messages/C1FPYCWM7/)


Infrastructure planning

  • Hardware needs: Infrastructure Planning

Current infrastructure

off1

Dedicated free.org server:

Hardware configuration:

  • Dell 1U R440 server (1 Xeon Silver 4110: 8c/16t)
  • RAM: 128GB (DDR4@2666)
  • Disks:
    • 2x4 TB SATA HDD (hardware RAID1) : /srv
    • 2x14 TB SATA HDD (hardware RAID1) : /srv2 (images, imports)

Running services:

  • OS: debian 9.5
  • 1 nginx reverse proxy to serve static files
  • 1 apache + mod_perl to generate dynamic pages + API
    • running 4 separate instances on different ports: OFF, OBF, OPFF, OPF

off2

Dedicated free.org server:

Hardware configuration:

  • Dell 1U R440 server (1 Xeon Silver 4110: 8c/16t)
  • RAM: 128GB (DDR4@2666)
  • Disks:
    • 2x4 TB SATA HDD (hardware RAID1) : /srv
    • 2x14 TB SATA HDD (hardware RAID1) : /srv2 (images, imports)

Running services:

  • OS: debian 9.5
  • 1 apache + mod_perl (not used)
  • 1 MongoDB (production)
  • Robotoff (dockerized)

no name yet

Dedicated OVH - soyoustart server (rented):

Hardware configuration:

  • E3-SAT-1-32
  • Storage: software RAID, 3 x 2TB SATA
  • RAM: 32GB DDR3 1333 MHz
  • CPU: Xeon E3-1245v2 (4c/8t)

Running services:

  • wiki
  • zammad (soon)

dev

Dedicated OVH - soyoustart server (rented):

Hardware configuration:

  • Super Micro server (1 Xeon D-1520)
  • Storage: 4 x 2TB HDD, software RAID1 / /boot + LVM /srv /srv2
  • RAM: 32GB (DDR4@2666)
  • CPU: Xeon D-1520

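Running services:

  • OS: debian 9.8
  • 1 nginx reverse proxy
  • 1 apache + mod_perl (https://openfoodfacts.dev - auth: off / off)
  • 1 mongodb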

dev2

Dedicated server (self-hosted by cquest)

Hardware configuration:

  • Dell R710 server (2U)
  • Storage (ZFS pool) with:
    • 6 x 2TB HITACHI HUS72402 (SAS)
    • 500GB Samsung 860 SSD: OS and 256GB read cache (L2ARC)
    • 16GB Intel Optane SSD: write journal (ZIL)
  • RAM: 96GB (DDR3@1333)
  • CPU: 2 Xeon X5675 = 12 cores @ 3.07GHz

Running services:

  • OS: Proxmox 6 (debian 10 based)

Network:

  • uplink: optic fiber from OVH (1G down/500M up) with dedicated IPv4 and IPv6/72
  • ethernet 1Gbps (bonded for backup)
  • infiniband 40Gbps (proxmox cluster backbone + access to storage server)

backup

Shared storage server (self-hosted by cquest)

Hardware configuration:

  • Supermicro storage server (4U)
  • Storage (ZFS pool) with:
    • 36 SAS bays (disks are added over time as needs evolve)
    • 1TB Samsung 860 SSD: OS and 256GB read cache (L2ARC)
    • 16GB Intel Optane SSD: write journal (ZIL)
  • RAM: 128GB
  • CPU:

Running services:

  • OS: Proxmox 6 (debian 10 based)
  • Local NFS access from dev2 to /srv /srv2 backups (nightly rsync + ZFS snapshots)
  • Proxmox CT/VM replication

ovh1

VM/CT server for compute workloads.

See Infrastructure/VM cluster.

Hardware configuration:

  • CPU: AMD Epyc 7451 (24c/48t)
  • RAM: 256GB
  • 1 GB/s public bandwidth; 3 GB/s private bandwidth (10.0.0.1) on vRack
  • 2 x 1TB SSD NVMe

Partition / storage:

  • Ext4 root partition: 32GB on RAID1 /dev/md2
  • 512MB swap
  • 920GB ZFS pool on remaining space (mirror)

Running services:

  • OS: Proxmox 6 (based on debian 10 buster)

ovh2

VM/CT server for compute workloads (same as ovh1).

See Infrastructure/VM cluster.

Hardware configuration:

  • CPU: AMD Epyc 7451 (24c/48t)
  • RAM: 256GB
  • 1 GB/s public bandwidth; 3 GB/s private bandwidth (10.0.0.2) on vRack
  • 2 x 1TB SSD NVMe

Partition / storage:

  • Ext4 root partition: 32GB on RAID1 /dev/md2
  • 512MB swap
  • 920GB ZFS pool on remaining space (mirror)

Running services:

  • OS: Proxmox 6 (based on debian 10 buster)

ovh3

Mainly a storage server; can also host VMs/CTs if needed.

See Infrastructure/VM cluster.

Hardware configuration:

  • CPU: Intel Xeon-D 1541 (8c/16t)
  • RAM: 32GB
  • 1 GB/s public bandwidth; 3 GB/s private bandwidth (10.0.0.3)

Partition / storage:

  • Ext4 root partition: 32GB on SSD
  • 512MB swap
  • 5GB ZFS ZIL (write journal)
  • 472GB ZFS cache (L2ARC read cache)
  • ZFS storage (6 x 12TB HDD) in RAIDZ2 (2 disks for redundancy, around 42TB available)
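
As a rough check of the capacity figure above: RAIDZ2 reserves the equivalent of two disks for parity, so six 12TB disks leave (6 - 2) x 12 = 48TB before ZFS metadata and reserved space. That is about 43.7 TiB, in line with the "around 42TB available" quoted. A minimal sketch of that arithmetic (the function name is illustrative and not part of any ZFS tool):

```shell
#!/bin/sh
# Illustrative helper: raw data capacity of a RAIDZ vdev, ignoring ZFS overhead.
# $1 = number of disks, $2 = parity disks (2 for RAIDZ2), $3 = disk size in TB
raidz_data_tb() {
    echo $(( ($1 - $2) * $3 ))
}

raidz_data_tb 6 2 12   # 6 x 12TB in RAIDZ2, prints 48
```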

Running services:

  • OS: Proxmox 6 (based on debian 10 buster)

ovh4

Mainly a storage server; can also host VMs/CTs if needed.

See Infrastructure/VM cluster.

Hardware configuration:

  • CPU: Intel Xeon-D 1541 (8c/16t)
  • RAM: 32GB
  • 1 GB/s public bandwidth; 3 GB/s private bandwidth (10.0.0.4)

Partition / storage:

  • Ext4 root partition: 32GB on SSD
  • 512MB swap
  • 5GB ZFS ZIL (write journal)
  • 472GB ZFS cache (L2ARC read cache)
  • ZFS storage (6 x 12TB HDD) in RAIDZ2 (2 disks for redundancy, around 42TB available)

Running services:

  • OS: Proxmox 6 (based on debian 10 buster)

History

From 2012 to 2016, OFF and OBF ran on part of an OVH server (also used for other purposes):

  • cat /etc/debian_version -> 6.0.10 (squeeze)
  • 1 apache 2.4 set up as a reverse proxy to serve static files (manually built)
  • 1 apache 1.3 + mod_perl for OFF (manually built)
  • 1 apache 1.3 + mod_perl for OBF (manually built)
  • MongoDB 2.4.12 (installed from mongodb provided packages)

From 2016 to 2018, 1 dedicated OVH server running:

  • 1 nginx reverse proxy to serve static files (installed with apt-get)
  • 1 apache 2.4 + mod_perl (installed with apt-get)
    • running two separate instances on different ports, 1 for OFF and 1 for OBF
  • 1 MongoDB

2012-2016 server install log

OFF and OBF were hosted from 2012 to 2016 on a (now very old) OVH dedicated server that also hosts other projects. On June 13th 2016, a new dedicated server was ordered specifically for OFF and OBF.

Hardware

Server setup

Server configuration

  • uname -a
    • Linux ns3362784.ip-37-187-74.eu 3.14.32-xxxx-grs-ipv6-64 #7 SMP Wed Jan 27 18:05:09 CET 2016 x86_64 GNU/Linux
  • perl -v
    • This is perl 5, version 20, subversion 2 (v5.20.2) built for x86_64-linux-gnu-thread-multi

Basic configuration

  • apt-get update
  • apt-get upgrade
  • apt-get install fail2ban
  • apt-get install sudo
  • apt-get install build-essential
  • apt-get install git


Users

  • admin users with sudo access
  • off user

IP failover

One failover IP per service, so that we can easily switch servers:

  • OFF: 178.33.252.125
  • OBF: 178.33.104.169

Add to /etc/network/interfaces :


post-up /sbin/ifconfig eth0:0 178.33.104.169 netmask 255.255.255.255 broadcast 178.33.104.169
post-down /sbin/ifconfig eth0:0 down

post-up /sbin/ifconfig eth0:1 178.33.252.125 netmask 255.255.255.255 broadcast 178.33.252.125
post-down /sbin/ifconfig eth0:1 down

/etc/init.d/networking restart

  • [ ok ] Restarting networking (via systemctl): networking.service.
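
The two stanzas above follow the same pattern for each failover IP. A minimal sketch that prints them, in case more services are added later (the function name is made up; the interface name and IPs are the ones used above):

```shell
#!/bin/sh
# Hypothetical helper: print the post-up/post-down pair for one failover IP.
# $1 = interface (e.g. eth0), $2 = alias index, $3 = failover IP
failover_stanza() {
    printf 'post-up /sbin/ifconfig %s:%s %s netmask 255.255.255.255 broadcast %s\n' "$1" "$2" "$3" "$3"
    printf 'post-down /sbin/ifconfig %s:%s down\n' "$1" "$2"
}

failover_stanza eth0 0 178.33.104.169   # OBF
failover_stanza eth0 1 178.33.252.125   # OFF
```

The output is meant to be reviewed and pasted into /etc/network/interfaces by hand; the sketch does not modify any file.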

DNS

Product Opener needs a domain, with an A record for the domain itself and a wildcard A record for all subdomains.

  • openbeautyfacts.org. 0 A 178.33.104.169
  • *.openbeautyfacts.org. 0 A 178.33.104.169

For testing the new server, we will be using openfoodfacts.eu

Product Opener dependencies

exim
  • apt-get install exim4
  • dpkg-reconfigure exim4-config
    • Internet Site mail is sent by smtp
    • 127.0.0.1

MongoDB

See https://docs.mongodb.com/manual/tutorial/install-mongodb-on-debian/

apt-get install mongodb

MongoDB shell version: 2.4.10

service mongod stop
mv /var/lib/mongodb /home/mongodb

vi /etc/mongod.conf

#  dbPath: /var/lib/mongodb
  dbPath: /home/mongodb

service mongod start

Starts with some warnings:

mongo
MongoDB shell version: 3.2.7
connecting to: test
Server has startup warnings:
2016-06-13T19:34:08.245+0200 I CONTROL  [initandlisten]
2016-06-13T19:34:08.246+0200 I CONTROL  [initandlisten] ** WARNING: Cannot detect if NUMA interleaving is enabled. Failed to probe "/sys/devices/system/node/node1": Permission denied
2016-06-13T19:34:08.246+0200 W CONTROL  [initandlisten]
2016-06-13T19:34:08.246+0200 W CONTROL  [initandlisten] Failed to probe "/sys/kernel/mm/transparent_hugepage": Permission denied
2016-06-13T19:34:08.246+0200 W CONTROL  [initandlisten]
2016-06-13T19:34:08.246+0200 W CONTROL  [initandlisten] Failed to probe "/sys/kernel/mm/transparent_hugepage": Permission denied
2016-06-13T19:34:08.246+0200 I CONTROL  [initandlisten]


Apache / mod_perl and nginx

Apache 2 + mod_perl serve the dynamically generated HTML pages from Product Opener.

nginx is installed on port 80 as a reverse proxy. It serves the static files (images, JS, CSS etc.) and proxies the dynamic requests to the Apache server on another port.

apt-get install apache2

# stop apache in order to be able to install nginx (default port 80)

service apache2 stop

apt-get install nginx

nginx configuration
/etc/nginx/sites-available# more off
##
# You should look at the following URL's in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# https://wiki.nginx.org/Pitfalls
# https://wiki.nginx.org/QuickStart
# https://wiki.nginx.org/Configuration
#
# Generally, you will want to move this file somewhere, and start with a clean
# file but keep this around for reference. Or just disable in sites-enabled.
#
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.
##

# Default server configuration
#
server {
        #listen 80 default_server;
        #listen [::]:80 default_server;

        server_name openfoodfacts.org *.openfoodfacts.org openfoodfacts.eu *.openfoodfacts.eu;

        # SSL configuration
        #
        # listen 443 ssl default_server;
        # listen [::]:443 ssl default_server;
        #
        # Self signed certs generated by the ssl-cert package
        # Don't use them in a production server!
        #
        # include snippets/snakeoil.conf;

        root /home/off/html;

        access_log /home/off/logs/nginx.access2.log;
        error_log /home/off/logs/nginx.error2.log;

        gzip on;
        gzip_min_length 1000;


        # Add index.php to the list if you are using PHP
        index index.html index.htm index.nginx-debian.html;

        location ~* \.(eot|ttf|woff|woff2)$ {
                add_header Access-Control-Allow-Origin *;
        }

        location ~ ^/images/products/ {
                add_header Link "<http://creativecommons.org/licenses/by-sa/3.0/>; rel='license'; title='CC-BY-SA 3.0'";
        }

        location ~ ^/(favicon.ico) {
                # First attempt to serve request as file, then
                # as directory, then fall back to displaying a 404.
                try_files $uri $uri/ =404;
        }


        location ~ ^/(images|js|rss|data|files|resources|foundation)/ {
                # First attempt to serve request as file, then
                # as directory, then fall back to displaying a 404.
                try_files $uri $uri/ =404;
        }

        location = /robots.txt {
                try_files $uri $uri/ =404;
        }

        location / {
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header       X-Forwarded-For $proxy_add_x_forwarded_for;

                proxy_pass http://127.0.0.1:8001/cgi/display.pl?;
        }

        location /cgi/ {
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header       X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_pass http://127.0.0.1:8001;
        }

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #       deny all;
        #}
}


/etc/nginx/sites-enabled# ln -s /etc/nginx/sites-available/off off
/etc/nginx/sites-enabled# rm default

service nginx restart

To check for errors:

systemctl -l status nginx.service
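
To make the split between static files and dynamic pages concrete, here is a simplified model of how the location blocks in the configuration above route a request path. It is only a sketch: the function name and the example paths are made up, matching is case-sensitive, and nginx's real location matching is richer than a shell case statement.

```shell
#!/bin/sh
# Simplified, hypothetical model of the nginx "off" site routing shown above.
# Prints "static" when nginx serves the file itself from /home/off/html,
# or the Apache URL that the request is proxied to.
route_for() {
    case "$1" in
        /robots.txt)
            echo "static" ;;
        *.eot|*.ttf|*.woff|*.woff2)
            echo "static" ;;   # font files, served with a CORS header
        /favicon.ico*)
            echo "static" ;;
        /images/*|/js/*|/rss/*|/data/*|/files/*|/resources/*|/foundation/*)
            echo "static" ;;
        /cgi/*)
            echo "http://127.0.0.1:8001" ;;
        *)
            echo "http://127.0.0.1:8001/cgi/display.pl?" ;;
    esac
}

route_for /images/products/front.jpg
route_for /cgi/search.pl
route_for /product/example
```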

Apache configuration

Note: imagemagick + mod_perl seems to crash Apache with the worker / event MPMs; use prefork.

Disable the event MPM, and then:

/etc/apache2-off/mods-enabled# ln -s ../mods-available/mpm_prefork.conf mpm_prefork.conf
/etc/apache2-off/mods-enabled# ln -s ../mods-available/mpm_prefork.load mpm_prefork.load

Set the user to off

vi /etc/apache2/envvars

#export APACHE_RUN_USER=www-data
export APACHE_RUN_USER=off
#export APACHE_RUN_GROUP=www-data
export APACHE_RUN_GROUP=off

off.conf:

/etc/apache2/sites-available# cat off.conf
# LoadModule perl_module modules/mod_perl.so

PerlSwitches -I/home/off/lib

PerlWarn On
PerlRequire /home/off/lib/startup_apache2.pl


# log the X-Forwarded-For IP address (the client ip) in access_log
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" proxy

<Location /cgi>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlOptions +ParseHeaders
Options +ExecCGI
Require all granted
</Location>


<VirtualHost *>
DocumentRoot /home/off/html
ServerName openfoodfacts.org
ErrorLog /home/off/logs/error_log
CustomLog /home/off/logs/access_log combined
LogLevel warn
ScriptAlias /cgi/ "/home/off/cgi/"

<Directory /home/off/html>
Require all granted
</Directory>

</VirtualHost>

PerlPostReadRequestHandler My::ProxyRemoteAddr


/etc/apache2/sites-enabled# ls -lrt
total 0
lrwxrwxrwx 1 root root 35 Jun 13 22:12 000-default.conf -> ../sites-available/000-default.conf
/etc/apache2/sites-enabled# rm 000-default.conf
/etc/apache2/sites-enabled# ln -s ../sites-available/off.conf off.conf

Port 8001


/etc/apache2# vi ports.conf

#Listen 80
Listen 8001

service apache2 restart

To check for errors:

systemctl -l status apache2.service

mkdir /home/off/logs


Debugging:

If Apache does not start, check the apache2 logs (e.g. for apache2-obf) :

/var/log/apache2-obf# tail -f error.log
[Mon Jul 04 15:38:52.616689 2016] [perl:error] [pid 15197:tid 120373734786944] Can't locate /home/obf/cgi/startup_apache2.pl in @INC (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.20.2 /usr/local/share/perl/5.20.2 /usr/lib/x86_64-linux-gnu/perl5/5.20 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.20 /usr/share/perl/5.20 /usr/local/lib/site_perl . /etc/apache2-obf) at (eval 2) line 1.\n
[Mon Jul 04 15:38:52.616762 2016] [perl:error] [pid 15197:tid 120373734786944] Can't load Perl file: /home/obf/cgi/startup_apache2.pl for server ns3362784.ip-37-187-74.eu:0, exiting...

-> forgot to create startup_apache2.pl, going to copy the one from off and change the path in it.

More debugging:

If mod_perl starts, it loads some modules from cgi/startup_apache2.pl, and may fail if some modules are missing. Check /home/obf/logs/error_log:

[Mon Jul 04 18:31:11.501044 2016] [perl:error] [pid 23498:tid 116442525869952] Can't locate Math/Random/Secure.pm in @INC (you may need to install the Math::Random::Secure module)

Running a 2nd Apache instance for OBF

OFF and OBF share the same code (except some configuration in the Config.pm and SiteLang.pm modules), but the code is not designed to serve both sites from a single instance. So we need to run two separate instances of Apache, one for OFF and one for OBF. This also makes it possible to restart one without the other, to run different versions of the code, etc.

Note: we use the off user for both OFF and OBF so that OBF can read and write in /home/off/users

There is some support to run multiple Apache instances. Documentation is in /usr/share/doc/apache2/README.multiple-instances

sh /usr/share/doc/apache2/examples/setup-instance obf
Setting up /etc/apache2-obf ...
Setting up /etc/init.d/apache2-obf ...
Setting up symlinks: a2enmod-obf a2dismod-obf a2ensite-obf a2dissite-obf a2enconf-obf a2disconf-obf apache2ctl-obf
Setting up /etc/logrotate.d/apache2-obf and /var/log/apache2-obf ...

Configuration files for the new instance are in /etc/apache2-obf :

  • Copy and edit obf.conf:
    • /etc/apache2-obf/sites-available# cp off.conf obf.conf
    • vi obf.conf
      • :%s/off/obf/g
      • ServerName openbeautyfacts.org
  • /etc/apache2-obf/sites-enabled# ls -lrt off.conf
  • lrwxrwxrwx 1 root root 27 Jun 17 14:45 off.conf -> ../sites-available/off.conf
  • /etc/apache2-obf/sites-enabled# rm off.conf
  • /etc/apache2-obf/sites-enabled# ln -s ../sites-available/obf.conf obf.conf
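
The vi steps above can also be scripted. A minimal sketch using sed, assuming off.conf has the layout shown earlier (the function name is made up):

```shell
#!/bin/sh
# Hypothetical helper: derive the OBF site config from the OFF one.
# $1 = path to off.conf, $2 = path of the obf.conf to write
make_obf_conf() {
    # Same substitution as ":%s/off/obf/g" in vi, then set the OBF ServerName
    # (keeping any indentation in front of the directive).
    sed -e 's/off/obf/g' \
        -e 's/^\([[:space:]]*\)ServerName .*/\1ServerName openbeautyfacts.org/' "$1" > "$2"
}
```

For example, from /etc/apache2-obf/sites-available: make_obf_conf off.conf obf.conf. The result should still be reviewed by hand, since the substitution replaces every occurrence of "off" in the file.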

vi /etc/apache2-obf/envvars

#export APACHE_RUN_USER=www-data
export APACHE_RUN_USER=off
#export APACHE_RUN_GROUP=www-data
export APACHE_RUN_GROUP=off


Port 8002


/etc/apache2-obf# vi ports.conf

#Listen 80
Listen 8002

service apache2-obf start
Failed to start apache2-obf.service: Unit apache2-obf.service failed to load: No such file or directory.

Found solution in https://lists.debian.org/debian-apache/2016/03/msg00017.html

systemctl enable apache2-obf
Synchronizing state for apache2-obf.service with sysvinit using update-rc.d...
Executing /usr/sbin/update-rc.d apache2-obf defaults
Executing /usr/sbin/update-rc.d apache2-obf enable
systemctl start apache2-obf


To check for errors:

systemctl -l status apache2-obf.service

mkdir /home/obf/logs


Forwarding client IPs from nginx to Apache configuration

in nginx config (see the proxy_set_header lines in the location blocks above):

Apache:

install Apache2::Connection::XForwardedFor

Can't find the mod_perl include dir (reason: path /usr/include/apache2 doesn't exist)

-> just mkdir /usr/include/apache2 and reinstall module

Product Opener

Libraries
  • apt-get install zlib1g-dev

GeoIP update

The packaged geoip database installed through "apt-get install geoip-database" is very old; a newer one needs to be installed manually:

root@ns3362784:/home/obf/logs# apt-get install geoip-database
Reading package lists... Done
Building dependency tree
Reading state information... Done
geoip-database is already the newest version.
The following packages were automatically installed and are no longer required:
  libasan0 libboost-dev libboost-filesystem1.55.0
  libboost-program-options1.55.0 libboost-system1.55.0 libboost-thread1.55.0
  libboost1.55-dev libgcc-4.8-dev libgoogle-perftools4 libpcrecpp0 libsnappy1
  libstdc++-4.8-dev libtcmalloc-minimal4 libunwind8 libv8-3.14.5
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@ns3362784:/home/obf/logs# cd /usr/share/GeoIP/
root@ns3362784:/usr/share/GeoIP# ls -lrt
total 4900
-rw-r--r-- 1 root root 1061883 Mar 17  2015 GeoIP.dat
-rw-r--r-- 1 root root 3949299 Mar 17  2015 GeoIPv6.dat
root@ns3362784:/usr/share/GeoIP# wget https://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz
--2016-07-15 00:16:08--  https://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz
Resolving geolite.maxmind.com (geolite.maxmind.com)... 2400:cb00:2048:1::6810:262f, 2400:cb00:2048:1::6810:252f, 104.16.38.47, ...
Connecting to geolite.maxmind.com (geolite.maxmind.com)|2400:cb00:2048:1::6810:262f|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 517602 (505K) [application/octet-stream]
Saving to: ‘GeoIP.dat.gz’

GeoIP.dat.gz        100%[=====================>] 505.47K  --.-KB/s   in 0.06s

2016-07-15 00:16:08 (7.76 MB/s) - ‘GeoIP.dat.gz’ saved [517602/517602]

root@ns3362784:/usr/share/GeoIP# gzip -d GeoIP.dat.gz
gzip: GeoIP.dat already exists; do you wish to overwrite (y or n)? y
root@ns3362784:/usr/share/GeoIP# ls -lrt
total 4744
-rw-r--r-- 1 root root 3949299 Mar 17  2015 GeoIPv6.dat
-rw-r--r-- 1 root root  904073 Jul  6 19:50 GeoIP.dat

Make a symbolic link (Perl module seems to think the database is at /usr/local/share/GeoIP/GeoIP.dat )

ln -s /usr/share/GeoIP /usr/local/share/GeoIP

Perl modules

apt-get install libwww-perl libimage-magick-perl libxml-encoding-perl libtext-unaccent-perl libmime-lite-perl libcache-memcached-fast-perl libjson-perl libclone-perl libgraphviz-perl libmime-lite-perl libcrypt-passwdmd5-perl libencode-detect-perl libgraphics-color-perl libbarcode-zbar-perl libxml-feedpp-perl libmongodb-perl liburi-find-perl libxml-simple-perl

Some modules seem not to have Debian packages and must be built using CPAN:

cpan
install URI::Escape::XS
install Encode::Punycode
install GraphViz2
install HTML::Defang
install Algorithm::CheckDigits
install Geo::IP
install Image::OCR::Tesseract
install DateTime::Format::Mail
install DateTime::Format::CLDR
install DateTime::Locale

Symbolic links in lib directory

ls -lrt |grep -- "->"

Make sure all links are pointing to the right path.

/home/off/lib/ProductOpener# ln -s SiteLang_off.pm SiteLang.pm
/home/off/lib/ProductOpener# ln -s Config_off.pm Config.pm

/home/off/lib/ProductOpener# cp Config2_sample.pm Config2.pm
-> put right values for server domain, home path, and mongodb database name

/home/off/scripts# ln -s ../lib/ProductOpener ProductOpener
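
To check that all links point to the right path, one option is to list the links whose target does not exist. A minimal sketch (the function name is made up):

```shell
#!/bin/sh
# Hypothetical helper: list symbolic links directly under directory $1
# whose target does not exist.
broken_links() {
    for entry in "$1"/*; do
        # -L: entry is a symlink; ! -e: the target it resolves to is missing
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

For example, broken_links /home/off/lib/ProductOpener prints nothing when SiteLang.pm and Config.pm both resolve. A link that resolves to the wrong file (for instance Config.pm pointing at another site's config) is not detected, so the ls output above is still worth reading.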

robots.txt

Since we will run a copy of OFF on a separate domain, add a line to forbid robots completely.

/home/off/html# vi robots.txt
User-agent: *
Disallow: /
Disallow: /cgi
Disallow: /code
~


Data
  • Copy data to new server
    • Populate Mongodb by running /home/off/cgi/update_all_products_from_dir_in_mongodb.pl
  • Copy images
  • Make sure permissions are correct
    • chown -R off deleted* html index ingredients lang invalid lists products users tmp


MongoDB indexes

Some basic indexes; these need to be revisited, see https://github.com/openfoodfacts/openfoodfacts-server/issues/341

mongo
use off


db.products.createIndex( { code : 1 }, { background : true });
db.products.createIndex( { unique_scans_n : -1 }, { background : true });
db.products.createIndex( { sortkey : -1 }, { background : true });
db.products.createIndex( { last_modified_t : -1 }, { background : true });

db.products.createIndex( { created_t : 1 }, { background : true });
db.products.createIndex( { lc : 1 }, { background : true });

db.products.createIndex( { _keywords : 1, last_modified_t : -1 }, { background : true });
db.products.createIndex( { _keywords : 1, sortkey : -1 }, { background : true });


db.products.createIndex( { countries_tags : 1 , last_modified_t : -1 }, { background : true });

db.products.createIndex( { countries_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { brands_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { categories_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { labels_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { packaging_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { origins_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { manufacturing_places_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { emb_codes_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { cities_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { ingredients_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { ingredients_from_palm_oil_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { vitamins_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { minerals_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { amino_acids_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { nucleotides_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { other_nutritional_substances_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { allergens_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { traces_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { nova_groups_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { nutrition_grades_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { misc_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { languages_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { users_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { editors_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { informers_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { correctors_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { checkers_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { photographers_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { states_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { entry_dates_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { last_edit_dates_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { purchase_places_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { stores_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { codes_tags : 1 , sortkey : -1 }, { background : true });
db.products.createIndex( { debug_tags : 1 , sortkey : -1 }, { background : true });
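
Most of the statements above share one shape: a tag field in ascending order combined with sortkey in descending order. A minimal sketch that generates such statements from a list of field names, so the list is easier to maintain (the function name is made up; the statement text is the one used above):

```shell
#!/bin/sh
# Hypothetical helper: print one compound-index statement per tag field,
# in the same form as the list above.
tag_index_statements() {
    for field in "$@"; do
        printf 'db.products.createIndex( { %s : 1 , sortkey : -1 }, { background : true });\n' "$field"
    done
}

# A few of the fields from the list above.
tag_index_statements brands_tags categories_tags labels_tags
```

The output can be reviewed and then pasted into the mongo shell after "use off".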




Product Opener debug

Once Apache starts:

/home/off/logs# tail -f error_log

[Fri Jun 17 14:51:34.123696 2016] [perl:error] [pid 18764:tid 117837813839616] [client 127.0.0.1:35749] Can't locate object method "remote_ip" via package "Apache2::Connection" at /home/off/cgi/startup.pl line 76.\n, referer: http://openfoodfacts.eu/

-> created startup_apache2.pl, loaded in off.conf


Generate CSS and download dependencies
/srv/off# yarn install
/srv/off# yarn run build

The commands above might fail if the correct versions of yarn and nodejs are not installed. To fix:

apt remove cmdtest
apt remove yarn
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
apt-get update
apt-get install yarn
curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
apt-get install -y nodejs
yarn install
yarn run build

Cron jobs

User root:


An FTP client is needed for the OVH FTP backup space (100G):

  • apt-get install ftp

install Filesys::DiskFree


User obf:

  • 35 * * * * /home/obf/scripts/gen_feeds.sh > /dev/null
  • 15 4 * * * /home/obf/scripts/gen_feeds_daily.sh > /dev/null