
My server setup notes

Discussion in 'Linux / BSD / Mac OS X' started by W1zzard, Aug 9, 2014.

  1. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    Dumping them here, in case some random internet person finds them via Google:

    This is for CentOS 7 + Docker + GlusterFS + Pacemaker. We are running all our other services inside Docker containers that are managed via Pacemaker.

    Code:
    # install mirror: centos.mirror.constant.com/7/os/x86_64/
    
    yum -y remove audit iprutils i*firmware libertas-*-firmware
    rpm -e alsa-tools-firmware alsa-firmware aic94xx-firmware fxload
    rpm -e postfix
    
    rpm -i http://mirror.de.leaseweb.net/epel/beta/7/x86_64/epel-release-7-0.2.noarch.rpm
    yum -y update
    yum -y install chrony tar telnet mc nano wget psmisc sysstat iftop iotop screen bind-utils net-tools xfsprogs traceroute tcpdump rsync mysql bash-completion php-cli iptraf hdparm strace
    yum -y install docker kvm qemu-kvm libvirt virt-clone pacemaker pcs
    systemctl enable docker
    echo 'DOCKER_OPTS="-r=false"' > /etc/sysconfig/docker
    
    sed -i -e"s/SELINUX=enforcing$/SELINUX=disabled/" /etc/selinux/config
    
    echo "net.ipv4.conf.all.arp_ignore=1" >> /etc/sysctl.conf
    echo "net.ipv4.ip_nonlocal_bind=1" >> /etc/sysctl.conf
    echo "net.netfilter.nf_conntrack_max=10000000" >> /etc/sysctl.conf
    echo "net.netfilter.nf_conntrack_tcp_timeout_established=7875" >> /etc/sysctl.conf
    echo "net.core.netdev_max_backlog=65535" >> /etc/sysctl.conf
    echo "net.ipv4.ip_local_port_range=1024 65535" >> /etc/sysctl.conf
    
    echo "/swapfile none swap defaults 0 0" >> /etc/fstab
    dd if=/dev/zero of=/swapfile bs=1M count=1024
    chmod 600 /swapfile
    mkswap /swapfile
    swapon -a
    
    echo "password" | passwd --stdin hacluster
    
    yum -y install iptables-services
    cat <<EOF > /etc/sysconfig/iptables
    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    -A INPUT -p icmp -j ACCEPT
    -A INPUT -i lo -j ACCEPT
    
    # Always allow internal traffic
    -A INPUT -i br0 -j ACCEPT
    -A INPUT -i eth0 -j ACCEPT
    
    # Docker images
    -A INPUT -i br1 -m conntrack --ctstate NEW -d tpuadsrv-vip-ext -m tcp -p tcp --dport 80 -j ACCEPT
    -A INPUT -i br1 -m conntrack --ctstate NEW -d tpuwww-vip-ext -m tcp -p tcp --dport 80 -j ACCEPT
    -A INPUT -i br1 -m conntrack --ctstate NEW -d tpucdn-vip-ext -m tcp -p tcp --dport 80 -j ACCEPT
    
    # This host
    -A INPUT -i br1 -m conntrack --ctstate NEW -d 108.61.17.98 -m tcp -p tcp --dport 22 -j ACCEPT
    
    -A INPUT -i br1 -j REJECT --reject-with icmp-host-prohibited
    
    # Need ACCEPT for virtual interfaces
    -A INPUT -j ACCEPT
    
    -A FORWARD -j REJECT --reject-with icmp-host-prohibited
    COMMIT
    EOF
    systemctl enable iptables
    yum -C -y remove firewalld --setopt="clean_requirements_on_remove=1"
    
    yum -C -y remove authconfig --setopt="clean_requirements_on_remove=1"
    
    yum -y install exim
    perl -i -pe 'BEGIN{undef $/;} s/(daemon_smtp_ports =)/local_interfaces = 127.0.0.1.25\n$1/smg' /etc/exim/exim.conf
    perl -i -pe 'BEGIN{undef $/;} s/(begin routers\s+).*?(begin)/$1tpumail:\n  driver = manualroute\n  transport = remote_msa\n  route_list = * mail.techpowerup.com\n\n$2/smg' /etc/exim/exim.conf
    perl -i -pe 'BEGIN{undef $/;} s/(begin authenticators\s+)(.*?begin)/$1tpumail_login:\n  driver = plaintext\n  public_name = LOGIN\n  hide client_send = : servers\@techpowerup.com : password\n\n$2/smg' /etc/exim/exim.conf
    chmod 600 /etc/exim/exim.conf
    
    nano /etc/default/grub
    # remove "rhgb quiet" from GRUB_CMDLINE_LINUX
    # add "consoleblank=0 net.ifnames=0"
    grub2-mkconfig -o /boot/grub2/grub.cfg
    
    yum -y autoremove NetworkManager
    
    yum -y install rsyslog
    cat <<END > /etc/rsyslog.conf
    \$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
    \$ModLoad imjournal # provides access to the systemd journal
    \$ModLoad imklog  # provides kernel logging support (previously done by rklogd)
    
    \$WorkDirectory /var/lib/rsyslog
    \$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
    
    \$OmitLocalLogging on
    
    \$IMJournalStateFile imjournal.state
    
    \$ActionQueueFileName fwdRule1 # unique name prefix for spool files
    \$ActionQueueMaxDiskSpace 1g  # 1gb space limit (use as much as possible)
    \$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
    \$ActionQueueType LinkedList  # run asynchronously
    \$ActionResumeRetryCount -1  # infinite retries if host is down
    
    *.* @@logserver-vip
    END
    systemctl start rsyslog
    systemctl enable rsyslog
    
    mkdir -p /var/log/journal
    
    systemctl enable dnsmasq
    systemctl start dnsmasq
    
    # setup network interfaces
    
    # reboot
    # remove old kernel
    
    scp 10.0.2.0:/root/.ssh/authorized_keys ~/.ssh/authorized_keys
    scp 10.0.2.0:/root/.ssh/id_rsa ~/.ssh/id_rsa
    scp 10.0.2.0:/root/.ssh/id_rsa.pub ~/.ssh/id_rsa.pub
    
    scp 10.0.2.0:/etc/hosts /etc/hosts
    
    rm -rf /etc/audit/ /etc/firewalld/ /etc/NetworkManager/ /var/lib/NetworkManager/ /var/log/audit/ /var/log/messages /var/log/maillog /var/lib/postfix/ /var/spool/postfix/
    
    scp node2:/etc/corosync/authkey /etc/corosync/authkey
    scp node2:/etc/corosync/corosync.conf /etc/corosync/corosync.conf
    
    systemctl enable corosync pacemaker pcsd
    systemctl restart corosync pacemaker pcsd
    
    pcs cluster auth
    
    pcs cluster setup --name cluster node1 node2
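    # (addition, hedged) on CentOS 7 / pcs 0.9 the cluster usually still needs to be started,
    # and STONITH disabled if no fencing devices are configured:
    pcs cluster start --all
    pcs property set stonith-enabled=false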
    
    cd /etc/yum.repos.d/
    wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
    
    yum -y install glusterfs-server attr
    systemctl enable glusterd
    systemctl start glusterd
    
    ## replace glusterfs node
    # on the other node: grep node3 /var/lib/glusterd/peers/*
    
    echo UUID=1d4bbd3c-85e2-4661-b41d-4db27ad7633b>/var/lib/glusterd/glusterd.info
    systemctl stop glusterd
    gluster peer status
    gluster peer probe node1
    gluster volume sync node1
    systemctl restart glusterfsd
    
    ## new node
    
    mkfs.xfs /dev/sdb1
    mkdir /mnt/ssd
    echo "/dev/sdb1 /mnt/ssd xfs noatime,discard 1 2" >> /etc/fstab
    mount -a
    
    mkfs.xfs -i size=512 /dev/sda3
    mkdir /mnt/sda3
    echo "/dev/sda3 /mnt/sda3 xfs defaults 1 2" >> /etc/fstab
    mount -a
    mkdir /mnt/sda3/gv0
    
    
    mkdir /storage
    echo "localhost:/gv0 /storage glusterfs defaults,_netdev 0 0" >> /etc/fstab
    mount -a
    
    /bin/cp /storage/dockerfiles/docker-enter /usr/local/sbin
    
    gluster volume create gv0 replica 3 node1:/mnt/sda3/gv0 node2:/mnt/sda3/gv0 node3:/mnt/sda3/gv0
    gluster volume start gv0
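    # (addition) sanity check: all three bricks should show as started and online
    gluster volume info gv0
    gluster volume status gv0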
    
    setfattr -x trusted.glusterfs.volume-id /mnt/sda3/gv0
    setfattr -x trusted.gfid /mnt/sda3/gv0
    rm -rf /mnt/sda3/gv0/.glusterfs
    
    pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=108.61.17.99 cidr_netmask=32 op monitor interval=30s
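    
    # The intro says the other services run in Docker containers managed via Pacemaker;
    # the notes above don't show how they are wired up. A hedged sketch using the
    # ocf:heartbeat:docker resource agent (one possible approach, not necessarily what
    # is actually used here; image, name and ports are made up):
    pcs resource create tpuwww-container ocf:heartbeat:docker image=tpu/www name=tpuwww run_opts="-p 80:80" op monitor interval=30s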
    
     
    Last edited: Aug 10, 2014
    digibucc, McSteel, AsRock and 4 others say thanks.
  2. silentbogo

    silentbogo

    Joined:
    Nov 20, 2013
    Messages:
    153 (0.50/day)
    Thanks Received:
    46
    Thx, W1zz!
     
  3. Easy Rhino

    Easy Rhino Linux Advocate

    Joined:
    Nov 13, 2006
    Messages:
    13,423 (4.68/day)
    Thanks Received:
    3,240
    i was going to ask why not run gluster natively on centos and then read that there is no support for centos 7 and they provide a docker for it. crazy days we live in.
     
  4. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    uhm? we are running glusterfs natively on our servers, on centos7

    repo is here: http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/

    glusterfs works extremely well and is super robust. really love it. like all cluster filesystems it's slow, especially for web loads (small files), so avoid extra stat() calls by using php opcache with opcache.revalidate_freq and put temporary files on local hdd/ssd/tmpfs
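
    For reference, a minimal sketch of the OPcache settings that advice points at (the file path and the 60-second value are assumptions, not the actual TPU config):

    Code:
    cat <<EOF > /etc/php.d/opcache-tuning.ini
    ; cache compiled scripts and only re-stat() them every 60 seconds
    opcache.enable=1
    opcache.validate_timestamps=1
    opcache.revalidate_freq=60
    EOF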
     
    Last edited: Aug 10, 2014
    cadaveca says thanks.
  5. Easy Rhino

    Easy Rhino Linux Advocate

    Joined:
    Nov 13, 2006
    Messages:
    13,423 (4.68/day)
    Thanks Received:
    3,240
    oh i see. i read other people just using docker to install and run glusterfs rather than a third party repo.
     
  6. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    since gluster has to be up 24/7 and on all our servers i chose to not put it inside docker

    Our docker containers:
    [screenshot: list of our Docker containers]
     
    Easy Rhino says thanks.
  7. Easy Rhino

    Easy Rhino Linux Advocate

    Joined:
    Nov 13, 2006
    Messages:
    13,423 (4.68/day)
    Thanks Received:
    3,240
    that is just pure win
     
  8. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    Note to self: no matter how often you do the dry run and think you got your method right, always double- and triple-check the results.

    In the final move I forgot to convert our databases to InnoDB, so no Galera replication happened. When I rebooted the primary DB node earlier today, another node took over, one that had not seen any DB updates since Saturday...
     
  9. Easy Rhino

    Easy Rhino Linux Advocate

    Joined:
    Nov 13, 2006
    Messages:
    13,423 (4.68/day)
    Thanks Received:
    3,240
    doh! i always write down (copy/paste) the commands i use so when i do it in production there is no question.
     
  10. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    So did I, except for the database move, which was a bit tricky ..
    1. dump the whole db
    2. fix the dump to not overwrite the mysql table
    3. fix the dump to create InnoDB instead of MyISAM
    4. load the dump
    5. sync all db servers

    somehow i forgot to do step 3 in the final run (did it in all test-runs)
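
    A rough sketch of what steps 1-4 could look like (database names are made up; the actual commands used for the move aren't in this thread):

    Code:
    # 1+2: dump only the application databases, leaving the mysql system DB alone
    mysqldump --databases techpowerup_www techpowerup_forums > /root/move.sql
    # 3: the step that got skipped in the final run
    sed -i 's/ENGINE=MyISAM/ENGINE=InnoDB/g' /root/move.sql
    # 4: load the dump; Galera then replicates it to the other nodes (step 5)
    mysql < /root/move.sql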
     
  11. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    Note to self: don't change a MEMORY table to InnoDB to get it replicated while dozens of inserts and deletes are running on it

    Code:
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] mysqld: Can't find record in 'session_log'
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table techpowerup_ads.session_log; Can't find record in 'session_log', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1094, Internal MariaDB error code: 1032
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: RBR event 2 Delete_rows_v1 apply warning: 120, 10694450
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: Failed to apply app buffer: seqno: 10694450, status: 1
    Aug 14 21:15:09 node1 mysqld: #011 at galera/src/trx_handle.cpp:apply():340
    Aug 14 21:15:09 node1 mysqld: Retrying 2th time
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] mysqld: Can't find record in 'session_log'
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table techpowerup_ads.session_log; Can't find record in 'session_log', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1094, Internal MariaDB error code: 1032
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: RBR event 2 Delete_rows_v1 apply warning: 120, 10694450
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: Failed to apply app buffer: seqno: 10694450, status: 1
    Aug 14 21:15:09 node1 mysqld: #011 at galera/src/trx_handle.cpp:apply():340
    Aug 14 21:15:09 node1 mysqld: Retrying 3th time
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] mysqld: Can't find record in 'session_log'
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table techpowerup_ads.session_log; Can't find record in 'session_log', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1094, Internal MariaDB error code: 1032
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: RBR event 2 Delete_rows_v1 apply warning: 120, 10694450
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: Failed to apply app buffer: seqno: 10694450, status: 1
    Aug 14 21:15:09 node1 mysqld: #011 at galera/src/trx_handle.cpp:apply():340
    Aug 14 21:15:09 node1 mysqld: Retrying 4th time
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] mysqld: Can't find record in 'session_log'
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table techpowerup_ads.session_log; Can't find record in 'session_log', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1094, Internal MariaDB error code: 1032
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [Warning] WSREP: RBR event 2 Delete_rows_v1 apply warning: 120, 10694450
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] WSREP: Failed to apply trx: source: 5508b395-23c8-11e4-9945-bf10217e983b version: 3 local: 0 state: APPLYING flags: 1 conn_id: 1080867 trx_id: 51654636 seqnos (l: 1102251, g: 10694450, s: 10694449, d: 10694356, ts: 17218294333149)
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] WSREP: Failed to apply trx 10694450 4 times
    Aug 14 21:15:09 node1 mysqld: 140814 21:15:00 [ERROR] WSREP: Node consistency compromized, aborting...
    
    kabooom!
     
  12. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    Hmmm... maybe a stupid question - this Gluster file system - are you using it to pool all types of internal storage (NAS, HDD, whatever),
    or can it also be used to pool cloud-based or remote storage?
     
    Crunching for Team TPU
  13. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    No, it's just to share files locally, basically to replace NFS (which is kinda impossible to scale to multiple active servers)

    http://blog.gluster.org/category/geo-replication/
    It can do geo replication, not sure how well that works and how slow it is
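
    For reference, the geo-replication commands look roughly like this (volume name and remote host are made up, and this skips the ssh key setup it needs):

    Code:
    gluster volume geo-replication gv0 backuphost::gv0-dr create push-pem
    gluster volume geo-replication gv0 backuphost::gv0-dr start
    gluster volume geo-replication gv0 backuphost::gv0-dr status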
     
    Last edited: Aug 14, 2014
  14. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    Oops, oh I see - locally - for internal dev.

    EDIT: From http://www.gluster.org/documentation/About_Gluster/

    "GlusterFS is an open source, distributed file system capable of scaling to several petabytes (actually, 72 brontobytes!)"

    Jesus - 72 brontobytes - it has more zeros in it than I have hairs.......
     
    Last edited: Aug 14, 2014
    Crunching for Team TPU
  15. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    we use it to share the web data like php scripts, images, but everything has a caching layer in front because glusterfs is quite slow

    Edit: GlusterFS is incredibly robust and its self-heal works better than anything I've ever seen.
     
    Last edited: Aug 14, 2014
  16. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    But why use GlusterFS if it is slow?? Because of scalability / robustness / cost? Why not use NFS or SAN?
     
    Crunching for Team TPU
  17. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    NFS doesn't work with multiple write-active servers, SAN is too expensive and even slower.

    For large sequential files, GlusterFS works really well and is as fast as your network or local storage (just tested 100 MB/s from HDD, same as local). The problem is small files like web scripts.
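
    One way to reproduce that kind of sequential test through the gluster mount (assuming it's mounted at /storage; not necessarily how it was measured here):

    Code:
    # write 1 GB sequentially and force it to disk at the end
    dd if=/dev/zero of=/storage/ddtest bs=1M count=1024 conv=fsync
    rm /storage/ddtest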
     
  18. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    Have you tried FreeNAS? The ZFS filesystem is very robust, and on the right hardware, configured right, it is very fast (~100 MB/s).
    We are using it for offloading our VMware backups.
     
    Crunching for Team TPU
  19. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    ZFS is not a distributed filesystem as far as I know. Also, there's no ZFS for Linux (unless hacked in).

    GlusterFS is the best solution for our use case. What happens if you pull the plug of your ZFS server? With GlusterFS the other GlusterFS servers in the cluster will just continue working, the clients will never notice that a plug was ever pulled, they can continue reading and writing. Once the stopped machine comes back up, it will rejoin the cluster, self-heal and magically just work.

    For backups, ZFS is a good choice. How often do you scrub your disks? Are you using deduplication? Online or offline dedup?
     
    Last edited: Aug 14, 2014
  20. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    Availability is the key - okay, and yes ZFS is not a distributed filesystem.

    We scrub the volumes every 7 days (due to our production cycle)

    We have tested deduplication, but found it too resource intensive. The newest ZFS versions compress very well, so we use compression instead of dedupe.
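
    For reference, the knobs being talked about (assuming a pool called "tank"; not the actual setup):

    Code:
    zfs set compression=lz4 tank/backups   # cheap inline compression
    zfs set dedup=off tank/backups         # dedup disabled, it needs a lot of RAM
    zpool scrub tank                       # run periodically, e.g. every 7 days from cron
    zpool status tank                      # check scrub progress and errors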
     
    Crunching for Team TPU
  21. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    o_O they have DKMS packages for ZFS now?! yay. No use for those new web servers, but our EU file backup box could definitely use it.

    Oh, and if you have to backup lots of very similar small files each day, look into using rsnapshot. It's low-tech but works extremely well to conserve space. We've been using it for over 2 years in production.
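
    A minimal rsnapshot sketch for anyone finding this later (paths and retention counts are made up, and fields in the real rsnapshot.conf must be separated by tabs):

    Code:
    # /etc/rsnapshot.conf (excerpt)
    snapshot_root   /backup/snapshots/
    interval        daily   7
    interval        weekly  4
    backup          root@webserver:/var/www/    webserver/

    # cron.d-style entry: rotate the daily snapshots at 03:00
    0 3 * * *  root  /usr/bin/rsnapshot daily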
     
  22. VulkanBros

    VulkanBros

    Joined:
    Jan 31, 2005
    Messages:
    1,341 (0.38/day)
    Thanks Received:
    280
    Location:
    The Pico Mundo Grill
    Thanks - testing rsnapshot right now! Another possibility is to use OwnCloud in conjunction with FreeNAS......
     
    Crunching for Team TPU
  23. W1zzard

    W1zzard Administrator Staff Member

    Joined:
    May 14, 2004
    Messages:
    14,887 (3.93/day)
    Thanks Received:
    11,639
    [diagram: updated infrastructure layout]

    Infrastructure changed to have a www frontend (haproxy) which forwards traffic internally between clones, for faster failover in case of node failure. Preliminary tests suggest 0-2 seconds of downtime. The frontend can move freely to another node if it fails on node1.

    fear my army of clones :)
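
    For the curious, a minimal sketch of such a haproxy frontend/backend pair (names, ports and addresses are made up, not the actual TPU config):

    Code:
    cat <<EOF > /etc/haproxy/haproxy.cfg
    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend www
        bind *:80
        default_backend web_clones

    backend web_clones
        balance roundrobin
        server node1 10.0.2.1:8080 check
        server node2 10.0.2.2:8080 check
    EOF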
     
    cadaveca says thanks.
