Proxmox VE Cluster

A Proxmox VE cluster runs KVM virtual machines and LXC containers with high availability, Ceph and ZFS storage, and backups through Proxmox Backup Server (PBS) for production virtualization.

Feature     Proxmox VE                           VMware vSphere      Hyper-V
License     Free (AGPL) + optional subscription  Paid ($$$)          Included with Windows Server
Hypervisor  KVM + LXC                            ESXi                Hyper-V
Web UI      Built-in (port 8006)                 vCenter (separate)  Windows Admin Center
HA          Built-in (3+ nodes)                  vSphere HA          Failover Clustering
Storage     ZFS, Ceph, LVM, NFS                  VMFS, vSAN, NFS     CSV, SMB, iSCSI
Backup      PBS (free)                           Separate product    Windows Backup

Cluster Setup

# === Proxmox Cluster Setup ===

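# Node 1: create the cluster
# pvecm create my-cluster
# pvecm status

# Node 2: join by pointing at an existing member (node 1's IP)
# pvecm add 192.168.1.101
# pvecm status

# Node 3: join the same way
# pvecm add 192.168.1.101
# pvecm status

# Verify cluster membership and quorum
# pvecm nodes
# pvecm status
# Note: "pvecm expected <votes>" overrides the expected vote count.
# It is a recovery tool for regaining quorum, not a verification step.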

# Network configuration (/etc/network/interfaces)
# auto vmbr0
# iface vmbr0 inet static
#     address 192.168.1.101/24
#     gateway 192.168.1.1
#     bridge-ports eno1
#     bridge-stp off
#
# auto vmbr1
# iface vmbr1 inet static
#     address 10.10.10.101/24
#     bridge-ports eno2
#     bridge-stp off
#     # Cluster/Ceph network (10GbE)

# Corosync config (/etc/pve/corosync.conf), totem section only;
# the real file also contains nodelist and quorum sections
# totem {
#     version: 2
#     cluster_name: my-cluster
#     transport: knet
#     interface {
#         linknumber: 0
#     }
# }

from dataclasses import dataclass

@dataclass
class ClusterRequirement:
    component: str
    minimum: str
    recommended: str
    purpose: str

requirements = [
    ClusterRequirement("Nodes", "3 (for quorum)",
        "3-5 nodes, odd number",
        "Quorum voting, HA failover"),
    ClusterRequirement("CPU", "4 cores per node",
        "16-32 cores (Xeon/EPYC)",
        "VM/Container processing"),
    ClusterRequirement("RAM", "16 GB per node",
        "64-256 GB ECC RAM",
        "VM memory + ZFS ARC cache"),
    ClusterRequirement("Storage", "SSD 256 GB",
        "NVMe 1-2TB + HDD for bulk",
        "VM disks, Ceph OSD, ZFS pool"),
    ClusterRequirement("Network", "1GbE × 2",
        "10GbE × 2 (management + storage)",
        "Cluster comms, Ceph replication, migration"),
    ClusterRequirement("UPS", "Optional",
        "UPS + NUT monitoring",
        "Prevents data loss from power outages"),
]

print("=== Cluster Requirements ===")
for r in requirements:
    print(f"  [{r.component}] Min: {r.minimum}")
    print(f"    Recommended: {r.recommended}")
    print(f"    Purpose: {r.purpose}")
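The "3 (for quorum)" minimum and the "odd number" recommendation above follow from majority voting. The sketch below assumes one vote per node and no external quorum device; the function names are illustrative and not part of any Proxmox tool.

```python
def quorum_votes(nodes: int) -> int:
    """Votes needed for a strict majority, assuming one vote per node."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can fail while the survivors still hold a majority."""
    return nodes - quorum_votes(nodes)


for n in range(2, 6):
    print(f"{n} nodes: quorum={quorum_votes(n)}, "
          f"can lose {tolerated_failures(n)}")
```

Under these assumptions a 4-node cluster tolerates one failure, the same as a 3-node cluster, which is why an odd node count is the usual recommendation.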

Storage and Backup

# === Storage Configuration ===

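# ZFS pool creation (-f overwrites existing data; adjust pool and device names)
# zpool create -f rpool mirror /dev/sda /dev/sdb
# zfs set compression=lz4 rpool
# zfs set atime=off rpool
# zfs create rpool/data    # the dataset must exist before it is added as storage
# pvesm add zfspool local-zfs -pool rpool/data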

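# Ceph setup
# pveceph install                           # on each node
# pveceph init --network 10.10.10.0/24      # once, on the first node
# pveceph mon create                        # on each monitor node
# pveceph osd create /dev/sdc               # on each node, once per data disk
# pveceph osd create /dev/sdd
# pveceph pool create vm-pool --pg_num 128  # once for the cluster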

# Backup with PBS (run the first two commands on the backup server)
# apt install proxmox-backup-server
# proxmox-backup-manager datastore create backups /mnt/backup
# # In PVE: Datacenter → Storage → Add → Proxmox Backup Server
# # Schedule: Datacenter → Backup → Add (daily job); vzdump itself has no schedule flag
# # Manual run: vzdump --all --mode snapshot --storage pbs

from dataclasses import dataclass

@dataclass
class StorageOption:
    storage: str
    type_: str
    performance: str
    ha_ready: str
    best_for: str
    cost: str

storages = [
    StorageOption("ZFS (local)",
        "Local, Mirror/RAIDZ", "Very high (NVMe)",
        "Use replication for HA",
        "Homelab, Small Cluster 2-3 nodes",
        "Low (uses existing disks)"),
    StorageOption("Ceph (distributed)",
        "Distributed, 3x Replication", "High (10GbE required)",
        "Built-in HA; recovers automatically when an OSD fails",
        "Production Cluster 3+ nodes",
        "Medium (needs 10GbE + NVMe)"),
    StorageOption("NFS",
        "Shared, Network filesystem", "Medium",
        "Shared, but the NFS server is a single point of failure",
        "Simple Shared Storage, Migration",
        "Low"),
    StorageOption("iSCSI/FC SAN",
        "Block storage over network", "Very high",
        "HA depends on the SAN design",
        "Enterprise, existing SAN infrastructure",
        "High (SAN hardware)"),
]

print("=== Storage Options ===")
for s in storages:
    print(f"  [{s.storage}] {s.type_}")
    print(f"    Performance: {s.performance}")
    print(f"    HA: {s.ha_ready}")
    print(f"    Best for: {s.best_for}")
    print(f"    Cost: {s.cost}")
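The "3x Replication" entry for Ceph has a direct capacity cost, which the estimate below sketches. The 80% fill ceiling is an assumption chosen here as headroom for recovery, not a value taken from this article, and the function name is illustrative.

```python
def ceph_usable_tb(nodes: int, osds_per_node: int, osd_size_tb: float,
                   replicas: int = 3, max_fill: float = 0.8) -> float:
    """Estimate usable capacity (TB) of a replicated Ceph pool.

    max_fill is an assumed safety margin so the cluster keeps room
    to re-replicate data after an OSD or node failure.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    raw_tb = nodes * osds_per_node * osd_size_tb
    return raw_tb / replicas * max_fill


# Example: 3 nodes, 2 OSDs each (as in the setup commands), 2 TB per OSD
usable = ceph_usable_tb(nodes=3, osds_per_node=2, osd_size_tb=2.0)
print(f"Raw: 12.0 TB, usable estimate: {usable:.1f} TB")
```

With these inputs only about a quarter of the raw capacity is available for VM disks, which is worth factoring into the cost comparison with local ZFS.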

Production Tips

# === Production Best Practices ===

from dataclasses import dataclass

@dataclass
class BestPractice:
    area: str
    practice: str
    why: str
    command: str

practices = [
    BestPractice("Network",
        "Separate networks for management, storage, and VM traffic",
        "Prevents congestion between traffic types",
        "Create a separate vmbr bridge for each network"),
    BestPractice("Backup",
        "Back up daily with PBS, keep 30 days, plus an offsite copy",
        "Prevents data loss from hardware failure",
        "vzdump --all --mode snapshot --storage pbs"),
    BestPractice("Monitoring",
        "Monitor the cluster with Prometheus + Grafana",
        "Spot CPU, RAM, disk, and network problems before a node goes down",
        "pve-exporter + node-exporter + Grafana dashboard"),
    BestPractice("Update",
        "Update Proxmox one node at a time (rolling update)",
        "No downtime: VMs are migrated to another node before the update",
        "apt update && apt dist-upgrade (one node at a time)"),
    BestPractice("Security",
        "Restrict access to port 8006 with the firewall, use 2FA and SSH keys",
        "Prevents unauthorized access",
        "Datacenter → Permissions → Two Factor → Add → TOTP"),
]

print("=== Best Practices ===")
for p in practices:
    print(f"  [{p.area}] {p.practice}")
    print(f"    Why: {p.why}")
    print(f"    How: {p.command}")
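The 30-day retention in the backup practice above can be reasoned about with a small model. This is a simplified sketch of a "keep daily" rule (keep the newest snapshot from each of the most recent days that have one); it is written for illustration and is not PBS's actual prune implementation.

```python
from datetime import date, datetime, timedelta


def keep_daily(snapshots: list[datetime], days: int) -> list[datetime]:
    """Return the newest snapshot for each of the latest `days` calendar
    days that contain at least one snapshot, oldest first."""
    newest_per_day: dict[date, datetime] = {}
    for ts in sorted(snapshots):
        newest_per_day[ts.date()] = ts  # later snapshots overwrite earlier ones
    recent_days = sorted(newest_per_day, reverse=True)[:days]
    return sorted(newest_per_day[d] for d in recent_days)


# Hypothetical history: two snapshots per day (02:00 and 14:00) for 40 days
start = datetime(2024, 1, 1, 2, 0)
history = [start + timedelta(days=d, hours=h)
           for d in range(40) for h in (0, 12)]
kept = keep_daily(history, days=30)
print(f"{len(history)} snapshots, {len(kept)} kept, oldest kept: {kept[0]}")
```

Everything not returned by `keep_daily` would be a candidate for pruning under this model; the offsite copy needs its own retention policy.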

Tips

  • Quorum: always use 3 or more nodes for reliable HA
  • 10GbE: use 10GbE for the storage/Ceph network; the difference is substantial
  • ZFS RAM: reserve RAM for the ZFS ARC cache, about 1 GB per 1 TB of storage
  • PBS: use Proxmox Backup Server; it is free and, with incremental deduplicated backups, better than plain vzdump
  • Test: test HA failover every quarter; take a node down and confirm the VMs actually move
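The ARC sizing rule of 1 GB per 1 TB can be turned into a concrete limit. The sketch below treats "GB" as GiB, which is an assumption, and prints a line for the `zfs_arc_max` module parameter, typically placed in /etc/modprobe.d/zfs.conf; the helper names are illustrative.

```python
GIB = 1024 ** 3


def arc_max_bytes(pool_tb: float, gb_per_tb: float = 1.0) -> int:
    """ARC size limit in bytes using the 1 GB per 1 TB rule of thumb."""
    if pool_tb <= 0:
        raise ValueError("pool size must be positive")
    return int(pool_tb * gb_per_tb * GIB)


def modprobe_line(pool_tb: float) -> str:
    """Format the limit as a ZFS kernel module option line."""
    return f"options zfs zfs_arc_max={arc_max_bytes(pool_tb)}"


# Example: a 4 TB pool
print(modprobe_line(4))
```

Whatever the ARC is allowed to use should be subtracted from the RAM budgeted for VMs on that node.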

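What is Proxmox VE?

Proxmox VE is an open-source virtualization platform that runs KVM virtual machines and LXC containers and is managed through a web UI. It includes clustering, HA, ZFS and Ceph storage, and backup through PBS, is free to license, and suits both homelab and enterprise use.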