Network monitoring transforms IT operations from reactive firefighting — discovering outages when users call — to proactive management where issues are detected and resolved before users notice. Without monitoring, IT teams are flying blind: unable to answer basic questions like "which link is saturated?", "when did the server go down?", or "how close are we to capacity?"
This guide walks through building a production monitoring platform from scratch — selecting the right tools for your scale, configuring SNMP on every device, deploying PRTG or Zabbix for polling, building Grafana dashboards for visualization, and designing an alerting system that pages the right person at the right time without alert fatigue.
1 Monitoring Platform Selection
No single monitoring tool is best for every organization. The right choice depends on your environment size, budget, technical depth, and what you primarily need to monitor — network devices, servers, applications, or all three.
Platform Comparison
| Platform | Best For | Strengths | Limitations | Cost |
| --- | --- | --- | --- | --- |
| PRTG Network Monitor | SMB to mid-market — Windows shops | Auto-discovery, 250+ sensor types, easy setup, great UI, built-in maps | License per sensor, expensive at scale, Windows-only server | Free (100 sensors) / ₹45,000+/year |
| Zabbix | Mid-market to enterprise — Linux shops | Free/open-source, highly scalable, powerful templates, active community | Steep learning curve, complex initial setup, less polished UI | 100% free (open-source) / paid support available |
| LibreNMS | Network-focused monitoring | Auto-discovery, excellent vendor support (MikroTik, FortiGate, Cisco), free | Primarily network devices — limited server/app monitoring | 100% free (open-source) |
| Grafana + Prometheus | Metrics visualization layer | Best-in-class dashboards, works with any data source, alerting engine | Not a standalone NMS — requires a backend data source (Zabbix, InfluxDB) | Free (OSS) / Grafana Cloud from $0 |
| Nagios / Icinga2 | Legacy environments | Mature, highly configurable, large plugin ecosystem | Configuration-file-based — complex to manage at scale | Free (core) / Nagios XI paid |
| SolarWinds NPM | Large enterprise | Comprehensive, excellent maps, deep Cisco/Juniper integration | Very expensive, complex, SolarWinds supply-chain breach history | ₹500,000+/year |
Recommended Stack by Organization Size
- 1–50 devices: PRTG free tier (100 sensors) or LibreNMS — zero cost, quick setup, covers all basics for small networks
- 50–500 devices: Zabbix (monitoring backend) + Grafana (dashboards) — scales well, free, powerful enough for complex environments
- 500+ devices: Zabbix with distributed proxies + Grafana Enterprise OR PRTG paid tier — enterprise scale, distributed collection, high availability
- Network-heavy (ISPs, datacenters): LibreNMS for network devices + Zabbix for servers — best of both specialized tools
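The sizing guidance above can be captured as a small lookup for inventory scripts. A Python sketch (the function and its return strings are illustrative, not from any of the listed tools):

```python
# Map a device count to the recommended stack from the list above.
# Thresholds mirror the guidance in this guide, not a product rule.
def recommend_stack(device_count: int, network_heavy: bool = False) -> str:
    if network_heavy:
        return "LibreNMS (network devices) + Zabbix (servers)"
    if device_count <= 50:
        return "PRTG free tier or LibreNMS"
    if device_count <= 500:
        return "Zabbix + Grafana"
    return "Zabbix with distributed proxies + Grafana Enterprise, or PRTG paid tier"

print(recommend_stack(30))   # PRTG free tier or LibreNMS
print(recommend_stack(200))  # Zabbix + Grafana
```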
2 SNMP Configuration on Devices
SNMP (Simple Network Management Protocol) is the universal language of network monitoring — it allows monitoring platforms to query device metrics (CPU, memory, interface traffic, errors) without installing agents. Configuring SNMP correctly and securely on every device is the foundation of the entire monitoring platform.
SNMP Version Comparison
| Version | Authentication | Encryption | Use |
| --- | --- | --- | --- |
| SNMPv1 | Community string only | None | Never use — completely insecure |
| SNMPv2c | Community string only | None | Acceptable on an isolated management VLAN only |
| SNMPv3 | Username + auth password | AES-128/256 | Always use in production |
FortiGate SNMP Configuration
# FortiGate — SNMPv3 Configuration
config system snmp sysinfo
set status enable
set description "FortiGate-HQ-Firewall"
set contact "noc@enterweb.in"
set location "Server Room - Rack 2"
end
config system snmp community
# SNMPv2c — management VLAN only (if SNMPv3 not supported by tool)
edit 1
set name "EnterWeb-NMS-Readonly"
config hosts
edit 1
set ip 10.10.50.10 255.255.255.255 # Monitoring server IP only
next
end
set query-v1-status disable
set query-v2c-status enable
set trap-v1-status disable
set trap-v2c-status enable
set trap-v2c-lport 162
next
end
config system snmp user
# SNMPv3 user
edit "nms-readonly"
set queries enable
set query-port 161
set auth-proto sha256
set auth-pwd "StrongAuthPassword123!"
set priv-proto aes256
set priv-pwd "StrongPrivPassword456!"
set security-level auth-priv
set notify-hosts 10.10.50.10 # SNMP trap destination
next
end
MikroTik SNMP Configuration
# MikroTik RouterOS — SNMP Setup
/snmp
set enabled=yes \
contact="noc@enterweb.in" \
location="Branch-Office-Router" \
trap-version=2 \
trap-community="EnterWeb-NMS-Readonly"
/snmp community
set [ find default=yes ] name="public" disabled=yes # Disable default "public"
add name="EnterWeb-NMS-Readonly" \
    addresses=10.10.50.10/32 \
    read-access=yes \
    write-access=no \
    security=private \
    authentication-protocol=SHA1 \
    encryption-protocol=AES \
    authentication-password="StrongAuthPass!" \
    encryption-password="StrongEncPass!"
# security=private enforces SNMPv3 auth + encryption for this community
# Verify SNMP is responding
# From monitoring server: snmpwalk -v2c -c EnterWeb-NMS-Readonly 10.10.1.1
Ubuntu Server SNMP Agent
# Install SNMP daemon and client tools
sudo apt install snmpd snmp -y
# Create the SNMPv3 user FIRST — snmpd must be stopped while
# net-snmp-create-v3-user writes the user into its persistent config
sudo systemctl stop snmpd
sudo net-snmp-create-v3-user -ro -A "StrongAuthPass!" -a SHA \
  -X "StrongPrivPass!" -x AES nms-readonly
# Then edit /etc/snmp/snmpd.conf and replace the default content with:
# ── Access Control ──────────────────────────────────
agentaddress udp:161
# SNMPv3 only — deliberately no rocommunity line, since defining one
# is what enables insecure v1/v2c access
rouser nms-readonly priv
# System info
sysLocation "Datacenter-Rack3-Ubuntu-Server"
sysContact "noc@enterweb.in"
# Extend with additional metrics
extend .1.3.6.1.4.1.2021.100 distro /usr/bin/distro
# Enable and start the daemon
sudo systemctl enable --now snmpd
# Allow SNMP through UFW from the monitoring server only
sudo ufw allow from 10.10.50.10 to any port 161 proto udp
⚠️ Warning: Never use the community string "public" or "private" in production — these are the defaults that every network scanner and attacker tries first. Change community strings to random 20+ character strings and restrict SNMP access to only the monitoring server's IP address in the SNMP ACL. Place SNMP traffic on a dedicated management VLAN so it is never routed over untrusted network segments.
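To generate the random 20+ character community strings the warning calls for, Python's standard `secrets` module works well. A minimal sketch (the helper name and 24-character default are our choices):

```python
import secrets
import string

# Generate a random alphanumeric SNMP community string (24 chars by default).
def random_community(length: int = 24) -> str:
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_community())  # e.g. 'R7kQ2nVx...' — never reuse across sites
```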
3 PRTG Setup & Auto-Discovery
PRTG Network Monitor provides the fastest path to a working monitoring platform — auto-discovery scans your network and creates devices and sensors automatically. The free tier supports 100 sensors, sufficient for monitoring 15–20 devices comprehensively.
PRTG Initial Setup Steps
- Download PRTG from paessler.com — install on Windows Server 2019/2022 (minimum 4 vCPU, 8GB RAM, 100GB storage)
- Access the web UI at https://[server-ip]:443 — default credentials are prtgadmin / prtgadmin — change immediately
- Navigate to Setup → System Administration → Core & Probes — verify probe is connected
- Add SNMP credentials: Setup → System Administration → SNMP Compatibility Options
- Run auto-discovery: Devices → Add Device → Auto-Discovery — enter your management subnet (e.g., 10.10.0.0/24)
- PRTG will scan the subnet and create device groups with pre-configured sensors automatically
- Review discovered devices — remove duplicates, assign correct device types and icons
- Configure notification contacts: Setup → Account Settings → Notifications
Key PRTG Sensors to Deploy per Device Type
| Device Type | Essential Sensors | Sensor Count |
| --- | --- | --- |
| Firewall (FortiGate) | Ping, SNMP CPU, SNMP Memory, SNMP Traffic (WAN + LAN), SNMP Sessions, HTTPS uptime | 6–8 sensors |
| Router (MikroTik) | Ping, SNMP CPU, SNMP Memory, SNMP Traffic (all active interfaces), BGP peer state | 5–10 sensors |
| Switch (managed) | Ping, SNMP Traffic (uplinks), SNMP Port Errors, SNMP STP state | 4–6 sensors |
| Windows Server | Ping, WMI CPU, WMI Memory, WMI Disk space (all volumes), WMI Services (critical), WMI Event Log | 8–12 sensors |
| Linux Server | Ping, SSH CPU, SSH Memory, SSH Disk, SSH Process count, SNMP Load average | 6–8 sensors |
| Internet link (ILL) | Ping (external target), HTTP(S) check, SNMP Traffic on WAN interface, latency probe | 4 sensors |
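To check whether a planned deployment fits within PRTG's 100-sensor free tier, the sensor counts above can be turned into a rough budget. A minimal Python sketch (the per-type midpoints and the sample inventory are illustrative assumptions):

```python
# Midpoint sensor counts per device type, taken from the table above.
SENSORS_PER_TYPE = {
    "firewall": 7, "router": 7, "switch": 5,
    "windows_server": 10, "linux_server": 7, "internet_link": 4,
}

def total_sensors(inventory: dict) -> int:
    """Estimate total PRTG sensors for a device inventory."""
    return sum(SENSORS_PER_TYPE[t] * n for t, n in inventory.items())

inventory = {"firewall": 1, "router": 2, "switch": 4, "windows_server": 3,
             "linux_server": 4, "internet_link": 2}
needed = total_sensors(inventory)
print(needed)          # 107
print(needed <= 100)   # False — this inventory slightly exceeds the free tier
```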
✅ Pro Tip: In PRTG, use Device Templates to standardize sensor sets across device types — create one template for "MikroTik Router", one for "Windows Server", one for "FortiGate Firewall." When you add a new device, apply the matching template to automatically create all required sensors in seconds. This eliminates manual sensor creation and ensures consistent monitoring coverage across all devices of the same type.
4 Zabbix Deployment & Templates
Zabbix is the most powerful free network monitoring platform — highly scalable, template-driven, and capable of monitoring tens of thousands of devices with a single installation. The initial setup is more involved than PRTG but the depth of capability and zero licensing cost make it the preferred choice for growing environments.
Zabbix Server Installation (Ubuntu 22.04)
# Install Zabbix 7.x on Ubuntu 22.04
# Step 1: Add Zabbix repository
wget https://repo.zabbix.com/zabbix/7.0/ubuntu/pool/main/z/zabbix-release/zabbix-release_7.0-1+ubuntu22.04_all.deb
sudo dpkg -i zabbix-release_7.0-1+ubuntu22.04_all.deb
sudo apt update
# Step 2: Install Zabbix server, frontend, agent
sudo apt install zabbix-server-mysql zabbix-frontend-php \
zabbix-apache-conf zabbix-sql-scripts zabbix-agent2 -y
# Step 3: Configure MySQL database
sudo mysql -uroot -p
CREATE DATABASE zabbix CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
CREATE USER 'zabbix'@'localhost' IDENTIFIED BY 'StrongDBPassword123!';
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'localhost';
SET GLOBAL log_bin_trust_function_creators = 1;
FLUSH PRIVILEGES; EXIT;
# Step 4: Import initial schema
zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | \
mysql --default-character-set=utf8mb4 -uzabbix -p zabbix
# Step 5: Configure Zabbix server (/etc/zabbix/zabbix_server.conf)
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=StrongDBPassword123!
StartPollers=20
StartPollersUnreachable=5
StartTrappers=10
StartPingers=10
CacheSize=128M
HistoryCacheSize=64M
TrendCacheSize=32M
# Step 6: Start services
sudo systemctl enable --now zabbix-server zabbix-agent2 apache2
sudo systemctl status zabbix-server
# Step 7: Access web UI
# http://[server-ip]/zabbix
# Default login: Admin / zabbix — CHANGE IMMEDIATELY
Zabbix Templates for Network Devices
# Zabbix ships with pre-built templates — assign them via:
# Data collection → Templates → search for your device vendor
# (older Zabbix releases used the "Configuration → Templates" menu and
#  names like "Template Net Cisco IOS SNMPv2")
# Key built-in templates to activate (Zabbix 6.x/7.x naming):
"Cisco IOS by SNMP" → Cisco routers/switches
"FortiGate by SNMP" → FortiGate firewalls
"MikroTik by SNMP" → MikroTik RouterOS (plus model-specific variants)
"Linux by Zabbix agent" → Linux servers
"Windows by Zabbix agent" → Windows servers
"Apache by HTTP" → Web servers
"MySQL by Zabbix agent" → MySQL/MariaDB
# Assign a template to a host:
# Data collection → Hosts → [select host] → Templates field
# → Start typing the template name → Select → Update
# Custom MikroTik template items (if the built-in template is missing metrics):
# Data collection → Templates → [select template] → Items → Create item
Name: WAN Interface Traffic In
Type: SNMP agent
OID: .1.3.6.1.2.1.31.1.1.1.6.1 (ifHCInOctets for interface index 1)
Key: net.if.in[wan1]
Type of info: Numeric (unsigned)
Units: bps
Preprocessing: Change per second → Multiplier (8) [bytes to bits]
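The preprocessing chain above is easy to misread, so here is what "Change per second" plus a ×8 multiplier actually computes, sketched in Python: the delta of the octet counter across the poll interval, converted from bytes/s to bits/s (the modulo handles a 64-bit counter wrap):

```python
COUNTER64_MAX = 2**64  # ifHCInOctets is a 64-bit counter

def bits_per_second(prev_octets: int, curr_octets: int, interval_s: int) -> float:
    """Delta of an SNMP octet counter over the poll interval, in bits/s."""
    delta = (curr_octets - prev_octets) % COUNTER64_MAX  # survives a wrap
    return delta / interval_s * 8                        # bytes/s -> bits/s

# 7,500,000 bytes received in a 60 s interval ≈ 1 Mbps
print(bits_per_second(1_000_000, 8_500_000, 60))  # 1000000.0
```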
✅ Pro Tip: Use Zabbix Proxies for monitoring remote sites — deploy a lightweight Zabbix Proxy at each branch office. The proxy collects data locally and forwards it to the central Zabbix server, reducing WAN bandwidth consumption by 90% compared to the server polling remote devices directly. Proxies also continue collecting data during WAN outages and sync when connectivity is restored — ensuring no monitoring gaps during the exact events you most need data about.
5 Grafana Dashboard Setup
Grafana transforms raw monitoring data into beautiful, interactive dashboards that operations teams and management can actually read. It connects to Zabbix, InfluxDB, Prometheus, and dozens of other data sources — acting as a unified visualization layer across your entire monitoring stack.
Grafana Installation (Ubuntu)
# Install Grafana OSS
sudo apt install -y apt-transport-https software-properties-common
# (apt-key is deprecated on Ubuntu 22.04 — use a signed-by keyring instead)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | \
  gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | \
  sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install grafana -y
sudo systemctl enable --now grafana-server
# Access: http://[server-ip]:3000
# Default: admin / admin — change on first login
# Install Zabbix data source plugin
sudo grafana-cli plugins install alexanderzobnin-zabbix-app
sudo systemctl restart grafana-server
# Enable Zabbix plugin:
# Plugins → alexanderzobnin-zabbix-app → Enable
# Configuration → Data Sources → Add → Zabbix
# URL: http://localhost/zabbix/api_jsonrpc.php
# Username: Admin / [your zabbix password]
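Behind that data-source form, the plugin speaks JSON-RPC 2.0 to api_jsonrpc.php. A minimal sketch of the login request body it sends (Zabbix 6.4+/7.x expects "username"; older releases used "user" — verify against your version):

```python
import json

# Build the JSON-RPC 2.0 body for a Zabbix API user.login call.
def zabbix_login_payload(username: str, password: str, request_id: int = 1) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "user.login",
        "params": {"username": username, "password": password},
        "id": request_id,
    })

payload = zabbix_login_payload("Admin", "zabbix")
print(json.loads(payload)["method"])  # user.login
```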
Essential Grafana Dashboards to Build
- Network Operations Center (NOC) Overview: Full-screen TV dashboard — all devices listed with green/red status indicators, current WAN bandwidth utilization, active alerts count, and uptime percentage for the day
- WAN Bandwidth Dashboard: Time-series graphs for each internet link — inbound/outbound Mbps, utilization %, peak traffic times, and 30-day trend comparison
- Server Health Dashboard: CPU, memory, and disk utilization for all servers in a grid — color-coded thresholds (green <70%, yellow 70–85%, red >85%)
- Top Talkers Dashboard: Which IP addresses or interfaces are consuming the most bandwidth — updated every 5 minutes, sortable table
- Monthly Executive Report Dashboard: Uptime SLA percentage, average response times, total alerts fired, top 5 recurring issues — export as PDF for management
- VPN Tunnel Status: All site-to-site VPN tunnels listed with up/down status, tunnel uptime, bytes transferred — instant visibility into branch connectivity
# Sample Grafana panel query (Zabbix datasource — WAN bandwidth)
# Panel type: Time series
# Data source: Zabbix
Group: Network Devices
Host: FortiGate-HQ
Application: Network Interfaces
Item: WAN1: Bits received per second
# Add second query for outbound:
Item: WAN1: Bits sent per second
# Panel display settings:
Unit: bits/sec (auto-scale to Mbps/Gbps)
Fill opacity: 10
Line width: 2
Thresholds:
70% of link capacity → Yellow
90% of link capacity → Red
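Grafana thresholds take absolute values, so the 70%/90% guidance has to be converted per link. A small helper (the function is our own sketch, not a Grafana API):

```python
# Convert the percentage thresholds above into absolute bps for one link,
# ready to paste into the panel's threshold fields. Integer math keeps
# the results exact.
def thresholds_bps(link_capacity_mbps: int) -> dict:
    cap_bps = link_capacity_mbps * 1_000_000
    return {"yellow": cap_bps * 70 // 100, "red": cap_bps * 90 // 100}

print(thresholds_bps(100))  # {'yellow': 70000000, 'red': 90000000}
```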
6 Syslog & Log Aggregation
SNMP polling tells you metrics — syslog tells you events. When a firewall blocks a connection, a VPN tunnel drops, an interface flaps, or a login fails, the device sends a syslog message. Collecting and centralizing these logs is essential for troubleshooting and security monitoring.
Syslog Server Setup (rsyslog on Ubuntu)
# Install and configure rsyslog as central syslog server
sudo apt install rsyslog -y
# Edit /etc/rsyslog.conf — enable UDP and TCP syslog reception
# Uncomment these lines:
module(load="imudp")
input(type="imudp" port="514")
module(load="imtcp")
input(type="imtcp" port="514")
# Route logs by source IP to separate files
# Add to /etc/rsyslog.d/10-network-devices.conf:
if $fromhost-ip == '10.10.1.1' then /var/log/network/fortigate-hq.log
if $fromhost-ip == '10.10.1.2' then /var/log/network/mikrotik-core.log
if $fromhost-ip startswith '10.10.' then /var/log/network/network-devices.log
& stop
# Create log directory and set permissions
sudo mkdir -p /var/log/network
sudo chown syslog:adm /var/log/network
# Configure log rotation (/etc/logrotate.d/network-devices)
/var/log/network/*.log {
daily
rotate 90
compress
delaycompress
missingok
notifempty
sharedscripts
postrotate
/usr/bin/systemctl reload rsyslog > /dev/null 2>&1 || true
endscript
}
sudo systemctl restart rsyslog
# Allow syslog through firewall (from network devices only)
sudo ufw allow from 10.10.0.0/16 to any port 514 proto udp
sudo ufw allow from 10.10.0.0/16 to any port 514 proto tcp
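Because each rsyslog `if` above is evaluated independently, a FortiGate message matches both its per-device rule and the 10.10. catch-all and is written to both files. A Python mirror of that routing logic, handy for sanity-checking the rules before deploying them (paths and IPs taken from the example config):

```python
# Return every log file a message from this source IP would land in,
# mirroring the rsyslog rules above (all matching actions fire).
def route_log(fromhost_ip: str) -> list[str]:
    targets = []
    if fromhost_ip == "10.10.1.1":
        targets.append("/var/log/network/fortigate-hq.log")
    if fromhost_ip == "10.10.1.2":
        targets.append("/var/log/network/mikrotik-core.log")
    if fromhost_ip.startswith("10.10."):
        targets.append("/var/log/network/network-devices.log")
    return targets

print(route_log("10.10.1.1"))  # per-device file AND the catch-all
print(route_log("10.10.7.9"))  # catch-all only
```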
Configure FortiGate Syslog Forwarding
# FortiGate — Send all logs to central syslog server
config log syslogd setting
set status enable
set server 10.10.50.10 # Syslog server IP
set mode udp
set port 514
set facility local7
set source-ip 10.10.1.1 # FortiGate management IP
set format rfc5424 # Standard syslog format
end
config log syslogd filter
set severity information # Send info level and above
set forward-traffic enable
set local-traffic enable
set sniffer-traffic disable
set anomaly enable
set voip disable
set gtp disable
set filter-type include
end
# Verify logs are being sent
diagnose log test
✅ Pro Tip: For organizations that need to search and analyze logs at scale — install the ELK Stack (Elasticsearch, Logstash, Kibana) or the lighter Graylog on top of your syslog collection. These platforms parse structured syslog data, enable full-text search across millions of log entries in seconds, and let you build dashboards showing security events, top blocked IPs, VPN tunnel events, and authentication failures — turning raw syslog into actionable intelligence.
7 Alert Design & Escalation
Alert fatigue is the most common failure mode of monitoring deployments — too many low-priority alerts train teams to ignore them, including the critical ones. Good alert design means alerting only on actionable conditions, with the right severity, to the right person, at the right time.
Alert Threshold Reference
| Metric | Warning Threshold | Critical Threshold | Check Interval |
| --- | --- | --- | --- |
| Device ping (packet loss) | > 5% for 3 min | > 30% for 1 min / down 2 min | 60 sec |
| WAN bandwidth utilization | > 70% sustained 5 min | > 90% sustained 2 min | 60 sec |
| CPU utilization (router/firewall) | > 75% for 5 min | > 90% for 3 min | 60 sec |
| Server CPU utilization | > 80% for 10 min | > 95% for 5 min | 60 sec |
| Server memory utilization | > 85% for 5 min | > 95% for 2 min | 60 sec |
| Disk space used | > 75% capacity | > 90% capacity | 5 min |
| VPN tunnel status | N/A | Tunnel down > 2 min | 30 sec |
| Interface error rate | > 0.1% error rate | > 1% error rate | 60 sec |
| SSL certificate expiry | 30 days remaining | 7 days remaining | Daily |
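The "sustained for N minutes" conditions in the table prevent one-poll spikes from paging anyone. A minimal sketch of that evaluation (our own illustration, not PRTG or Zabbix internals):

```python
# Fire only when EVERY sample in the trailing window exceeds the threshold,
# which suppresses single-poll spikes.
def sustained_breach(samples: list, threshold: float, window: int) -> bool:
    """samples: oldest-first values, one per poll; window: sample count."""
    if len(samples) < window:
        return False
    return all(v > threshold for v in samples[-window:])

cpu = [60, 72, 91, 93, 92, 94, 95]     # one sample per minute
print(sustained_breach(cpu, 90, 3))     # True — last 3 min all above 90%
print(sustained_breach(cpu, 90, 6))     # False — 72% six minutes ago
```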
Escalation Matrix
# Alert Escalation Policy — define in PRTG/Zabbix notification settings
Level 1 — Warning (Yellow):
Notify: NOC email group (noc@enterweb.in)
Method: Email
Delay: Immediate (on first trigger)
Repeat: Every 30 minutes while active
Level 2 — Critical (Red):
Notify: On-call engineer
Method: Email + SMS/WhatsApp via Twilio/WATI API
Delay: Immediate
Repeat: Every 15 minutes while active
Level 3 — Critical unacknowledged > 30 min:
Notify: IT Manager + On-call engineer
Method: Phone call (Twilio voice alert)
Delay: 30 minutes after Level 2 trigger
Repeat: Every 30 minutes
Level 4 — Major outage (core device down > 1 hour):
Notify: IT Director + All stakeholders
Method: Email + Phone + Incident ticket auto-created
Delay: 60 minutes after initial trigger
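The matrix above can be expressed as a small lookup, for example when wiring alerts through a custom webhook. This sketch collapses Level 4's "core device down > 1 hour" into elapsed unacknowledged time, a simplification of the policy:

```python
# Pick the escalation level from severity and how long a critical alert
# has gone unacknowledged (minutes). Levels match the matrix above.
def escalation_level(severity: str, unacked_minutes: int = 0) -> int:
    if severity == "warning":
        return 1              # NOC email group
    if severity == "critical":
        if unacked_minutes >= 60:
            return 4          # director + stakeholders, incident ticket
        if unacked_minutes >= 30:
            return 3          # IT manager + on-call, phone call
        return 2              # on-call engineer, email + SMS
    raise ValueError(f"unknown severity: {severity}")

print(escalation_level("warning"))        # 1
print(escalation_level("critical", 45))   # 3
```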
⚠️ Warning: Sending SNMP or syslog alerts via email only is insufficient for critical infrastructure — email delivery can be delayed 5–15 minutes by spam filtering and greylisting. For Critical-level alerts (device down, WAN outage, VPN failure), configure a secondary notification channel: SMS via Twilio, WhatsApp Business API (WATI), Telegram bot, or PagerDuty integration. The goal is to wake someone up at 3 AM within 2 minutes of a critical failure — email alone will not reliably achieve this.
8 Reporting & SLA Tracking
Monitoring data becomes a business asset when it is turned into regular reports — proving SLA compliance to clients, identifying recurring problems for proactive resolution, and demonstrating the value of IT investments to management.
Monthly Report Contents
- Uptime SLA report: Per-device availability percentage for the month — export from PRTG Availability Report or Zabbix Reports → Availability report. Target: 99.9% (≈8.76 hours of downtime per year)
- WAN utilization trends: Average and peak bandwidth per link, growth trend vs. last month, capacity planning recommendation if utilization consistently exceeds 60%
- Alert summary: Total alerts fired by severity, top 10 most-alerting devices, mean time to acknowledge (MTTA) and mean time to resolve (MTTR)
- Top issues: Recurring alerts — devices that triggered the same alert 3+ times indicate an underlying problem needing permanent resolution, not repeated acknowledgement
- Patch compliance: Devices with outdated firmware or OS patches — flagged for remediation in the following month
- Capacity forecast: Based on 3-month growth trend — predict when each WAN link, server disk, or device CPU will hit critical threshold
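The SLA percentage in the first bullet maps to a fixed downtime budget. A one-line helper makes the arithmetic explicit:

```python
# Allowed downtime per year for a given availability SLA percentage.
def downtime_budget_hours_per_year(sla_percent: float) -> float:
    return (100 - sla_percent) / 100 * 365 * 24

print(round(downtime_budget_hours_per_year(99.9), 2))   # 8.76
print(round(downtime_budget_hours_per_year(99.99), 2))  # 0.88
```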
Automated PRTG PDF Report
# PRTG — Schedule monthly PDF report
# Setup → Reports → Add Report
Report type: Custom Report
Schedule: Monthly (1st of each month, 08:00)
Output: PDF + Email to: management@enterweb.in
# Sections to include:
- Summary: Total sensors, down sensors, uptime %
- Top 10 sensors by downtime
- Bandwidth graphs: All WAN interfaces (last 30 days)
- Server resources: CPU/Memory/Disk (last 30 days)
- Alert log: All Critical alerts last 30 days
# PRTG API — pull uptime data programmatically
GET https://[prtg-server]/api/table.json?
content=sensors&
columns=device,sensor,status,uptime,downtime&
filter_status=5& # 5 = Down sensors
apitoken=[your-api-token]
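When scripting against that endpoint, building the query with urllib keeps the parameters safely encoded. A sketch (server name and token are placeholders):

```python
from urllib.parse import urlencode

# Build the PRTG table.json URL from the example above.
def prtg_down_sensors_url(server: str, apitoken: str) -> str:
    params = {
        "content": "sensors",
        "columns": "device,sensor,status,uptime,downtime",
        "filter_status": 5,   # 5 = Down sensors
        "apitoken": apitoken,
    }
    return f"https://{server}/api/table.json?{urlencode(params)}"

url = prtg_down_sensors_url("prtg.example.local", "TOKEN")
print("filter_status=5" in url)  # True
```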
✅ Pro Tip: Create a dedicated NOC Dashboard TV screen in your server room or IT office — a Grafana dashboard displayed on a wall-mounted monitor showing real-time device status, current WAN utilization graphs, active alert count, and the last 10 alert events. This passive visibility means your team instantly notices a spike or outage without needing to actively check the monitoring console — the screen catches issues in peripheral vision during normal work hours, dramatically reducing mean time to detect (MTTD).
Need Help Setting Up Network Monitoring?
EnterWeb IT Firm deploys and configures PRTG, Zabbix, Grafana, and LibreNMS monitoring platforms for organizations of all sizes. We design alert escalation workflows, build custom dashboards, and deliver monthly SLA reports so your IT team always has complete visibility into infrastructure health.