SFTP Gateway Log Observability: Azure Monitor Integration Guide
TL;DR
- Forward Azure Monitor logs to Grafana/Loki for enhanced search and visualization
- Ideal for HA deployments with multiple instances behind a load balancer
- Uses an Azure Function to forward logs from Event Hub to Loki with VM name labels
- Enables cross-instance log search and upload tracking per VM
Introduction
If your SFTP Gateway is already sending logs to Azure Log Analytics, you can add powerful search and visualization capabilities without changing your existing logging setup. This guide shows you how to forward Log Analytics data to a Grafana/Loki stack for enhanced observability.
This approach is ideal when:
- You have multiple SFTP Gateway instances behind a load balancer (especially HA deployments on Virtual Machine Scale Sets)
- You want to centralize logs from multiple HA instances into one dashboard
- You prefer to keep Log Analytics as your primary log storage while adding Grafana for visualization
- You need cross-instance search capabilities (e.g., "find all transfers of invoice.pdf across all VMs")
- You want to see which VM handled each file transfer in an HA deployment
By the end of this guide, you'll be able to:
- Search across all SFTP Gateway instances from a single Grafana dashboard
- Track uploads by VM to verify load balancing is working
- Correlate activity across multiple instances
- Keep your existing Log Analytics setup intact (logs flow to both systems)
- Query logs using LogQL for advanced analysis
Architecture Overview
The integration uses a Log Analytics Data Export Rule to stream logs through Event Hub to a Python Azure Function, which forwards them to Loki in real-time:
┌────────────────────────────────────────────────────────────────────────────┐
│ Azure Subscription │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ SFTP Gateway HA Deployment (Virtual Machine Scale Set) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ sftpgw-000 │ │ sftpgw-001 │ │ sftpgw-00N │ │ │
│ │ │ (VM 0) │ │ (VM 1) │ │ (VM N) │ │ │
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │
│ │ │ │ │ │ │
│ │ └───────────────────┼───────────────────┘ │ │
│ │ │ │ │
│ │ Azure Monitor Agent (Diagnostic Settings) │ │
│ └─────────────────────────────┼────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Log Analytics Workspace │ │
│ │ │ │
│ │ Custom Tables: │ │
│ │ ┌──────────────────────┐ ┌──────────────────────────┐ │ │
│ │ │ SFTPGWAudit_CL │ │ SFTPGWApplication_CL │ │ │
│ │ │ (per-VM log entries) │ │ (per-VM log entries) │ │ │
│ │ └──────────┬───────────┘ └──────────┬───────────────┘ │ │
│ │ └──────────────────────────┘ │ │
│ │ │ │ │
│ │ Data Export Rule │ │
│ └─────────────────────────┼────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Event Hub Namespace │ │
│ │ │ │
│ │ ┌──────────────────────┐ ┌──────────────────────────┐ │ │
│ │ │ am-sftpgwaudit-cl │ │ am-sftpgwapplication-cl │ │ │
│ │ │ (auto-created) │ │ (auto-created) │ │ │
│ │ └──────────┬───────────┘ └──────────┬───────────────┘ │ │
│ │ └──────────────────────────┘ │ │
│ └─────────────────────────┼────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────┐ │
│ │ Azure Function │ │
│ │ EventHubToLoki │ │
│ │ │ │
│ │ - Extracts VM name │ │
│ │ - Forwards to Loki │ │
│ └───────────┬───────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Observability VM │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Loki │◀────────│ Grafana │◀─── You (browser) │ │
│ │ │ (store) │ │ (visualize) │ │ │
│ │ │ :3100 │ │ :3000 │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────────────┘
How It Works
- SFTP Gateway VMs write audit and application logs, collected by the Azure Monitor Agent
- Log Analytics Workspace stores logs in custom tables (SFTPGWAudit_CL, SFTPGWApplication_CL) with a Computer field identifying each VM
- Data Export Rule streams new log entries to Event Hub in real-time
- Event Hub buffers log events (auto-created hubs named am-sftpgwaudit-cl and am-sftpgwapplication-cl)
- Azure Function reads Event Hub messages, extracts the VM name from each record, and forwards to Loki with appropriate labels
- Loki stores the logs with labels for efficient querying
- Grafana provides the search and dashboard interface, including "Uploads by VM" visualization
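The envelope format matters for the function in Step 5: the Data Export Rule wraps each batch of rows in a JSON object with a records array, and each record carries the TimeGenerated, Computer, Type, and RawData fields the forwarder reads. A minimal sketch of the label mapping, using a fabricated record (all field values below are made up for illustration):

```python
import json

# Illustrative Event Hub message as produced by a Log Analytics Data Export
# Rule: a JSON envelope whose "records" array holds one object per log row.
message = json.dumps({
    "records": [
        {
            "TimeGenerated": "2024-01-01T12:00:00Z",
            "Computer": "sftpgw-000000",
            "Type": "SFTPGWAudit_CL",
            "RawData": '{"event":"SFTP_FILE_UPLOAD_COMPLETE","file_name":"invoice.pdf"}',
        }
    ]
})

# The forwarder maps each record's fields onto Loki labels.
record = json.loads(message)["records"][0]
labels = {
    "job": "sftpgw",
    "vm_name": record.get("Computer", "unknown"),
    "log_type": "audit" if "audit" in record.get("Type", "").lower() else "application",
    "source": "azure-monitor",
}
print(labels)
```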
Component Details
| Component | Purpose | Notes |
|---|---|---|
| Log Analytics Workspace | Existing log storage | No changes required to your SFTP Gateway config |
| Data Export Rule | Streams logs to Event Hub | Built-in LAW feature, no custom code |
| Event Hub | Durable message buffer | Hubs auto-created by the export rule |
| Azure Function | Log forwarder | Python function, Event Hub trigger |
| Loki | Log aggregation | Stores logs from all VMs with vm_name label |
| Grafana | Visualization | Dashboard shows uploads by VM for HA verification |
Labels Added by the Function
The Azure Function adds these labels to each log entry:
| Label | Description | Example |
|---|---|---|
| job | Fixed identifier | sftpgw |
| vm_name | VM hostname (for HA tracking) | sftpgw-000000 |
| log_type | Log category | audit or application |
| source | Ingestion path identifier | azure-monitor |
Prerequisites
Before starting, ensure you have:
- SFTP Gateway logs in Log Analytics — Your instances should already be configured to send logs to a Log Analytics Workspace (see Azure Log Streaming)
- Azure CLI installed and configured with appropriate permissions
- Azure Functions Core Tools (func) installed for deploying the function code in Step 6
- A Linux VM for running Loki/Grafana (Standard_B2s or larger recommended)
- Permissions to:
- Create Event Hub namespaces
- Create Log Analytics Data Export Rules
- Create Azure Functions
- Create VMs and NSG rules
Installation Steps
Step 1: Deploy the Observability Stack
Launch a Linux VM to run Loki and Grafana. You can use any Ubuntu 24.04 LTS image.
What you're doing: Creating a VM with Docker, Loki, and Grafana pre-configured.
Option A: Using Cloud-Init (Recommended)
Create a VM with the following cloud-init configuration. This automatically installs and configures everything on first boot.
VM requirements:
- Image: Ubuntu 24.04 LTS
- Size: Standard_B2s or larger
- Storage: 30GB
- NSG: Allow inbound on ports 22 (SSH), 3100 (Loki), and 3000 (Grafana)
Save the following as cloud-init.yaml:
#cloud-config
package_update: true
packages:
- docker.io
- docker-compose-v2
runcmd:
- systemctl enable docker
- systemctl start docker
- mkdir -p /opt/observability/loki
- mkdir -p /opt/observability/grafana/provisioning/datasources
# Create Loki config
- |
cat > /opt/observability/loki/loki-config.yaml << 'EOF'
auth_enabled: false
server:
http_listen_port: 3100
grpc_listen_port: 9096
common:
instance_addr: 127.0.0.1
path_prefix: /loki
storage:
filesystem:
chunks_directory: /loki/chunks
rules_directory: /loki/rules
replication_factor: 1
ring:
kvstore:
store: inmemory
query_range:
results_cache:
cache:
embedded_cache:
enabled: true
max_size_mb: 100
schema_config:
configs:
- from: 2020-10-24
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
ruler:
alertmanager_url: http://localhost:9093
limits_config:
reject_old_samples: false
reject_old_samples_max_age: 168h
ingestion_rate_mb: 16
ingestion_burst_size_mb: 32
retention_period: 744h
EOF
# Create Grafana datasource config
- |
cat > /opt/observability/grafana/provisioning/datasources/datasources.yaml << 'EOF'
apiVersion: 1
datasources:
- name: Loki
type: loki
access: proxy
url: http://loki:3100
isDefault: true
EOF
# Create Docker Compose file
- |
cat > /opt/observability/docker-compose.yml << 'EOF'
services:
loki:
image: grafana/loki:3.0.0
container_name: loki
ports:
- "0.0.0.0:3100:3100"
volumes:
- ./loki/loki-config.yaml:/etc/loki/local-config.yaml:ro
- loki-data:/loki
command: -config.file=/etc/loki/local-config.yaml
restart: unless-stopped
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:3100/ready"]
interval: 15s
timeout: 5s
retries: 5
grafana:
image: grafana/grafana:11.2.0
container_name: grafana
ports:
- "0.0.0.0:3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
- GF_AUTH_ANONYMOUS_ENABLED=false
volumes:
- ./grafana/provisioning:/etc/grafana/provisioning:ro
- grafana-data:/var/lib/grafana
depends_on:
- loki
restart: unless-stopped
volumes:
loki-data:
grafana-data:
EOF
- cd /opt/observability && docker compose up -d
Deploy the VM:
RESOURCE_GROUP="sftpgw-observability-rg"
LOCATION="eastus"
VM_NAME="sftpgw-obs-vm"
# Create resource group
az group create --name $RESOURCE_GROUP --location $LOCATION
# Create VM with cloud-init
az vm create \
--resource-group $RESOURCE_GROUP \
--name $VM_NAME \
--image Canonical:ubuntu-24_04-lts:server:latest \
--size Standard_B2s \
--admin-username azureuser \
--generate-ssh-keys \
--custom-data cloud-init.yaml \
--output json
# Get the public IP
OBS_VM_IP=$(az vm list-ip-addresses \
--name $VM_NAME \
--resource-group $RESOURCE_GROUP \
--query "[0].virtualMachine.network.publicIpAddresses[0].ipAddress" \
--output tsv)
echo "Observability VM IP: $OBS_VM_IP"
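Once cloud-init finishes (allow a few minutes on first boot), you can probe the stack from your workstation. A small readiness-check sketch — it assumes the NSG allows ports 3100 and 3000 and uses the standard Loki /ready and Grafana /api/health endpoints; the function names and IP placeholder are illustrative:

```python
from urllib.error import URLError
from urllib.request import urlopen

def health_endpoints(ip: str) -> dict:
    """Build health-check URLs for the stack (ports from the compose file)."""
    return {
        "loki": f"http://{ip}:3100/ready",
        "grafana": f"http://{ip}:3000/api/health",
    }

def check(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except URLError:
        return False

# Usage (substitute your $OBS_VM_IP):
#   for name, url in health_endpoints("203.0.113.10").items():
#       print(name, "ready" if check(url) else "not ready")
```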
Option B: Manual Installation
If you prefer to set up an existing VM manually, SSH in and run the commands from the cloud-init script above.
Step 2: Configure Network Security Group
The VM needs to accept connections from the Azure Function on port 3100.
What you're doing: Allowing the Azure Function to push logs to Loki, and opening Grafana for your access.
# Get the NSG name
NSG_NAME=$(az vm show \
--name $VM_NAME \
--resource-group $RESOURCE_GROUP \
--query "networkProfile.networkInterfaces[0].id" -o tsv \
| xargs -I {} az network nic show --ids {} \
--query "networkSecurityGroup.id" -o tsv \
| xargs -I {} basename {})
# Allow Loki port (needed for Azure Function to push logs)
az network nsg rule create \
--resource-group $RESOURCE_GROUP \
--nsg-name $NSG_NAME \
--name AllowLoki \
--priority 1010 \
--destination-port-ranges 3100 \
--protocol Tcp \
--access Allow \
--output none
# Allow Grafana port for your access
az network nsg rule create \
--resource-group $RESOURCE_GROUP \
--nsg-name $NSG_NAME \
--name AllowGrafana \
--priority 1020 \
--destination-port-ranges 3000 \
--protocol Tcp \
--access Allow \
--output none
Security Note: For production, consider:
- Placing the Function and VM in the same VNet with private endpoints
- Restricting Grafana access to your IP range using --source-address-prefixes
- Adding an Application Gateway with authentication in front of Grafana
Step 3: Create the Event Hub Namespace
Create an Event Hub Namespace to receive logs from the Log Analytics Data Export Rule. You do not need to create individual Event Hubs — they are auto-created by the export rule.
What you're doing: Creating the message bus that bridges Log Analytics to the Azure Function.
EVENTHUB_NS="sftpgw-loki-ehns"
az eventhubs namespace create \
--name $EVENTHUB_NS \
--resource-group $RESOURCE_GROUP \
--location $LOCATION \
--sku Standard \
--output none
echo "Event Hub Namespace created: $EVENTHUB_NS"
Step 4: Create the Log Analytics Data Export Rule
Configure Log Analytics to stream your SFTP Gateway custom tables to Event Hub in real-time.
What you're doing: Setting up continuous export of log data so new entries are automatically forwarded.
# Your existing Log Analytics Workspace name and resource group
LAW_NAME="<YOUR-LOG-ANALYTICS-WORKSPACE-NAME>"
LAW_RG="<YOUR-LAW-RESOURCE-GROUP>"
# Get the Event Hub Namespace resource ID
EVENTHUB_NS_ID=$(az eventhubs namespace show \
--name $EVENTHUB_NS \
--resource-group $RESOURCE_GROUP \
--query id -o tsv)
# Create export rule for SFTP Gateway log tables
az monitor log-analytics workspace data-export create \
--resource-group $LAW_RG \
--workspace-name $LAW_NAME \
--name "sftpgw-to-eventhub" \
--tables SFTPGWAudit_CL SFTPGWApplication_CL \
--destination "$EVENTHUB_NS_ID" \
--output none
echo "Data export rule created — logs will stream to Event Hub"
echo "Event Hubs will be auto-created: am-sftpgwaudit-cl, am-sftpgwapplication-cl"
Note: It may take up to 30 minutes for the first logs to appear in Event Hub after creating the export rule.
Step 5: Create the Azure Function
Create a Python Azure Function that reads from Event Hub and forwards logs to Loki. The function extracts the VM name from each log record's Computer field.
What you're doing: Creating a serverless function that transforms Log Analytics export data into Loki's format.
Create the following directory structure:
function-app/
├── host.json
├── requirements.txt
├── eventhub_to_loki/
│ ├── __init__.py
│ └── function.json
└── eventhub_app_to_loki/
├── __init__.py
└── function.json
host.json:
{
"version": "2.0",
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[4.*, 5.0.0)"
},
"logging": {
"logLevel": {
"default": "Information"
}
}
}
requirements.txt:
azure-functions
eventhub_to_loki/__init__.py (audit log forwarder):
import json
import logging
import os
import typing
from datetime import datetime, timezone
from urllib.request import Request, urlopen
import azure.functions as func
LOKI_URL = os.environ.get("LOKI_URL", "http://localhost:3100")
def main(events: typing.List[func.EventHubEvent]):
"""Forward SFTP Gateway audit logs from Event Hub to Loki."""
streams_by_label = {}
for event in events:
try:
body = event.get_body().decode("utf-8")
wrapper = json.loads(body)
except Exception as e:
logging.warning("Failed to parse event body: %s", e)
continue
for record in wrapper.get("records", []):
raw_data = record.get("RawData", "")
if not raw_data:
continue
vm_name = record.get("Computer", "unknown")
table_name = record.get("Type", "unknown")
log_type = "audit" if "audit" in table_name.lower() else "application"
# Build label key for grouping
label_key = (vm_name, log_type)
if label_key not in streams_by_label:
streams_by_label[label_key] = []
# Convert TimeGenerated to nanosecond timestamp
time_str = record.get("TimeGenerated", "")
try:
dt = datetime.fromisoformat(time_str.replace("Z", "+00:00"))
ts_ns = str(int(dt.timestamp() * 1e9))
except Exception:
ts_ns = str(int(datetime.now(timezone.utc).timestamp() * 1e9))
streams_by_label[label_key].append([ts_ns, raw_data])
if not streams_by_label:
return
# Build Loki payload
streams = []
for (vm_name, log_type), values in streams_by_label.items():
streams.append({
"stream": {
"job": "sftpgw",
"vm_name": vm_name,
"log_type": log_type,
"source": "azure-monitor",
},
"values": values,
})
payload = json.dumps({"streams": streams}).encode("utf-8")
url = f"{LOKI_URL}/loki/api/v1/push"
req = Request(url, data=payload, method="POST")
req.add_header("Content-Type", "application/json")
try:
with urlopen(req, timeout=10) as resp:
total = sum(len(v) for v in streams_by_label.values())
logging.info("Sent %d events to Loki, status: %s", total, resp.status)
except Exception as e:
logging.error("Error sending to Loki: %s", e)
raise
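Before deploying, you can sanity-check the timestamp conversion and stream construction locally with a fabricated record — no Event Hub or Loki required (the field values below are made up):

```python
import json
from datetime import datetime

# A fabricated export record, matching the fields the forwarder reads.
record = {
    "TimeGenerated": "2024-01-01T00:00:00Z",
    "Computer": "sftpgw-000001",
    "Type": "SFTPGWAudit_CL",
    "RawData": '{"event":"SFTP_FILE_UPLOAD_COMPLETE"}',
}

# Same conversion the function applies: ISO-8601 -> nanosecond epoch string.
dt = datetime.fromisoformat(record["TimeGenerated"].replace("Z", "+00:00"))
ts_ns = str(int(dt.timestamp() * 1e9))

# Same grouping: one Loki stream per (vm_name, log_type) pair.
stream = {
    "stream": {
        "job": "sftpgw",
        "vm_name": record["Computer"],
        "log_type": "audit" if "audit" in record["Type"].lower() else "application",
        "source": "azure-monitor",
    },
    "values": [[ts_ns, record["RawData"]]],
}
print(json.dumps({"streams": [stream]}, indent=2))
```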
eventhub_to_loki/function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"type": "eventHubTrigger",
"name": "events",
"direction": "in",
"eventHubName": "am-sftpgwaudit-cl",
"connection": "EVENTHUB_CONNECTION",
"cardinality": "many",
"consumerGroup": "$Default",
"dataType": "string"
}
]
}
eventhub_app_to_loki/__init__.py:
This function is identical to the audit forwarder above — copy the same __init__.py file.
eventhub_app_to_loki/function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"type": "eventHubTrigger",
"name": "events",
"direction": "in",
"eventHubName": "am-sftpgwapplication-cl",
"connection": "EVENTHUB_CONNECTION",
"cardinality": "many",
"consumerGroup": "$Default",
"dataType": "string"
}
]
}
Step 6: Deploy the Azure Function
What you're doing: Creating the Function App, configuring it with the Event Hub connection and Loki URL, and deploying the code.
FUNCTION_APP="sftpgw-loki-forwarder"
FUNC_STORAGE="sftpgwfunc$(openssl rand -hex 3)"
# Create storage account for the Function App
az storage account create \
--name $FUNC_STORAGE \
--resource-group $RESOURCE_GROUP \
--location $LOCATION \
--sku Standard_LRS \
--kind StorageV2 \
--output none
# Create the Function App
az functionapp create \
--name $FUNCTION_APP \
--resource-group $RESOURCE_GROUP \
--storage-account $FUNC_STORAGE \
--consumption-plan-location $LOCATION \
--runtime python \
--runtime-version 3.11 \
--functions-version 4 \
--os-type Linux \
--output none
# Get Event Hub connection string
EVENTHUB_CONN=$(az eventhubs namespace authorization-rule keys list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUB_NS \
--name RootManageSharedAccessKey \
--query primaryConnectionString -o tsv)
# Configure app settings
az functionapp config appsettings set \
--name $FUNCTION_APP \
--resource-group $RESOURCE_GROUP \
--settings \
"LOKI_URL=http://${OBS_VM_IP}:3100" \
"EVENTHUB_CONNECTION=$EVENTHUB_CONN" \
--output none
# Deploy the function code
cd function-app
func azure functionapp publish $FUNCTION_APP --python
Verification
Step 7: Verify Logs Are Flowing
What you're doing: Confirming that logs are being forwarded from Log Analytics through Event Hub to Loki.
Generate Test Activity
Trigger some SFTP activity on your gateway (connect, upload a file, disconnect). This creates new log entries that flow through the pipeline.
# Example: upload test files via SFTP
for i in {1..5}; do
echo "Test upload $i - $(date)" > /tmp/test-$i.txt
sftp -o StrictHostKeyChecking=no user@your-sftp-gateway.example.com <<EOF
put /tmp/test-$i.txt
quit
EOF
sleep 2
done
Check Loki Labels
# Query Loki to verify labels are present
curl -s "http://${OBS_VM_IP}:3100/loki/api/v1/labels" | python3 -m json.tool
Expected output includes labels like job, vm_name, log_type, source.
Query Logs via Loki API
# Query logs for the sftpgw job
curl -s "http://${OBS_VM_IP}:3100/loki/api/v1/query_range" \
--data-urlencode 'query={job="sftpgw"}' \
--data-urlencode "start=$(date -d '1 hour ago' +%s 2>/dev/null || date -v-1H +%s)" \
--data-urlencode "end=$(date +%s)" \
--data-urlencode "limit=5" | python3 -m json.tool
Access Grafana
- Open your browser to http://<OBS-VM-IP>:3000
- Log in with username admin and password admin (or the password you configured)
- Go to Explore (compass icon in the left sidebar)
- Select Loki as the data source
- Run a query: {job="sftpgw"}
You should see your SFTP Gateway logs with vm_name labels.
Using Grafana
Querying Azure Monitor-Sourced Logs
Logs forwarded by the Azure Function have these labels:
| Label | Description | Example |
|---|---|---|
| job | Fixed identifier for SFTP Gateway | sftpgw |
| vm_name | VM hostname (for HA tracking) | sftpgw-000000 |
| log_type | Log category | audit or application |
| source | Ingestion path identifier | azure-monitor |
Example Queries
View all logs from all SFTP Gateway VMs:
{job="sftpgw"}
View logs from a specific VM:
{job="sftpgw", vm_name="sftpgw-000000"}
Search for a filename across all VMs:
{job="sftpgw"} |= "invoice.pdf"
View only audit logs:
{job="sftpgw", log_type="audit"}
Find file uploads:
{job="sftpgw"} |= "SFTP_FILE_UPLOAD_COMPLETE"
Count uploads per VM (verify load balancing):
sum by (vm_name) (
count_over_time({job="sftpgw"} |= "SFTP_FILE_UPLOAD_COMPLETE" [1h])
)
Find failed authentication attempts:
{job="sftpgw"} |= "USERAUTH_FAILURE"
Extract fields from JSON logs:
{job="sftpgw"} |= "SFTP_FILE_UPLOAD_COMPLETE" | json | line_format "{{.username}} uploaded {{.file_name}}"
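The same data can be pulled programmatically from Loki's HTTP API, which is handy for scripted HA checks. A sketch of an "uploads per VM" counter — the helper names and the placeholder URL are illustrative, and it assumes the SFTP_FILE_UPLOAD_COMPLETE marker used in the queries above:

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

def query_range_url(base: str, logql: str, minutes: int = 60, limit: int = 100) -> str:
    """Build a Loki /query_range URL covering the last `minutes` of logs."""
    now = int(time.time())
    params = urlencode({
        "query": logql,
        "start": now - minutes * 60,
        "end": now,
        "limit": limit,
    })
    return f"{base}/loki/api/v1/query_range?{params}"

def uploads_by_vm(base: str) -> dict:
    """Count upload events per vm_name label from a raw log query."""
    url = query_range_url(base, '{job="sftpgw"} |= "SFTP_FILE_UPLOAD_COMPLETE"')
    with urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    counts = {}
    # Each result entry is one stream: its labels plus a list of log lines.
    for stream in data["data"]["result"]:
        vm = stream["stream"].get("vm_name", "unknown")
        counts[vm] = counts.get(vm, 0) + len(stream["values"])
    return counts

# Usage: uploads_by_vm("http://<OBS-VM-IP>:3100")
```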
Creating an HA Dashboard
For HA deployments, create a dashboard that shows upload distribution across VMs.
- In Grafana, go to Dashboards → New → Import
- Paste the following dashboard JSON:
{
"uid": "sftp-ha-overview",
"title": "SFTP Gateway HA Overview",
"tags": ["sftp", "ha", "azure-monitor"],
"timezone": "browser",
"schemaVersion": 39,
"refresh": "10s",
"time": {
"from": "now-1h",
"to": "now"
},
"panels": [
{
"id": 1,
"title": "Uploads by VM",
"description": "Shows how uploads are distributed across HA instances",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "sum by (vm_name) (count_over_time({job=\"sftpgw\"} |= \"SFTP_FILE_UPLOAD_COMPLETE\" [1m]))",
"legendFormat": "{{vm_name}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"unit": "short"
}
},
"options": {
"legend": { "displayMode": "list", "placement": "bottom" }
}
},
{
"id": 2,
"title": "Total Uploads by VM",
"description": "Pie chart showing upload distribution",
"type": "piechart",
"gridPos": { "h": 8, "w": 6, "x": 12, "y": 0 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "sum by (vm_name) (count_over_time({job=\"sftpgw\"} |= \"SFTP_FILE_UPLOAD_COMPLETE\" [$__range]))",
"legendFormat": "{{vm_name}}",
"refId": "A"
}
],
"options": {
"legend": {
"displayMode": "table",
"placement": "right",
"values": ["value", "percent"]
}
}
},
{
"id": 3,
"title": "Active VMs",
"type": "stat",
"gridPos": { "h": 4, "w": 6, "x": 18, "y": 0 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "count(sum by (vm_name) (count_over_time({job=\"sftpgw\"} [$__range])))",
"legendFormat": "VMs",
"refId": "A"
}
],
"options": {
"colorMode": "value",
"graphMode": "none"
}
},
{
"id": 4,
"title": "Total Events",
"type": "stat",
"gridPos": { "h": 4, "w": 6, "x": 18, "y": 4 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "sum(count_over_time({job=\"sftpgw\"} [$__range]))",
"legendFormat": "Events",
"refId": "A"
}
],
"options": {
"colorMode": "value",
"graphMode": "area"
}
},
{
"id": 5,
"title": "Log Volume by VM",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 8 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "sum by (vm_name) (count_over_time({job=\"sftpgw\"} [1m]))",
"legendFormat": "{{vm_name}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"unit": "short"
}
}
},
{
"id": 6,
"title": "Auth Failures by VM",
"type": "timeseries",
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 8 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "sum by (vm_name) (count_over_time({job=\"sftpgw\"} |= \"USERAUTH_FAILURE\" [5m]))",
"legendFormat": "{{vm_name}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"color": { "fixedColor": "red", "mode": "fixed" }
}
}
},
{
"id": 7,
"title": "Recent Activity Logs",
"type": "logs",
"gridPos": { "h": 10, "w": 24, "x": 0, "y": 16 },
"datasource": { "type": "loki", "uid": "loki" },
"targets": [
{
"expr": "{job=\"sftpgw\"}",
"refId": "A"
}
],
"options": {
"showTime": true,
"showLabels": true,
"showCommonLabels": false,
"wrapLogMessage": true,
"prettifyLogMessage": false,
"enableLogDetails": true,
"dedupStrategy": "none",
"sortOrder": "Descending"
}
}
]
}
- Click Load then Import
Once imported, your dashboard will show real-time metrics across all SFTP Gateway VMs.
Troubleshooting
No Logs Arriving in Event Hub
Symptom: Event Hub shows no incoming messages.
Check: Verify the Data Export Rule is active:
az monitor log-analytics workspace data-export list \
--resource-group $LAW_RG \
--workspace-name $LAW_NAME \
--output table
Common issues:
- Export rules can take up to 30 minutes to start flowing after creation
- The custom table names must match exactly (case-sensitive)
- The Event Hub Namespace must be in the same region as the Log Analytics Workspace
Function Not Triggering
Symptom: No function invocations in Azure Monitor.
Check: Verify the Event Hub connection string is set correctly:
az functionapp config appsettings list \
--name $FUNCTION_APP \
--resource-group $RESOURCE_GROUP \
--query "[?name=='EVENTHUB_CONNECTION'].value" -o tsv
Solution: Ensure the connection string has Listen permissions and the Event Hub names in function.json match the auto-created hubs (am-sftpgwaudit-cl, am-sftpgwapplication-cl).
Function Errors
Symptom: Function invocations fail with errors.
Check: View function logs:
az monitor app-insights query \
--app $FUNCTION_APP \
--resource-group $RESOURCE_GROUP \
--analytics-query "traces | where timestamp > ago(10m) | order by timestamp desc | take 20"
Common issues:
- connection refused — Loki isn't running or the NSG doesn't allow port 3100
- timeout — the VM is stopped or unreachable
- URLError — check that the LOKI_URL app setting is correct
No Logs in Loki
Symptom: Function succeeds but Loki shows no data.
Check: Test Loki is accepting pushes:
curl -v -X POST "http://<OBS-VM-IP>:3100/loki/api/v1/push" \
-H "Content-Type: application/json" \
-d '{"streams":[{"stream":{"test":"true"},"values":[["'$(date +%s)000000000'","test log"]]}]}'
Solution: Ensure Loki is running (docker ps on the VM) and binding to 0.0.0.0:3100 (not 127.0.0.1).
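If you'd rather test from Python than curl, the following sketch builds the same minimal push payload (the function names are illustrative; Loki expects nanosecond-precision timestamps as strings and returns HTTP 204 on a successful push):

```python
import json
import time
from urllib.request import Request, urlopen

def build_payload(line: str) -> bytes:
    """Build a minimal Loki push payload with a single fabricated log line."""
    return json.dumps({
        "streams": [{
            "stream": {"test": "true"},
            "values": [[str(int(time.time() * 1e9)), line]],
        }]
    }).encode("utf-8")

def push_test_line(loki_url: str) -> int:
    """POST one test line to Loki's push API; returns the HTTP status."""
    req = Request(f"{loki_url}/loki/api/v1/push",
                  data=build_payload("test log"), method="POST")
    req.add_header("Content-Type", "application/json")
    with urlopen(req, timeout=10) as resp:
        return resp.status

# Usage: push_test_line("http://<OBS-VM-IP>:3100")  # expect 204
```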
VM Name Shows as "unknown"
Symptom: The vm_name label is "unknown" instead of the actual VM name.
Check: Verify that the Computer field is present in your Log Analytics custom table records. Run a query in Log Analytics:
SFTPGWAudit_CL
| project Computer, RawData
| take 5
Solution: The Azure Monitor Agent should automatically populate the Computer field. If it's missing, verify the agent is properly installed and configured on your SFTP Gateway VMs.
VM IP Changed After Restart
Symptom: Function can't reach Loki after VM restart.
Solution: Update the Function App setting:
NEW_IP=$(az vm list-ip-addresses \
--name $VM_NAME \
--resource-group $RESOURCE_GROUP \
--query "[0].virtualMachine.network.publicIpAddresses[0].ipAddress" \
--output tsv)
az functionapp config appsettings set \
--name $FUNCTION_APP \
--resource-group $RESOURCE_GROUP \
--settings "LOKI_URL=http://${NEW_IP}:3100" \
--output none
Better solution: Assign a static public IP to the observability VM:
az network public-ip update \
--name "${VM_NAME}PublicIP" \
--resource-group $RESOURCE_GROUP \
--allocation-method Static
Production Considerations
Use a Static IP
Assign a static public IP to the observability VM so the address doesn't change on restart:
az network public-ip update \
--name "${VM_NAME}PublicIP" \
--resource-group $RESOURCE_GROUP \
--allocation-method Static
Use Private Networking
For production, place the Function App and observability VM in the same VNet to avoid exposing Loki to the public internet:
- Create a VNet integration for the Function App
- Place the observability VM in the same VNet
- Remove the public NSG rule for port 3100
- Update LOKI_URL to use the VM's private IP
Enable HTTPS for Grafana
Add an nginx reverse proxy with an SSL certificate for secure Grafana access.
Log Retention
Configure Loki retention to manage storage:
# In loki-config.yaml
limits_config:
retention_period: 744h # Keep logs for 31 days
Monitoring the Function
Set up alerts for function failures:
# Create an alert rule for function errors
az monitor metrics alert create \
--name "SFTPGWLokiForwarder-Errors" \
--resource-group $RESOURCE_GROUP \
--scopes $(az functionapp show --name $FUNCTION_APP --resource-group $RESOURCE_GROUP --query id -o tsv) \
  --condition "total FunctionExecutionCount > 0 where Result includes 'Failed'" \
--window-size 5m \
--evaluation-frequency 5m \
--action "<YOUR-ACTION-GROUP-ID>"
Cleanup
To remove the Azure Monitor integration:
# Delete the data export rule
az monitor log-analytics workspace data-export delete \
--resource-group $LAW_RG \
--workspace-name $LAW_NAME \
--name "sftpgw-to-eventhub" \
--yes
# Delete the resource group (removes Function App, Event Hub, VM, and all resources)
az group delete --name $RESOURCE_GROUP --yes --no-wait
Summary
You've now set up an Azure Monitor → Loki integration that:
- Forwards SFTP Gateway logs from Log Analytics to Loki via Event Hub in real-time
- Extracts VM names from log records for HA visibility
- Preserves your existing Log Analytics logging setup
- Enables cross-instance search and visualization in Grafana
- Shows which VM handled each upload for HA verification
- Scales to multiple SFTP Gateway deployments
For questions or issues, contact support or refer to the Grafana Loki documentation.