Akshat1508/microservices-performance-analysis

Microservices Performance Analysis 📊

A hands-on distributed systems benchmarking project that evaluates how a production-grade Kubernetes microservices deployment behaves under load. It measures CPU, memory, and network performance across single-node, multi-node, and multi-cloud architectures, and validates an edge topology.


🧭 Project Overview

This project progressively scales a microservices workload from a single Azure VM to a multi-node Kubernetes cluster, then to a multi-cloud topology spanning Azure and GCP, and finally to an edge deployment where the frontend is pinned to an external node that joins over the public internet. Phases 1 to 3 are benchmarked with Prometheus, Grafana, and Locust; Phase 4 validates node joining and frontend placement without load testing.

Load Levels Tested: Low (10 users) · Medium (50 users) · High (200 users)


🏗️ Architecture

Phase 1: Single-Node Deployment (Azure)

  • Single Ubuntu VM acting as both Kubernetes control plane and worker node
  • All 11 microservices scheduled on one machine
  • Baseline metrics captured; cross-node network traffic = 0 (all intra-host)
  • CPU spikes significantly under high load with no headroom

Phase 2: Multi-Node Deployment (Azure)

  • Second Azure VM (worker-node-1) joined to the cluster via K3s agent
  • Kubernetes Scheduler redistributed pods across both nodes automatically
  • CPU pressure on primary node dropped significantly
  • Inter-node network traffic (private VNet) confirmed via Grafana node exporter
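
The CPU and network observations above come from Grafana dashboards backed by Prometheus. As a rough sketch of how comparable numbers could be pulled directly, the snippet below builds instant-query URLs for the Prometheus HTTP API. The base URL `http://localhost:9090` is an assumption (for example, a port-forward to the Prometheus service), and the PromQL expressions are illustrative queries over standard node exporter metrics, not the exact panels used in this project.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable here, e.g. through a kubectl port-forward.
PROMETHEUS_URL = "http://localhost:9090"

# Illustrative PromQL over node exporter metrics (not the project's dashboards).
QUERIES = {
    # Share of non-idle CPU time per node, averaged over the last 5 minutes.
    "cpu_busy_percent": (
        '100 * (1 - avg by (instance) '
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])))'
    ),
    # Bytes per second received per node, ignoring the loopback interface.
    "network_rx_bytes_per_s": (
        'sum by (instance) '
        '(rate(node_network_receive_bytes_total{device!="lo"}[5m]))'
    ),
}


def instant_query_url(promql, base_url=PROMETHEUS_URL):
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


for name, promql in QUERIES.items():
    print(name, instant_query_url(promql))
```

The network query counts all non-loopback interfaces on a node, so it only approximates inter-node traffic; comparing it between Phase 1 and Phase 2 is what makes the difference visible.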

Phase 3: Multi-Cloud Deployment (Azure + GCP)

  • K3s cluster spans two cloud providers over the public WAN
  • Manager node on Azure; worker node on Google Cloud Platform (GCP)
  • Cross-cloud pod communication tunneled via Flannel VXLAN overlay network
  • Frontend pinned to GCP node via Kubernetes NodeSelector to simulate edge serving

Phase 4: Edge Deployment (External Node over Public Internet)

  • A third node (edge-node) run by a collaborator joined the cluster over the public internet
  • No VPN or private network: the edge node connected directly to the manager's public IP on port 6443
  • Frontend deployment patched with a NodeSelector to run exclusively on edge-node
  • Kubernetes automatically terminated frontend pods on internal nodes and rescheduled them on the edge
  • Demonstrates real-world CDN-style edge serving using only K3s and Kubernetes primitives

Why the frontend runs multiple replicas (and why that matters on the edge):

Unlike the backend services, which each run as a single pod, the Google Boutique demo configures the frontend with multiple replicas by design. Pinning these replicas to the edge node gives two concrete benefits:

  • Load Balancing: Incoming traffic is distributed across all frontend replicas. Under high concurrency, no single pod bears the full load, which reduces the risk of crashes at the user-facing entry point.
  • High Availability: If one frontend pod dies unexpectedly, the remaining replicas continue serving requests while Kubernetes recreates the failed one, with no manual intervention. This covers pod failures only: because every replica is pinned to the same node, an outage of edge-node itself still takes the frontend offline.
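
The two benefits can be illustrated with a toy simulation. This is only a sketch: it picks a replica at random for each request, which is a simplification rather than a model of how Kubernetes Services actually route traffic, and the pod names are hypothetical.

```python
import random
from collections import Counter


def spread_requests(request_count, replicas, seed=42):
    """Assign each request to a randomly chosen replica and count the totals."""
    if not replicas:
        raise RuntimeError("no healthy replicas left to serve traffic")
    rng = random.Random(seed)
    return Counter(rng.choice(replicas) for _ in range(request_count))


# Hypothetical pod names for three frontend replicas.
replicas = ["frontend-a", "frontend-b", "frontend-c"]
print(spread_requests(3000, replicas))

# Simulate one pod failing: the remaining replicas absorb all traffic.
survivors = [name for name in replicas if name != "frontend-b"]
print(spread_requests(3000, survivors))
```

With three replicas each pod handles roughly a third of the requests; after one fails, the other two still serve every request, only with a larger share each.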

🛠️ Tech Stack

| Tool | Role |
| --- | --- |
| K3s | Lightweight, production-grade Kubernetes distribution |
| Helm | Kubernetes package manager for deploying third-party charts |
| Google Boutique Demo | 11-microservice app in Go, Python, C#, Node.js over gRPC |
| Prometheus | Time-series metrics scraping (CPU, memory, network) |
| Grafana | Dashboard visualization connected to Prometheus |
| Locust | Python-based distributed load testing tool |
| Azure | Primary cloud: manager node host (Phases 1–4) |
| GCP | Secondary cloud: worker node host (Phase 3) |
| External VM | Collaborator machine: edge node host (Phase 4) |

📁 Repository Structure

microservices-performance-analysis/
├── README.md
├── locustfile.py
├── locust-reports/
│   ├── Locust_Single_Node_Low_Load.html
│   ├── Locust_Single_Node_Medium_Load.html
│   ├── Locust_Single_Node_High_Load.html
│   ├── Locust_Multi_Node_Low_Load.html
│   ├── Locust_Multi_Node_Medium_Load.html
│   ├── Locust_Multi_Node_High_Load.html
│   ├── Locust_Multi_Cloud_Low_Load.html
│   ├── Locust_Multi_Cloud_Medium_Load.html
│   └── Locust_Multi_Cloud_High_Load.html
└── images/
    └── benchmarks/
        ├── single-low-*.png             (6 files)
        ├── single-medium-*.png          (6 files)
        ├── single-high-*.png            (6 files)
        ├── multi-low-*.png              (6 files)
        ├── multi-medium-*.png           (6 files)
        ├── multi-high-*.png             (6 files)
        ├── Multicloud-Low-*.png         (5 files)
        ├── Multicloud-Medium-*.png      (5 files)
        ├── Multicloud-High-*.png        (5 files)
        ├── edge-nodes-registered.png    (kubectl get nodes showing all 3 nodes)
        └── edge-pods-distribution.png   (kubectl get pods -o wide showing frontend on edge-node)

🚀 Setup & Reproduction

Prerequisites: Azure + GCP accounts · Windows machine with PowerShell · SSH keys for each cloud

Step 1: SSH into Your Azure VM (Windows PowerShell)

Windows OpenSSH refuses to use a private key file that other users or groups can read. Run PowerShell as Administrator and restrict the key to your own account before connecting:

$KeyPath = "$env:USERPROFILE\Downloads\azure-key.pem"
icacls $KeyPath /inheritance:r
icacls $KeyPath /grant:r "$($env:USERNAME):R"
ssh -i $KeyPath azureuser@<MANAGER_PUBLIC_IP>

Step 2: Install K3s (on the Azure VM)

curl -sfL https://get.k3s.io | sh -
sudo k3s kubectl get nodes

Step 3: Deploy the Boutique App (11 Microservices)

sudo k3s kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/main/release/kubernetes-manifests.yaml
sudo k3s kubectl get pods -o wide --watch

Step 4: Install Helm & Deploy the Observability Stack

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

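helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
sudo k3s kubectl create namespace monitoring

# K3s has no "helm" subcommand; give the standalone helm binary a kubeconfig instead
mkdir -p ~/.kube
sudo k3s kubectl config view --raw > ~/.kube/config
chmod 600 ~/.kube/config
helm install observability prometheus-community/kube-prometheus-stack --namespace monitoring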

Retrieve the Grafana admin password:

sudo k3s kubectl get secret --namespace monitoring observability-grafana \
  -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

Username defaults to admin
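
Kubernetes stores Secret values base64-encoded, which is why the command above pipes the `admin-password` field through `base64 --decode`. A minimal sketch of the same decoding step in Python follows; the encoded value is a made-up example, not a real credential.

```python
import base64


def decode_secret_value(encoded):
    """Decode one base64-encoded value from a Kubernetes Secret's .data map."""
    return base64.b64decode(encoded).decode("utf-8")


# Made-up example value, encoded the way it would appear under .data
example_encoded = base64.b64encode(b"example-password").decode("ascii")
print(example_encoded)
print(decode_secret_value(example_encoded))
```

Base64 is an encoding, not encryption, so anyone who can read the Secret can recover the password.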

Step 5: Expose Services & Tunnel Locally

On the Azure VM:

sudo k3s kubectl port-forward svc/frontend 8080:80 --address 0.0.0.0 &
sudo k3s kubectl port-forward svc/observability-grafana 3000:80 -n monitoring --address 0.0.0.0 &

On your local Windows machine:

ssh -i "$env:USERPROFILE\Downloads\azure-key.pem" `
    -L 8080:localhost:8080 `
    -L 3000:localhost:3000 `
    -L 8089:localhost:8089 `
    azureuser@<MANAGER_PUBLIC_IP>

| Service | Local URL |
| --- | --- |
| Frontend UI | http://localhost:8080 |
| Grafana | http://localhost:3000 |
| Locust UI | http://localhost:8089 |

Step 6: Run Load Tests

sudo apt-get install -y python3-pip
pip3 install locust
locust -f locustfile.py

Headless CLI:

locust -f locustfile.py --headless -u 10 -r 2 --run-time 3m --host=http://localhost:8080 --csv=benchmark_low_load
locust -f locustfile.py --headless -u 50 -r 5 --run-time 3m --host=http://localhost:8080 --csv=benchmark_medium_load
locust -f locustfile.py --headless -u 200 -r 10 --run-time 3m --host=http://localhost:8080 --csv=benchmark_high_stress
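
The three headless commands differ only in user count, spawn rate, and CSV prefix. As a sketch of how they could be generated from one table of load profiles, the snippet below rebuilds the same command lines; the `LOAD_PROFILES` name and the helper function are illustrative, not part of this repository.

```python
import shlex

# Users and spawn rates taken from the headless commands above.
LOAD_PROFILES = {
    "low_load": {"users": 10, "spawn_rate": 2},
    "medium_load": {"users": 50, "spawn_rate": 5},
    "high_stress": {"users": 200, "spawn_rate": 10},
}


def locust_command(name, users, spawn_rate,
                   host="http://localhost:8080", run_time="3m"):
    """Return the argument list for one headless Locust run."""
    return [
        "locust", "-f", "locustfile.py", "--headless",
        "-u", str(users), "-r", str(spawn_rate),
        "--run-time", run_time,
        f"--host={host}",
        f"--csv=benchmark_{name}",
    ]


for name, profile in LOAD_PROFILES.items():
    print(shlex.join(locust_command(name, **profile)))
```

Each argument list could be passed to `subprocess.run` to execute the runs back to back, which keeps the run time and host identical across load levels.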

📈 Scaling to Multi-Node (Phase 2)

# On manager: get join token
sudo cat /var/lib/rancher/k3s/server/node-token

# On worker-node-1: join the cluster
curl -sfL https://get.k3s.io | \
  K3S_URL=https://<MANAGER_PRIVATE_IP>:6443 \
  K3S_TOKEN=<COPIED_NODE_TOKEN> sh -

# On manager: verify and force rescheduling
sudo k3s kubectl get nodes
sudo k3s kubectl delete pods --all
sudo k3s kubectl get pods -o wide
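
Before deleting pods to force rescheduling, it helps to confirm that the new worker actually reports Ready. The sketch below reads the JSON form of the node list (`kubectl get nodes -o json`) and checks each node's `Ready` condition. The sample document is made up and heavily trimmed, and the node name `manager` is a placeholder.

```python
import json


def node_readiness(nodes_json):
    """Map each node name to True if its Ready condition has status "True"."""
    readiness = {}
    for item in json.loads(nodes_json)["items"]:
        conditions = item.get("status", {}).get("conditions", [])
        readiness[item["metadata"]["name"]] = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
    return readiness


# Made-up sample in the shape of `kubectl get nodes -o json` (trimmed).
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "manager"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "worker-node-1"},
         "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
    ]
})
print(node_readiness(SAMPLE))
```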

☁️ Multi-Cloud Setup (Phase 3: Azure + GCP)

Firewall Ports to Open (on both clouds)

| Port | Protocol | Purpose |
| --- | --- | --- |
| 6443 | TCP | K3s API Server |
| 10250 | TCP | Kubelet metrics |
| 8472 | UDP | Flannel VXLAN overlay (cross-cloud pod networking) |
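
A quick way to confirm that the TCP rules took effect is to attempt a connection from the other cloud. The helper below is a minimal sketch using only the standard library. It covers the two TCP ports; UDP 8472 cannot be verified this way, because a UDP port gives no connection handshake to test against.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# TCP ports from the table above.
TCP_PORTS = {6443: "K3s API Server", 10250: "Kubelet metrics"}
```

For example, running `tcp_port_open("<AZURE_MANAGER_PUBLIC_IP>", 6443)` on the GCP VM should return `True` once the inbound rule is in place.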

Join GCP Node to Azure Cluster

curl -sfL https://get.k3s.io | \
  K3S_URL=https://<AZURE_MANAGER_PUBLIC_IP>:6443 \
  K3S_TOKEN=<COPIED_NODE_TOKEN> sh -s - \
  --node-external-ip=<GCP_VM_PUBLIC_IP>

Pin Frontend to GCP Node

sudo k3s kubectl label nodes <GCP_NODE_NAME> node-role.kubernetes.io/edge=true
sudo k3s kubectl patch deployment frontend -p \
  '{"spec": {"template": {"spec": {"nodeSelector": {"node-role.kubernetes.io/edge": "true"}}}}}'
sudo k3s kubectl get pods -o wide
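
The patch argument is a nested JSON document that mirrors the Deployment structure down to the pod template's `nodeSelector`. As a sketch, the same payload can be built programmatically, which avoids quoting mistakes in the shell; the helper name is illustrative.

```python
import json


def node_selector_patch(label_key, label_value):
    """Build a Deployment patch that sets nodeSelector on the pod template."""
    return {
        "spec": {
            "template": {
                "spec": {
                    # nodeSelector values are strings, so "true" stays quoted.
                    "nodeSelector": {label_key: label_value}
                }
            }
        }
    }


patch = node_selector_patch("node-role.kubernetes.io/edge", "true")
print(json.dumps(patch))
```

The printed string matches the payload passed to `kubectl patch deployment frontend -p` above. Because the patch changes the pod template, the Deployment rolls out new pods, and the scheduler places them only on nodes carrying the label.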

🌐 Edge Deployment (Phase 4: External Node over Public Internet)

This phase connects a completely external node (run by a collaborator on a separate machine) to the cluster over the public internet, with no VPN and no private network.

Step 1: Open Firewall Port on Azure

In the Azure Portal, navigate to benchmark-vm → Networking → Add inbound port rule:

| Setting | Value |
| --- | --- |
| Destination port | 6443 |
| Protocol | TCP |
| Name | Allow-K3s-Edge |

Step 2: Get the Cluster Join Token (on manager)

sudo cat /var/lib/rancher/k3s/server/node-token

Step 3: Join the Edge Node (on the external/collaborator machine)

The collaborator SSHs into edge-node and runs:

curl -sfL https://get.k3s.io | \
  K3S_URL=https://<MANAGER_PUBLIC_IP>:6443 \
  K3S_TOKEN=<COPIED_NODE_TOKEN> sh -

Note: Unlike Phase 2, which used the manager's private IP, this step uses its public IP because the edge node sits entirely outside the cluster's network.

Step 4: Label the Edge Node (on manager)

# Verify edge-node appears as Ready
sudo k3s kubectl get nodes

# Tag it as the edge device
sudo k3s kubectl label nodes edge-node node-role.kubernetes.io/edge=true

Step 5: Pin Frontend to the Edge Node (on manager)

sudo k3s kubectl patch deployment frontend -p \
  '{"spec": {"template": {"spec": {"nodeSelector": {"node-role.kubernetes.io/edge": "true"}}}}}'

Step 6: Verify Frontend Migrated to Edge

sudo k3s kubectl get pods -o wide

All frontend-* pods should now show edge-node under the NODE column.
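
The same check can be automated instead of reading the NODE column by eye. The sketch below takes the JSON form of the pod list (`kubectl get pods -o json`) and reports any frontend pod that is not on the expected node. The sample data is made up and trimmed, and the node name `manager` is a placeholder.

```python
import json


def misplaced_pods(pods_json, name_prefix, expected_node):
    """Return names of pods with the given prefix that are not on expected_node."""
    return [
        pod["metadata"]["name"]
        for pod in json.loads(pods_json)["items"]
        if pod["metadata"]["name"].startswith(name_prefix)
        and pod.get("spec", {}).get("nodeName") != expected_node
    ]


# Made-up sample in the shape of `kubectl get pods -o json` (trimmed).
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "frontend-aaa"}, "spec": {"nodeName": "edge-node"}},
        {"metadata": {"name": "frontend-bbb"}, "spec": {"nodeName": "manager"}},
        {"metadata": {"name": "cartservice-ccc"}, "spec": {"nodeName": "manager"}},
    ]
})
print(misplaced_pods(SAMPLE, "frontend-", "edge-node"))
```

An empty list means every frontend pod has moved to the edge node. A pod that is still Pending has no `nodeName` yet and is reported as misplaced too.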


📊 Benchmark Results

Phase 1: Single-Node (Azure)

Load levels: Low (10 Users), Medium (50 Users), High (200 Users)

[Screenshots per load level: CPU Usage, CPU Quota, Memory Usage, Memory Quota, Network I/O, Locust Dashboard. Image files: images/benchmarks/single-low-*.png, single-medium-*.png, single-high-*.png]

Phase 2: Multi-Node (Azure)

Load levels: Low (10 Users), Medium (50 Users), High (200 Users)

[Screenshots per load level: CPU Usage, CPU Quota, Memory Usage, Memory Quota, Network I/O, Locust Dashboard. Image files: images/benchmarks/multi-low-*.png, multi-medium-*.png, multi-high-*.png]

Phase 3: Multi-Cloud (Azure + GCP)

Load levels: Low (10 Users), Medium (50 Users), High (200 Users)

[Screenshots per load level: CPU Usage, CPU Quota, Memory Usage, Memory Quota, Network I/O, Locust Dashboard. Image files: images/benchmarks/Multicloud-Low-*.png, Multicloud-Medium-*.png, Multicloud-High-*.png]

Phase 4: Edge Deployment

No load testing was performed in this phase. The goal was to validate cross-internet node joining and frontend pod migration to the edge node.

What Screenshot
All 3 nodes registered and Ready
Frontend pods running on edge-node

📄 Locust Reports

Full HTML reports are available in the locust-reports/ folder.

| Phase | Load | Report |
| --- | --- | --- |
| Single-Node | Low | Locust_Single_Node_Low_Load.html |
| Single-Node | Medium | Locust_Single_Node_Medium_Load.html |
| Single-Node | High | Locust_Single_Node_High_Load.html |
| Multi-Node | Low | Locust_Multi_Node_Low_Load.html |
| Multi-Node | Medium | Locust_Multi_Node_Medium_Load.html |
| Multi-Node | High | Locust_Multi_Node_High_Load.html |
| Multi-Cloud | Low | Locust_Multi_Cloud_Low_Load.html |
| Multi-Cloud | Medium | Locust_Multi_Cloud_Medium_Load.html |
| Multi-Cloud | High | Locust_Multi_Cloud_High_Load.html |

Note: GitHub doesn't render HTML files inline, so download them and open in a browser for the full interactive report.
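
The headless runs also write CSV files through `--csv` (for example `benchmark_low_load_stats.csv`), which are easier to compare across phases than HTML reports. The sketch below summarizes the aggregated row of such a file. The column names are assumed to match the header Locust writes to its stats CSV, and the sample rows are made up, not results from this project.

```python
import csv
import io


def summarize(stats_csv_text):
    """Summarize the "Aggregated" row of a Locust *_stats.csv file."""
    for row in csv.DictReader(io.StringIO(stats_csv_text)):
        if row["Name"] == "Aggregated":
            requests = int(row["Request Count"])
            failures = int(row["Failure Count"])
            return {
                "requests": requests,
                "failure_rate": failures / requests if requests else 0.0,
                "avg_response_ms": float(row["Average Response Time"]),
            }
    raise ValueError("no 'Aggregated' row found")


# Made-up sample with only the columns used above (assumed header names).
SAMPLE = """Type,Name,Request Count,Failure Count,Average Response Time
GET,/,100,1,42.0
,Aggregated,100,1,42.0
"""
print(summarize(SAMPLE))
```

Running this over the stats file from each phase and load level would give one comparable row per run.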


🔑 Key Findings

  • Horizontal scaling works: Adding a second node significantly reduced CPU pressure on the primary node under identical load
  • Multi-cloud is viable: Spanning the cluster across Azure and GCP over public WAN using Flannel VXLAN worked reliably, with inter-cloud pod communication confirmed via Grafana
  • Edge deployment works over raw internet: An external node joined the cluster over the public internet with zero private networking, using only a firewall port and a join token
  • Kubernetes NodeSelector is a powerful edge primitive: Pinning the frontend to edge-node caused Kubernetes to automatically migrate all frontend pods with no manual restarts
  • Replicas make the edge resilient: Running multiple frontend replicas on the edge node means traffic is load-balanced across all of them, and if any single pod fails, the others keep serving users while Kubernetes self-heals in the background
  • Observability is essential: Without Prometheus + Grafana, performance differences across phases would be invisible

🔧 Troubleshooting

Port already in use after reconnecting:

sudo fuser -k 8080/tcp
sudo fuser -k 3000/tcp

Deallocating VMs to stop billing: Azure Portal / GCP Console → Virtual Machines → select instance → Stop → confirm deallocated state


🗺️ Roadmap

  • Phase 1: Single-node Azure deployment
  • Phase 2: Multi-node Azure cluster
  • Phase 3: Multi-cloud deployment (Azure + GCP)
  • Phase 4: Edge topology β€” external node over public internet
