~/haas
research writing systems about

Status: Active

Building evaluation infrastructure for AI systems.
San Francisco.

email github twitter

The Homelab That Replaced My Cloud Bill

April 11, 2024 · 1 min read

I spent $2,000 on hardware that now handles workloads that would cost $500/month on AWS. The cloud is a tax on people who cannot be bothered to learn infrastructure.

#engineering #technical #personal-growth

The cloud is a tax on people who cannot be bothered to learn infrastructure.

I spent $2,000 on hardware that now handles workloads that would cost $500+ per month on AWS. That is a four-month payback period. Everything after that is pure savings, plus complete control over my data and zero surprise bills.
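The payback arithmetic can be sketched in a few lines. This is a minimal model, not a full cost analysis: the $2,000 and $500 figures come from this post, while `running_cost` (power, replacement parts) is a hypothetical parameter that the post does not quantify and that defaults to zero here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    hardware_cost: float       # one-time spend in USD (from the post)
    cloud_monthly_cost: float  # avoided cloud bill in USD/month (from the post)
    running_cost: float = 0.0  # USD/month for power, parts, etc. (assumption)

    def monthly_savings(self) -> float:
        return self.cloud_monthly_cost - self.running_cost

    def payback_months(self) -> float:
        savings = self.monthly_savings()
        if savings <= 0:
            raise ValueError("no payback: running costs meet or exceed the cloud bill")
        return self.hardware_cost / savings

    def net_position(self, months: int) -> float:
        """Cumulative savings minus the hardware spend after `months`."""
        return self.monthly_savings() * months - self.hardware_cost


model = CostModel(hardware_cost=2000, cloud_monthly_cost=500)
print(model.payback_months())  # 4.0
print(model.net_position(12))  # 4000.0
```

Setting `running_cost` to a realistic electricity figure stretches the payback period, so it is worth plugging in your own numbers before buying anything.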

The serverless crowd will tell you infrastructure is a distraction from "real work." They are wrong. Understanding infrastructure is a competitive advantage. It makes you a better engineer, saves you money, and eliminates entire categories of vendor lock-in.

This post is the technical blueprint for a homelab that handles enterprise-grade workloads at a fraction of the cloud cost.

## Core Architecture

### Network Controller & Gateway

The Dream Machine Pro Max serves as my core network controller and gateway. This is not overkill for home use. It is right-sized for someone who refuses to pay AWS rent forever. It is currently:

- Running multiple isolated VLANs for IoT, the lab environment, and production services
- Supporting container-based services with high throughput requirements
- Managing multiple site-to-site VPNs for remote access
- Handling IPS/IDS for network security monitoring
- Supporting redundant storage for security footage retention

The ability to handle 10G throughput became essential when I started running distributed storage systems and container orchestration across my lab environment.

### Network Segmentation

My current VLAN structure:

- VLAN 10: Management (network devices, controllers)
- VLAN 20: Lab Environment (Kubernetes, storage clusters)
- VLAN 30: IoT Devices
- VLAN 40: Media Streaming
- VLAN 50: Guest Network
- VLAN 60: Security Systems

Each VLAN has specific firewall rules and traffic policies to maintain security while allowing necessary inter-VLAN routing.
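One way to reason about inter-VLAN policy is as a default-deny allow-list. The sketch below models the six VLANs listed above. The specific allowed pairs are hypothetical, chosen only for illustration, and are not my actual firewall rules. It also covers only who may open a new connection. A stateful firewall handles return traffic separately.

```python
from enum import IntEnum


class Vlan(IntEnum):
    MANAGEMENT = 10
    LAB = 20
    IOT = 30
    MEDIA = 40
    GUEST = 50
    SECURITY = 60


# Hypothetical allow-list of (source, destination) pairs. Anything not listed
# is dropped. These pairs are illustrative, not the post's real rule set.
ALLOWED = {
    (Vlan.MANAGEMENT, Vlan.LAB),
    (Vlan.MANAGEMENT, Vlan.IOT),
    (Vlan.MANAGEMENT, Vlan.SECURITY),
    (Vlan.LAB, Vlan.MEDIA),
}


def is_allowed(src: Vlan, dst: Vlan) -> bool:
    """Default-deny inter-VLAN check; traffic inside one VLAN is always allowed."""
    return src == dst or (src, dst) in ALLOWED


print(is_allowed(Vlan.MANAGEMENT, Vlan.IOT))  # True
print(is_allowed(Vlan.IOT, Vlan.LAB))         # False
print(is_allowed(Vlan.GUEST, Vlan.GUEST))     # True
```

Writing the policy down as data first makes it easier to spot a rule that accidentally lets IoT or guest devices reach the lab.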

### Physical Infrastructure

The network backbone consists of:

- Layer 3 Pro Max 24 PoE switch handling inter-VLAN routing
- U7 Pro Max AP for high-density wireless coverage
- Redundant power supplies for critical infrastructure
- 10G fiber interconnects between core components

The PoE budget becomes crucial when you're running:

- Multiple security cameras with continuous recording
- Environmental sensors
- Access points
- IP phones
- Various IoT controllers
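A rough PoE budget check can be sketched as below. The per-port maximums are the standard IEEE 802.3 figures for power sourced at the switch. The 400 W budget and the device inventory are hypothetical placeholders, so substitute your switch's rated budget and your own device list. Sizing against per-standard maximums is deliberately pessimistic, since real devices usually draw less.

```python
# Maximum power a port sources under each IEEE 802.3 PoE standard (watts at the switch).
POE_MAX_WATTS = {"802.3af": 15.4, "802.3at": 30.0, "802.3bt-type3": 60.0}


def poe_headroom(budget_watts: float, devices: list[tuple[str, int]]) -> tuple[float, float]:
    """Return (worst-case draw in watts, headroom as a fraction of the budget)."""
    draw = sum(POE_MAX_WATTS[standard] * count for standard, count in devices)
    return draw, (budget_watts - draw) / budget_watts


# Hypothetical inventory: (PoE standard, device count).
inventory = [
    ("802.3af", 6),        # security cameras
    ("802.3af", 4),        # environmental sensors and IP phones
    ("802.3at", 2),        # access points
    ("802.3bt-type3", 1),  # one higher-draw device
]

draw, headroom = poe_headroom(400.0, inventory)
print(f"worst-case draw: {draw:.1f} W, headroom: {headroom:.1%}")
```

A negative headroom means the worst-case draw exceeds the budget, which is the situation to catch before plugging in one more camera.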

## Storage Architecture

The storage system is built around:

- UNVR with 4x 16TB drives in RAID 10
- Dedicated NAS for lab environment backups
- Edge caching for frequently accessed content

Current storage allocation:

- 30% Security footage (with 30-day retention)
- 40% Lab environment (VMs, containers, test data)
- 20% Media storage
- 10% System backups
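To make the allocation concrete, here is a small worked calculation. It assumes the percentages apply to the RAID 10 array alone, uses decimal terabytes, and ignores filesystem overhead. Those are simplifications of mine, not measured figures. The last function estimates the average write rate that a given capacity can absorb over a retention window.

```python
def raid10_usable_tb(drive_count: int, drive_tb: float) -> float:
    """RAID 10 mirrors pairs of drives, so usable capacity is half the raw total."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least four")
    return drive_count * drive_tb / 2


def max_sustained_mbps(capacity_tb: float, retention_days: int) -> float:
    """Average write rate (megabits/s) that fills capacity_tb in retention_days."""
    capacity_bits = capacity_tb * 1e12 * 8  # decimal terabytes to bits
    return capacity_bits / (retention_days * 86_400) / 1e6


usable = raid10_usable_tb(4, 16)  # 32.0 TB before filesystem overhead
footage_tb = usable * 0.30        # 30% share from the allocation list
print(f"{usable:.1f} TB usable, {footage_tb:.1f} TB for footage")
print(f"{max_sustained_mbps(footage_tb, 30):.1f} Mbit/s sustained across all cameras")
```

Under these assumptions the footage share works out to roughly 9.6 TB, or about 29.6 Mbit/s of continuous recording across all cameras for 30 days.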

## Network Services

Currently running these core services:

1. Network Monitoring:

   - Prometheus for metrics collection
   - Grafana for visualization
   - Custom alerting via webhook integration

2. Security:

   - IPS/IDS with custom rulesets
   - Network flow analysis
   - Automated threat detection

3. Lab Environment:

   - Kubernetes cluster for container orchestration
   - CI/CD pipeline for testing
   - Development environments
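For the webhook alerting piece, a minimal sketch of the receiving side might look like this. It assumes alerts are delivered in the JSON shape that Prometheus Alertmanager posts to webhook receivers (an `alerts` list whose items carry `status`, `labels`, and `annotations`). The alert names and the `summarize_alerts` helper are hypothetical, and this is not necessarily how my own integration is wired.

```python
import json


def summarize_alerts(payload: dict) -> list[str]:
    """Turn an Alertmanager-style webhook payload into one line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            "[{status}] {name} on {instance}: {summary}".format(
                status=alert.get("status", "unknown").upper(),
                name=labels.get("alertname", "unnamed"),
                instance=labels.get("instance", "n/a"),
                summary=annotations.get("summary", "no summary"),
            )
        )
    return lines


# Hypothetical payload, trimmed to the fields the function reads.
sample = json.loads("""
{
  "status": "firing",
  "alerts": [
    {"status": "firing",
     "labels": {"alertname": "HighPacketLoss", "instance": "gateway"},
     "annotations": {"summary": "packet loss above 5% for 10m"}},
    {"status": "resolved",
     "labels": {"alertname": "DiskAlmostFull", "instance": "nas"},
     "annotations": {}}
  ]
}
""")

for line in summarize_alerts(sample):
    print(line)
```

The function only formats text. Forwarding each line to chat, email, or a pager is left to whatever HTTP handler receives the POST.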

## Technical Challenges & Solutions

### Challenge 1: High-Density WiFi Coverage

- Initial deployment showed dead zones
- Solution: Added mesh networking with wireless uplink
- Result: Consistent coverage with seamless roaming

### Challenge 2: Power Management

- Initial PoE budget calculations were insufficient
- Solution: Implemented power scheduling and upgraded PSU
- Result: Stable power delivery with 25% headroom

### Challenge 3: Storage I/O

- Network recording created I/O bottlenecks
- Solution: Implemented edge caching and storage tiering
- Result: 70% reduction in main storage I/O
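The relationship between caching and main-storage I/O can be illustrated with a small read-through LRU cache. This is a generic sketch of the technique, not the caching layer I actually run. The `fetch` callback stands in for a read from main storage, and the fraction of requests served from the cache is the fraction of backend reads avoided.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """LRU cache that falls back to `fetch` on a miss and tracks hit statistics."""

    def __init__(self, capacity: int, fetch: Callable[[str], bytes]) -> None:
        self.capacity = capacity
        self.fetch = fetch  # stand-in for a read from main storage
        self.entries: OrderedDict[str, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = self.fetch(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value

    def backend_reduction(self) -> float:
        """Fraction of requests that never reached main storage."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = ReadThroughCache(capacity=2, fetch=lambda key: key.encode())
for key in ["a", "b", "a", "c", "a", "b", "a", "a", "d", "a"]:
    cache.get(key)
print(cache.hits, cache.misses, cache.backend_reduction())  # 5 5 0.5
```

How much a cache helps depends entirely on the access pattern, so measure the hit rate on real traffic rather than assuming a number.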

## Future Architecture Plans

1. Technical Improvements:

   - Implementing BGP for more robust routing
   - Adding redundant internet connections
   - Expanding the Kubernetes cluster

2. Infrastructure Expansion:

   - Additional compute nodes for the lab environment
   - Enhanced monitoring and logging
   - Automated failover systems

## The Bottom Line

Every month I do not pay AWS is another $500 the hardware has earned back. After four months, the math was settled. After a year, the savings funded the next upgrade.

The serverless advocates have a point: infrastructure is work. But so is paying rent forever. So is debugging Lambda cold starts at 2am. So is explaining to your CFO why your cloud bill doubled because someone forgot to set a timeout.

Own your infrastructure. Understand your systems. Stop paying the cloud tax.

The skills you build running a homelab transfer directly to production environments. The money you save transfers directly to your bank account. The control you gain is priceless.

Build the homelab. Kill the cloud bill. Never look back.