Why I am Moving my AI "Agents" to the Edge (and Why You Should Too)

The "Offline" Advantage for First Responders

I look at technology through the lens of resilience. As a Systems Engineer in a mission-critical public safety environment, I have learned that uptime isn't just a goal; it is a requirement. This high-stakes mindset drives my work as a Digital First Responder, where I focus on architecting secure systems on the VertexOps platform. My engineering pragmatism is shaped by my experience in emergency management and community response. I prioritize clear documentation and building systems that remain stable when things get chaotic.

Current Focus Areas

  • Local AI and Digital Sovereignty: Scaling local inference stacks using Ollama and LiteLLM on physical hardware like the Dell T3610 to ensure privacy and accountability.

  • Infrastructure Resilience: Managing enterprise virtualization environments and self-hosted clouds where data ownership is non-negotiable.

  • Cybersecurity and Governance: Hardening systems against modern threats, specifically focusing on OAuth supply chain security and AI red teaming.

When I am not at a terminal, I am likely operating under my Amateur Radio license, KO6JKE. Troubleshooting a radio link and debugging a network stack require the same tinkerer soul and a commitment to keeping lines of communication open.

I have been watching the "Agentic AI" trend blow up on Hashnode lately. It seems like every other post is about how AI is moving from just answering questions to actually doing things—writing code, managing QA, and even handling incident triage. It is exciting stuff, but as someone who works in public safety, my first thought is always the same: What happens when the cloud goes dark?

In my line of work, we talk about "resilience" a lot. Whether it is my day job or volunteering with Sacramento CERT, you learn pretty quickly that if your tools depend on a perfect internet connection and a third-party server's uptime, you do not actually own those tools.

That is why I have been spending my nights in my home lab (shoutout to my trusty Dell T3610) moving away from the "cloud-first" mindset.

The Shift to the Edge

With the release of Gemma 4 and Qwen 3.5, the gap between "cloud AI" and "local AI" has, in my testing, basically evaporated for most practical tasks. I have been running these models via Ollama, and the performance on consumer-grade hardware is getting insane.
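One way to put a number on "performance on consumer-grade hardware" for your own box: a non-streaming response from Ollama's /api/generate endpoint reports how many tokens were generated (eval_count) and how long generation took in nanoseconds (eval_duration). The sketch below assumes you already have that response parsed into a dict; the helper name tokens_per_second is hypothetical, not part of Ollama.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed computed from the eval_count and eval_duration
    fields of an Ollama /api/generate response.

    eval_duration is reported in nanoseconds, hence the 1e9 factor.
    """
    duration_ns = response.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0  # no timing data in this response
    return response.get("eval_count", 0) * 1e9 / duration_ns
```

Running the same prompt a few times and comparing this figure across models is usually enough to tell whether a given model is usable on your hardware.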

Here is why this matters for those of us building infrastructure:

  1. Privacy is non-negotiable: If you are working with sensitive data—whether it is public safety info or just your own personal projects—sending that to a proprietary cloud model is a risk. Keeping it local means you keep the keys.

  2. True Resilience: If the fiber gets cut, my local LLM keeps running, and as long as the box has backup power, it keeps running when the grid goes sideways too. For an "Agent" to be useful in a real emergency, it has to be reachable.

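  3. Latency: When you are running a local model on your own metal, you are not waiting on network round trips to a remote API or getting throttled by rate limits. Response time depends on your hardware, not on someone else's queue.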
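The resilience point above can be sketched as a local-first endpoint check: probe the local model server before anything else, and only consider a remote endpoint if the local one is down. This is a minimal sketch, assuming Ollama is listening on its default port 11434 on the same machine. The remote URL and the helper names is_reachable and pick_endpoint are hypothetical, for illustration only.

```python
import socket
from urllib.parse import urlparse


def is_reachable(url: str, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    if not parsed.hostname:
        return False  # nothing to connect to
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def pick_endpoint(candidates, probe=is_reachable):
    """Return the first reachable endpoint in priority order, or None."""
    for url in candidates:
        if probe(url):
            return url
    return None


# Local first; the second entry is a placeholder for an optional remote fallback.
ENDPOINTS = ["http://localhost:11434", "https://api.example.com"]
```

Calling pick_endpoint(ENDPOINTS) with the network unplugged should still return the local URL as long as the local server is running, which is the whole point.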

What is in my Stack?

I am currently leaning heavily on a self-hosted setup that looks something like this:

  • Hypervisor: VMware ESXi 8 (standard stuff, but rock solid).

  • Model Runner: Ollama, pulling the latest Qwen and Gemma weights.

  • Orchestration: Exploring how to use these local models for basic "agentic" tasks like automated log analysis and system hardening.
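As a starting point for the log-analysis idea, here is a minimal sketch that sends the tail of a log to a local model through Ollama's /api/generate endpoint using only the Python standard library. It assumes Ollama is running on its default port, and that you pass in a model tag you have already pulled. The prompt wording and the helper names build_payload and analyze_logs are hypothetical, not something taken from my actual setup.

```python
import json
import urllib.request

# Ollama's default local address; adjust if your instance listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, log_lines: list, max_lines: int = 200) -> dict:
    """Build the JSON body for a non-streaming /api/generate request."""
    tail = log_lines[-max_lines:]  # most recent lines only, to bound prompt size
    prompt = (
        "You are reviewing system logs. List anything that looks like an "
        "error, an authentication failure, or an unexpected restart, "
        "one finding per line.\n\n" + "\n".join(tail)
    )
    return {"model": model, "prompt": prompt, "stream": False}


def analyze_logs(model: str, log_lines: list, url: str = OLLAMA_URL,
                 timeout: float = 120.0) -> str:
    """Send the log tail to the local model and return its text reply."""
    body = json.dumps(build_payload(model, log_lines)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

A typical call would read a log file into a list of lines and pass it in along with whatever model tag you pulled. Treat the output as a triage hint to review, not as something to act on automatically.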

Why this matters

I have always liked platforms that focus on community and shared knowledge. The tech sector needs more of that "civic" mindset. We should be building systems that empower people, not just systems that make us dependent on a few giant corporations.

If you are just starting with local LLMs, my advice is to stop worrying about the benchmarks and just start building. Set up an old workstation, install Linux, and see what you can make it do without an internet connection. You might be surprised at how much power you actually have sitting under your desk.

I am curious—how many of you are actually running your "Agents" locally vs. relying on Claude or GPT-5? Let’s talk about it in the comments.


The Digital First Responder | Systems Engineering & Mission Critical IT


I'm Kerry Kier -- a systems engineer working at the intersection of infrastructure resilience, emergency communications, and practical AI deployment. I write about the things I'm actually building, breaking, and figuring out: self-hosted AI stacks, security architecture, DevOps pipelines, and what happens when mission-critical systems meet the real world. This isn't a thought leadership blog. It's field notes.