Flaviu Vlaicuwhois

Cybersecurity | DevOps | HomeLab | HomeAutomation

Hermes Agent Operator's Manual

The Operator’s Manual for Hermes Agent Building an AI assistant that can act, remember, and improve Operator’s Manual · Edition 3.2 · Verified against official Nous Research documentation About This Manual This manual explains how to deploy and operate Hermes Agent as a persistent “operator” — an AI system that runs continuously, uses tools, remembers context across sessions, and improves over time — rather than as a single-session chatbot. It covers architecture, installation, the core mental model, day-to-day workflows, the operator loop, common failure modes, advanced configuration (including offline skill optimization with GEPA), and a distilled set of operational lessons. ...

May 24, 2026 ·  47 min

DGX Spark + LlamaCPP Playbook

Complete Setup & Operations Guide Everything needed to build, run, update, and operate local LLMs on an NVIDIA DGX Spark (GB10 / sm_121) with llama.cpp and the llm helper command. 1. How the pieces fit The Spark (GB10). Blackwell GPU at compute capability 12.1 (sm_121), 128 GB unified LPDDR5x shared between CPU and GPU, ~273 GB/s memory bandwidth. Bandwidth is the bottleneck for token generation, so Mixture-of-Experts (MoE) models with few active parameters run far faster than dense models of the same total size. Prefer MoE. ...

June 17, 2026 ·  26 min · 
TL;DR
  • Build llama.cpp for the GB10 (sm_121) with LLAMA_OPENSSL=ON and the 121a native-FP4 target.
  • Serve any GGUF model over an OpenAI-compatible API with one command: llm run <model> [port].
  • All the Spark tuning is baked in — --no-mmap, flash-attention, q8_0 KV cache, batch 2048, 20 threads.
  • 121a adds native FP4 (MXFP4/NVFP4) speedups; it’s neutral on standard quants like Q8_0 and Q4_K_M.
  • Prefer MoE models: the Spark is memory-bandwidth-bound, so low active-parameter models run fastest.
  • Manage everything with the llm helper: run, stop, ps, ls, wait, test, speed, log, update.
  • Wire Hermes or Open WebUI to http://:/v1; runnable = GGUF + supported arch + ≤ ~200B.
  • Includes the full llm script, a cheatsheet, and a troubleshooting table.

Minisforum A2

I bought a Minisforum MS-A2, lived with it for months, modified most of it, pushed it harder than most people will, and then sold it. This review is the long answer to why, and it isn’t a clean recommendation either way. The MS-A2 is one of the most impressive small machines you can buy. It’s also one I’d never put on my desk or in my living room. I’ll explain how both of those are true. ...

June 12, 2026 ·  23 min

Q-feeds

Q-Feeds delivers curated indicators of compromise (IPs and domains) on a schedule. The OPNsense plugin is purpose-built to consume the IP feeds, and the official documentation assumes you’ll feed the domain side into Unbound. If you’re running AdGuard Home as your primary DNS resolver instead of Unbound — as I am — that integration path doesn’t apply directly, and you have to wire the domain feeds in manually. A two-layer threat intelligence setup is only as good as the DNS path that feeds it. This post walks through wiring Q-Feeds into OPNsense (IP layer) and AdGuard Home (DNS layer), and then — the part that turned out to matter most — actually forcing every device on the network to use that DNS path, instead of just offering it. ...

May 7, 2026 ·  25 min

Claude Code Self Evolving

Most Claude Code setups are static. You write a CLAUDE.md, list your conventions, and hope Claude follows them. When it doesn’t, you correct it. Next session, it forgets. You correct it again. This guide builds something different: a system where every correction you make gets captured and logged, repeated corrections automatically become permanent rules, discovered patterns get verified before they’re trusted, and a periodic audit command decides what stays, what gets promoted, and what gets pruned. ...

April 1, 2026 ·  33 min

Mosh FIDO2 / Yubikey Fix

Problem When using mosh with a FIDO2-backed SSH key (sk-ed25519 / sk-ecdsa, e.g. YubiKey), the touch prompt is never shown. The YubiKey blinks — meaning it received the signing request — but the terminal hangs silently until timeout. This affects any tool that invokes SSH as a subprocess without a proper controlling TTY, including mosh and ansible. Root Cause Mosh calls SSH internally with the -n flag: ssh -n -tt -S none -o ProxyCommand=... <host> -- mosh-server new ... The -n flag redirects SSH’s stdin from /dev/null. libfido2 needs a real /dev/tty to print the touch prompt. With -n in effect, the signing request reaches the YubiKey hardware (hence the blinking) but the prompt is swallowed and there is no way to respond. ...

March 6, 2026 ·  7 min

Mosh - The SSH Replacement You Didn't Know You Needed

If you’ve ever had an SSH session freeze mid-command because you switched from Wi-Fi to mobile, or lost your work because a hotel network dropped for three seconds, Mosh is the tool that fixes all of that. What is Mosh? Mosh (Mobile Shell) is a remote terminal application that replaces SSH for interactive sessions. It uses SSH only for the initial authentication handshake, then hands off to its own UDP-based protocol (SSP — State Synchronization Protocol) for the actual terminal session. ...

March 6, 2026 ·  5 min · 
TL;DR
  • Mosh replaces SSH for interactive sessions, using UDP so it survives roaming and network drops.
  • Open UDP ports 60000-61000 on the server; auth still piggybacks on SSH.
  • On macOS, fix the PATH in ~/.zshenv so non-interactive SSH can find mosh-server.
  • Pair it with tmux for a session that survives almost anything short of a server reboot.

Seamless Python Environment Management on macOS

uv + direnv A manual, lightweight approach to Python virtual environment management that auto-activates when you cd into a project and deactivates when you leave — without ever running source .venv/bin/activate again. Why This Approach? Traditional Python workflows require manually activating and deactivating virtual environments. Forget to activate? You install packages globally. Forget to deactivate? You pollute one project with another’s dependencies. This setup eliminates that entire class of mistakes. ...

February 18, 2026 ·  4 min