Hermes Agent Operator's Manual
The Operator’s Manual for Hermes Agent: Building an AI assistant that can act, remember, and improve. Operator’s Manual · Edition 3.2 · Verified against official Nous Research documentation. About This Manual: This manual explains how to deploy and operate Hermes Agent as a persistent “operator” (an AI system that runs continuously, uses tools, remembers context across sessions, and improves over time) rather than as a single-session chatbot. It covers architecture, installation, the core mental model, day-to-day workflows, the operator loop, common failure modes, advanced configuration (including offline skill optimization with GEPA), and a distilled set of operational lessons. ...
Alert Render Test
Regular blockquote: This is just a quote, no alert syntax. GitHub-style alerts: 📝 Note: A note alert. 💡 Tip: A tip alert. ❗ Important: An important alert. ⚠️ Warning: A warning alert. 🔥 Caution: A caution alert.
DGX Spark + vLLM Playbook
title: “Running vLLM on NVIDIA DGX Spark: The Complete Playbook”
description: “Serve LLMs with vLLM on the NVIDIA DGX Spark (GB10): 13 copy-paste NVFP4 recipes for Qwen, Gemma, Nemotron and gpt-oss, the sm121 Marlin fix, MTP/DFlash/EAGLE-3 speculative decoding, measured GB10 benchmarks, and the container→vLLM-version table that actually matters.”
summary: “Per-model NVFP4 recipes for DGX Spark (GB10): Qwen, Gemma, Nemotron and gpt-oss, plus MTP/DFlash/EAGLE-3 speculative decoding and real measured benchmarks.”
tldr: ...
The Modern Ubuntu Bash Terminal Setup
This is a long-form, opinionated guide to setting up a terminal that’s both pretty (syntax-highlighted, themed, autosuggesting) and productive (fuzzy everything, smart history, per-project Python envs, modern replacements for the classic Unix tools). It’s everything I wish I’d known before assembling the stack — including the half-dozen subtle ordering and key-binding issues that ate a couple of evenings of my life. Target: Ubuntu 22.04 or newer, with bash as your shell. No zsh, no fish — bash all the way. The reason: it’s the default, it’s everywhere, and with ble.sh it gets ~95% of zsh’s quality-of-life features. ...
LLM Quantization
Quantization is the single most important technique for running large language models outside a datacenter. It is what turns a model that needs eight enterprise GPUs into one that runs on a gaming card, a laptop, or a Mac mini. But the moment you go to download a model, you are confronted with an intimidating wall of cryptic names — Q4_K_M, IQ3_XXS, UD-Q5_K_XL, GPTQ-Int4, AWQ, NF4, EXL3, NVFP4 — with little explanation of what they mean or which one you should pick. ...
DGX Spark + LlamaCPP Playbook
Complete Setup & Operations Guide. Everything needed to build, run, update, and operate local LLMs on an NVIDIA DGX Spark (GB10 / sm_121) with llama.cpp and the llm helper command. 1. How the pieces fit. The Spark (GB10): Blackwell GPU at compute capability 12.1 (sm_121), 128 GB unified LPDDR5x shared between CPU and GPU, ~273 GB/s memory bandwidth. Bandwidth is the bottleneck for token generation, so Mixture-of-Experts (MoE) models with few active parameters run far faster than dense models of the same total size. Prefer MoE. ...
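The "prefer MoE" advice follows from simple arithmetic, sketched below in Python. The sketch assumes that generating one token requires reading every active weight from memory once, so an upper bound on decode speed is memory bandwidth divided by the bytes of active weights. The ~273 GB/s figure comes from the excerpt above. The two model shapes (a 70B dense model and an MoE with 3B active parameters, both at roughly 4 bits per weight) are hypothetical examples, not measurements, and the function name is made up for illustration. Real throughput will be lower because of KV-cache reads, activations, and kernel overhead.

```python
def decode_tokens_per_second_upper_bound(
    bandwidth_gb_s: float,
    active_params_billions: float,
    bytes_per_param: float,
) -> float:
    """Rough ceiling on decode speed for a bandwidth-bound model.

    Assumption: each generated token reads every *active* weight once,
    and nothing else (no KV cache, no overhead) competes for bandwidth.
    """
    # Billions of params times bytes per param gives GB read per token.
    gb_read_per_token = active_params_billions * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Memory bandwidth quoted for the Spark (GB10) in the excerpt above.
SPARK_BANDWIDTH_GB_S = 273.0

# Hypothetical models, both quantized to about 4 bits (0.5 bytes) per weight.
dense_70b = decode_tokens_per_second_upper_bound(SPARK_BANDWIDTH_GB_S, 70, 0.5)
moe_3b_active = decode_tokens_per_second_upper_bound(SPARK_BANDWIDTH_GB_S, 3, 0.5)

print(f"dense 70B ceiling:     {dense_70b:.1f} tok/s")
print(f"MoE 3B-active ceiling: {moe_3b_active:.1f} tok/s")
```

Under these assumptions the dense model tops out below 10 tokens per second, while the MoE ceiling is more than an order of magnitude higher, even though an MoE's total size can be comparable. Treat both numbers as ceilings for comparison, not as predictions.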
Minisforum A2
I bought a Minisforum MS-A2, lived with it for months, modified most of it, pushed it harder than most people will, and then sold it. This review is the long answer to why, and it isn’t a clean recommendation either way. The MS-A2 is one of the most impressive small machines you can buy. It’s also one I’d never put on my desk or in my living room. I’ll explain how both of those are true. ...
Q-feeds
Q-Feeds delivers curated indicators of compromise (IPs and domains) on a schedule. The OPNsense plugin is purpose-built to consume the IP feeds, and the official documentation assumes you’ll feed the domain side into Unbound. If you’re running AdGuard Home as your primary DNS resolver instead of Unbound — as I am — that integration path doesn’t apply directly, and you have to wire the domain feeds in manually. A two-layer threat intelligence setup is only as good as the DNS path that feeds it. This post walks through wiring Q-Feeds into OPNsense (IP layer) and AdGuard Home (DNS layer), and then — the part that turned out to matter most — actually forcing every device on the network to use that DNS path, instead of just offering it. ...