I learned. I did. I turned it off.

With all the OpenClaw and Hermes chatter going around, I thought I would give it a try. I don’t have a lot of money for tokens, and considering I wanted more of a clerk, note taker and administrator, I figured I could use a local LLM.

Hermes + Obsidian + Qwen + Discord

Hermes

I decided to use Hermes over OpenClaw. Being neurodivergent, what I sometimes think I want and do is, to put it politely, at times at odds with what I actually want and do. From what I read, Hermes has the ability to learn and adapt over time baked in, and I was hoping it might eventually learn to spot those moments.

After some additional thought, I decided to put it in a nice walled docker garden to make sure the files it accesses stay the files it accesses and it doesn’t see the rest of my computer. Reason being, after doing a bit of Agentic Development over the course of the year, there are many times when I see it sneaking out of its parameters with a little cat command here, or a more dangerous sed command there.

Watching Tonbi’s AI Garage’s Hermes Masterclass was very helpful, and honestly made this the simplest part of the process.

Signal, Discord

Starting with a simple model ( I’ll talk about that later ), I needed a way to interact with my Agent. I started with Signal. I ran it via the “Notes to Self” mode, which for a bit was kinda cool. Me and this local AI, just bantering away.

But then, I wanted it to remind me of things, like to eat lunch, which I often forget. Or check in and ask me questions about how my day was, so that I remembered to actually write something in my journal. That’s where the Notes to Self method had an obvious gap: no notifications.

So off to Discord.

The Discord set-up is simple enough.

With Hermes you can lock down channels, or have open response channels. A feature I liked was that you can prepend context to specific channels. For example, in my #Notes channel, I added my newly minted “/note ” skill; in #Journal, you guessed it, “/journal ”.

This worked great. More channels to banter, and now with notifications.

Formatting and Mulling

I found Hermes to be a bit loud in Discord.

Ping, “⏳ Still working…”

Ping, “⏰ cronjob: ‘run’.”

PING, “📝 skill_manage: ‘journal’”

So, of course, I asked the AI to fix that. Using hooks worked really well. Here’s that code.

Now that I had turned the volume down on how much it was saying, what it was saying was a different matter altogether.

Gemma4, Gemma4:16k, Hermes3, Hermes3:16k, Qwen 3.5, Qwen 3.5:16k

The model is everything! And sadly…

Size Matters

Given I have no money, cloud was out. There’s a trend lately of self-run agents being financially discouraged. Sure, you can point them at a cloud model, but that API usage isn’t included in default plans. I’ve got $20 a month. Hard line. And I’m not about to eat that annual budget in 4 months.

Larger models were out too. Anything tagged greater than 8b was a no-go. I don’t have a supercomputer. My computer would make some people even question if I was a developer at all. In which case I say, “Hey, my laptop is Ubuntu.” But this wasn’t going to run on my laptop. It was going to run on my Mac mini with 16GB of RAM, which isn’t exclusively an AI machine. I actually want to use it at the same time. Every day.

So, I played a game of eeny-meeny-miny-moe. I landed on Gemma4 first.

But here’s the twist I didn’t expect: Hermes is very, and I mean very, picky about context length. If your config file has anything less than context_length: 64000, things go bad.

With smaller models, that takes a bit of hackery: get them to accept the larger context, but quietly say, “no thank you, I’m full”.

So, to get around it, I had to figure out how to “create a permanent custom model”. Turns out it’s pretty simple.

Download your model:
ollama run someonesModel:8b
Extract a Modelfile:
ollama show someonesModel:8b --modelfile > myModel-32k.modelfile
Edit that modelfile and update PARAMETER num_ctx to something small (8192 = 8K, 32768 = 32K, or 65536 = 64K):

FROM llama3.2
PARAMETER num_ctx 32768

Note: I tried a few values but found 32K a good balance. Too small and it cuts off thoughts or skills. Too big, and you’re eating that RAM and your computer crashes.
Then create a new model:
ollama create myModel:32k -f ./myModel-32k.modelfile
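
As a rough illustration of the RAM side of that num_ctx trade-off: the model’s key/value cache grows linearly with the context length, at roughly 2 × layers × KV heads × head dimension × bytes per value for every token. The sketch below just does that arithmetic. The architecture numbers are assumptions for a hypothetical 8B-class model, not values read from any of the models named here, and real usage will differ with quantized caches and runtime overhead.

```shell
#!/usr/bin/env bash
# Rough KV-cache size for a given context length.
# ASSUMED architecture (hypothetical 8B-class model, not taken from a real
# modelfile): 32 layers, 8 key/value heads, head dimension 128, and
# 2 bytes per value (16-bit cache).
kv_cache_mib() {
  local ctx=$1
  local layers=32 kv_heads=8 head_dim=128 bytes_per_value=2
  # Factor 2 = one key tensor plus one value tensor per layer.
  local bytes=$(( 2 * layers * kv_heads * head_dim * bytes_per_value * ctx ))
  echo $(( bytes / 1024 / 1024 ))
}

# Compare the three context sizes mentioned above.
for ctx in 8192 32768 65536; do
  echo "num_ctx=$ctx -> about $(kv_cache_mib "$ctx") MiB of cache"
done
```

Under those assumptions the cache alone comes to about 1 GiB at 8K, 4 GiB at 32K and 8 GiB at 64K, on top of the model weights, which is one way to see why 64K is a squeeze on a 16GB machine that is also doing other work.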

Simple… right???

What’s my Flavour?

Once I figured all that out… it started to become obvious to me that my neurodivergent interactions were troublesome to some models.

I have a… style.

See that ellipsis? It’s a bad habit, I know.

My style is distinct. I divert. I hop around. I smatter on at times.

Cloud LLMs are smart enough to follow. A piddly little truncated local LLM? Turns out, some, for me, are very easy to confuse and distract.

I tried Gemma4 and Hermes3, but they were not for me. I’m sure if I were a more precise thinker, or neurotypical, wrangling these models may have been easier. But I’m not, and I was getting frustrated and tired of trying at this point.

I decided one more model, or stop.

I picked Qwen 3.5, and it seemed to me to be the best at keeping up.

The Obsidian Brain

My Obsidian daily journal has been a helpful tool for me. I enter quick thoughts, observations and findings of the day, or, when longer-form content is necessary, links to various other subtopics and trains of thought. I then use hashtags to help weave my web.

One of my big goals was to ultimately populate this daily journal:

“Did you eat today?” Make notes in the journal, which may relate to any mood notes.
“What do you have planned today?” “Have you had a chance to work on XYZ yet?” Make notes in my daily journal on how I responded.

I would have thought this was easy. It’s markdown. Simple text file really.

It oddly was not.

It’s likely the fault of how I like to lay out my template. It’s likely the fault of a smaller model. But man, this was painful. SO much waste, duplication, just not “getting it”.

“Please make the following note and place it under the #Journal heading in my daily journal” I ask.

“OK. Done,” with the whole file repeated and reworded in triplicate every time it touched it.

“Update the note skill,” I ask, “to ensure before inserting any new content, it reads the whole file to ensure content isn’t already in there. If it is, leave it; if not, then add it, or update accordingly.”

“OK. Done.” The file is empty, and a new one is dated in the future. ( Oh, when using Docker… make sure you set the container’s timezone to your timezone. )

“Grrrr!” I smash the desk.
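
For reference, the “read the whole file, leave it if it’s there, add it if not” behaviour asked for above is small enough to do deterministically, without a model in the loop. What follows is only a minimal sketch: add_note is a hypothetical helper, not part of Hermes or Obsidian, and it assumes the heading sits on a line of its own and that a note counts as “already there” only when an identical line exists.

```shell
#!/usr/bin/env bash
# add_note FILE HEADING NOTE
# Hypothetical helper (not part of Hermes or Obsidian): adds NOTE directly
# under the first line that exactly equals HEADING, unless NOTE is already
# present somewhere in FILE as an exact line.
add_note() {
  local file=$1 heading=$2 note=$3 tmp
  touch "$file"   # make sure the file exists before reading it

  # Read the whole file first: if the note is already there, change nothing.
  if grep -qxF -- "$note" "$file"; then
    return 0
  fi

  tmp=$(mktemp)
  if grep -qxF -- "$heading" "$file"; then
    # Copy every line; right after the first matching heading, emit the note.
    HEADING="$heading" NOTE="$note" awk '
      { print }
      $0 == ENVIRON["HEADING"] && !done { print ENVIRON["NOTE"]; done = 1 }
    ' "$file" > "$tmp"
  else
    # No such heading yet: append the heading and the note at the end.
    { cat "$file"; printf '%s\n%s\n' "$heading" "$note"; } > "$tmp"
  fi

  # Overwrite in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

A call such as add_note "2026-06-05.md" "#Journal" "- Ate lunch at 12:30" (the file name is made up) can be repeated any number of times and the line only lands once. A skill could then be asked to run one command like this rather than rewrite the file itself, though whether a small model sticks to that is a separate question.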

Do you actually have skill?

On one hand I love skills. They are cool concepts to outline preferences and desired behaviours. But how, in all that is holy, do people “debug” when clearly it’s not getting it????

I wrote, and re-wrote. I was explicit to a fault, only then to lighten it up. I tried this sentence and that. Asked it to suggest its own sentence. Said screw it, I’ll ask Gemini for suggestions. Used that, only to lose my mind again and again, obsessing over how these things are and aren’t working.

Why are you coaching and AI-splaining when I want you to just shut up and update the file? Did I not ask you to stop the puffery and platitudes? Why are you telling me how wonderful my idea is to track my mood?

I quit.

There, I said it. I quit. I gave it a couple of weeks. Played and tried, and no more.

I’m sure with $200 per month to burn through tokens it would have been a great experience.

What now?

Well, I’m not really giving up. I am just giving up on a local LLM and need to be a tad more open. But how open?

Agree or disagree, but Google’s Gemini has been good to me. So, to not burn through tokens, it’s a Google Doc file, and Google Tasks, and maybe Google Keep ( still tinkering on that ).

I’m going to try the Google Workspace references and see what comes out, e.g. “@Google Docs, read my life log to summarize any trends or common entries from the past week.” or “@Google Docs, read my life log and outline any potential tasks. Format a list. @Google Tasks take that list, and add any of them that aren’t already in my inbox”.

I’ll let you know how it goes next time.

#Journal

Weeknotes 2026-06-05