I built an end-to-end testing tool for Obsidian in about two hours. It's tailored to my exact workflow, it knows the quirks of my plugin, and it gets better every time I use it.
I'm not going to share it though. That's the point of this post. You should build your own. You can even share this post with Claude as a starting point.
I've been building an Obsidian plugin called [Relay](https://relay.md) for collaborative knowledge management. Testing it used to mean clicking around manually, or wrestling with heavyweight testing frameworks that don't understand Obsidian's internals.
Now I have a tool that can take screenshots, click buttons, type text, and verify UI state. When something breaks, I tell it to fix itself. I haven't read the code because this is all in one-shot territory for Claude Opus 4.5.
## The magic word
The one thing you need to know to build Obsidian E2E testing is the name of the protocol that Playwright and other browser testing tools use. Obsidian is built on Electron, so it runs on Chromium. That protocol is called the *Chrome DevTools Protocol (CDP)*.
You'll need to start Obsidian with CDP enabled.
Here's how you do that on Linux:
```
./Obsidian.AppImage --remote-debugging-port=9222
```
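Once Obsidian is running with that flag, the debug port also answers plain HTTP, and `/json` lists the targets you can attach to. Here is a minimal Python sketch of that lookup, assuming port 9222 from the command above. The function names are hypothetical and this is not the code from my tool:

```python
import json
from urllib.request import urlopen


def list_targets(port=9222):
    """Fetch the list of debuggable targets from the local debug port."""
    with urlopen(f"http://127.0.0.1:{port}/json") as resp:
        return json.load(resp)


def pick_page_target(targets):
    """Return the WebSocket URL of the first 'page' target, or None."""
    for target in targets:
        if target.get("type") == "page":
            return target.get("webSocketDebuggerUrl")
    return None
```

Calling `pick_page_target(list_targets())` with Obsidian running should give you a `ws://` URL. Everything else is JSON messages sent over that socket.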
Then, I opened Claude Code and said:
> *"create a skill that uses the chrome debug protocol to drive end-to-end tests in Obsidian"*
Claude Opus 4.5 can one-shot a Python script that connects to Obsidian's debug port, takes screenshots, clicks at coordinates, and types text.
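To give a sense of how little is involved, here is a rough sketch of the CDP messages behind those three actions. It only builds the JSON. Sending it over the WebSocket is left to whatever client library you prefer. The helper names are made up for illustration, while the method names (`Page.captureScreenshot`, `Input.dispatchMouseEvent`, `Input.insertText`) are standard CDP:

```python
import base64
import itertools
import json

_ids = itertools.count(1)  # CDP replies are matched to requests by id


def command(method, **params):
    """Build one CDP command as a JSON string with a unique id."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


def screenshot():
    """Ask the page for a PNG screenshot (returned base64-encoded)."""
    return command("Page.captureScreenshot", format="png")


def click(x, y):
    """A click is a press followed by a release at the same point."""
    common = {"x": x, "y": y, "button": "left", "clickCount": 1}
    return [
        command("Input.dispatchMouseEvent", type="mousePressed", **common),
        command("Input.dispatchMouseEvent", type="mouseReleased", **common),
    ]


def type_text(text):
    """Insert text into whatever element currently has focus."""
    return command("Input.insertText", text=text)


def save_screenshot(response, path):
    """Decode the base64 'data' field of a Page.captureScreenshot reply."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(response["result"]["data"]))
```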
The initial version was too "Chrome-y" -- all the documentation talked about browsers. I wanted something that felt native to Obsidian.
> *"Make it clear this connects to a running Obsidian instance. Focus on the Relay plugin so I don't have to type out the full plugin path every time."*
Now Claude can run `relay status` to get plugin state, `relay folders` for shared folders, and `relay file-status myfile.md` for sync state. The skill and CLI tool are context-efficient.
This is where personal software diverges from general-purpose tools. A generic testing framework doesn't know that I care about sync status. Mine does.
## Iterative improvements
I asked Claude to write content into a note for testing. Simple enough, but I forgot my editor was in vim mode.
The tool failed -- it was inserting text directly instead of sending keystrokes. Vim didn't capture the input.
> *"We need a way to send a key sequence. The type tool inserts text. Write a new tool to send keyboard events."*
Fixed. Now `type` inserts text, and `keys` sends actual keystrokes. `keys "ggdG"` deletes everything in vim. The tool learned the distinction because *I* needed the distinction.
I could have disabled Vim mode. I would have if I was using Playwright... but this is more fun.
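The `type` versus `keys` split maps onto two different CDP methods. This is a hedged sketch with hypothetical function names. It assumes plain single-character keys, so uppercase letters like `G` may also need a Shift modifier depending on how the editor reads key events:

```python
def insert_text(text):
    """One Input.insertText call: the text lands in the focused element
    without any keydown/keyup events, so vim never sees it as commands."""
    return [("Input.insertText", {"text": text})]


def key_sequence(keys):
    """One keyDown/keyUp pair per character via Input.dispatchKeyEvent,
    which is what a key-driven mode like vim listens for."""
    events = []
    for ch in keys:
        events.append(
            ("Input.dispatchKeyEvent", {"type": "keyDown", "key": ch, "text": ch})
        )
        events.append(("Input.dispatchKeyEvent", {"type": "keyUp", "key": ch}))
    return events
```

So `insert_text("ggdG")` is a single message that pastes four characters, while `key_sequence("ggdG")` is eight messages that look like someone typing.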
---
Claude tried to click on a menu item. Missed. Tried again. Missed again.
> *"Add a function that shows what element is under a given coordinate. Before we click, verify we're clicking the right thing."*
Now the tool has `element-at 250 400` so it can verify what is at that position before clicking. I baked in the lesson: always verify coordinates.
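A check like `element-at` can be done by asking the page itself through `Runtime.evaluate` and the browser's `document.elementFromPoint`. Here is a minimal sketch of that idea. The helper names and the shape of the returned object are assumptions, not what my tool actually prints:

```python
import json


def element_at_expression(x, y):
    """Build a JavaScript expression describing the element at (x, y)."""
    return (
        "(() => {"
        f" const el = document.elementFromPoint({int(x)}, {int(y)});"
        " if (!el) return null;"
        " return {tag: el.tagName, id: el.id,"
        " classes: el.getAttribute('class'),"
        " text: (el.textContent || '').trim().slice(0, 80)};"
        " })()"
    )


def element_at_command(x, y, msg_id=1):
    """Wrap the expression in a Runtime.evaluate command.

    returnByValue asks CDP to send the object back as plain JSON
    instead of a remote object handle.
    """
    return json.dumps({
        "id": msg_id,
        "method": "Runtime.evaluate",
        "params": {
            "expression": element_at_expression(x, y),
            "returnByValue": True,
        },
    })
```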
---
The scroll command was confusing Claude. Claude kept mixing up positive/negative delta values.
> *"Let's make this less ambiguous. Make the script have scrollUp and scrollDown."*
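The point of splitting the command is to hide the sign convention behind names. This sketch assumes the usual wheel-event convention that a positive `deltaY` scrolls down, which is worth verifying against your own instance. The function names here are hypothetical Python spellings of `scrollUp` and `scrollDown`:

```python
def _wheel(x, y, delta_y):
    """Params for a CDP mouse wheel event at (x, y)."""
    return (
        "Input.dispatchMouseEvent",
        {"type": "mouseWheel", "x": x, "y": y, "deltaX": 0, "deltaY": delta_y},
    )


def scroll_down(x, y, pixels=300):
    """Scroll down by a distance; the sign of `pixels` is ignored."""
    return _wheel(x, y, abs(pixels))


def scroll_up(x, y, pixels=300):
    """Scroll up by a distance; the sign of `pixels` is ignored."""
    return _wheel(x, y, -abs(pixels))
```

Taking `abs()` means the caller passes a distance, never a direction, so there is nothing left to mix up.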
---
Claude connected to the wrong Obsidian instance.
> *"perhaps you connected to the wrong instance..."*
> *"no, the right one is live1"*
(metrics work happens)
> *"great, now did you update the skill yet?"*
## Try it
You can do this in an afternoon. Not my exact tool -- *your* tool.
The thing you build will be rough. It will have bugs. It will fail in weird ways. But it will be *yours*, and every failure is a conversation away from being fixed.