AI-Assisted Recovery (using `SKILL.md`)

Two ways to use AI for recovery help

There are two approaches:

Use a normal chat bot that helps you follow the documentation.
Use an AI CLI (coding) tool that can run commands, including BTCRecover installation and execution.

For a chat bot, you can simply prompt with: "Can you help me recover a wallet using https://github.com/3rdIteration/btcrecover/blob/master/SKILL.md"

Or, if you have downloaded and unzipped BTCRecover, you can select the folder in a tool such as Claude and prompt: "Can you help me recover a wallet using the SKILL.md from this repository?"

BTCRecover ships with a SKILL.md file at the root of the repository. It is a structured prompt that teaches an AI coding agent how to triage a wallet-recovery situation, install BTCRecover, take the user's system offline, collect wallet material, build a btcrecover.py / seedrecover.py command, and finish up — all while avoiding the common ways a user can leak their secrets to an online machine.

This page tells you where to put SKILL.md for the four AI agents the skill has been tested against (and a generic fallback for anything else), and gives the safety rules you should keep in mind no matter which agent you use.

⚠️ Important — read this before you start. AI agents will happily ask for your seed phrase, password, or wallet file. Never paste a real secret into a chat with a hosted / cloud agent. The SKILL.md workflow is primarily intended for a local agent (one running on your own machine), and secondarily for cloud agents under the split-workflow rules described in Step 4 of SKILL.md. If you're using a cloud agent (ChatGPT, Claude on the web, GitHub Copilot chat, etc.) make sure the agent only ever sees half the puzzle — for example password guesses, not the wallet file; or password guesses, not the mnemonic.

Quick start (any agent)

Clone or download this repository so you have a local copy of SKILL.md and the skills/ directory next to it. The main SKILL.md dispatches to sub-skills under skills/ for installation, building a passwordlist / tokenlist, and locating a wallet file on disk — they must be installed together.
Drop SKILL.md (and the skills/ directory) into the location your AI agent looks at (see per-agent instructions below).
Start a new chat / session and ask the agent something like "Use the BTCRecover recovery skill to help me recover my wallet." The agent will then follow the workflow in SKILL.md and walk you through triage, install, going offline, building a command, and running it.
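Because the main skill and its sub-skills must be installed together, it can help to confirm that nothing is missing before starting a session. The following is a minimal sketch, not part of BTCRecover: the function name `missing_skill_files` is hypothetical, and the `skills/<name>/SKILL.md` layout is an assumption based on the sub-skill names mentioned on this page, so adjust it if your checkout differs.

```python
from pathlib import Path

# Sub-skill names as they appear on this page. The on-disk layout
# skills/<name>/SKILL.md is an assumption; adjust if your checkout differs.
SUB_SKILLS = (
    "install-btcrecover",
    "build-password-tokenlist",
    "locate-wallet-file",
)


def missing_skill_files(repo_root):
    """Return the expected skill files that are absent under repo_root."""
    root = Path(repo_root)
    expected = [root / "SKILL.md"]
    expected += [root / "skills" / name / "SKILL.md" for name in SUB_SKILLS]
    # Report paths relative to the checkout so the output is easy to read.
    return [str(p.relative_to(root)) for p in expected if not p.is_file()]
```

Calling `missing_skill_files(".")` from inside the checkout returns an empty list when the main skill and all three sub-skills are present.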

If you have already downloaded the BTCRecover repository, you can simply open it in Claude Desktop, Claude Code, or VS Code + Cline and prompt with: "I want to recover a wallet, can you use the SKILL.md in this repo to help me?".

If your agent isn't listed below, the universal fallback is: open a fresh chat, paste the contents of SKILL.md as the first message (or as a system / custom-instructions message if the product supports it), then describe your situation.

Tested models

Tested local models

These results were generated using the evaluation harness in the utilities/skill_eval/ folder.

Chat Mode Performance

Model	Avg % Ceiling	Typical Range (±1σ)
qwen3.6-27b	70.9%	67.5% – 74.4%
gemma-4-31b	67.5%	61.2% – 73.7%
gemma-4-26b-a4b	57.6%	53.6% – 61.5%
gemma-4-12b	54.2%	49.2% – 59.2%
qwen3.5-9b	52.4%	47.6% – 57.2%
gemma-4-e4b	30.8%	27.9% – 33.7%

Docker Mode (Tool-use) Performance

Model	Avg % Ceiling	Typical Range (±1σ)
gemma-4-31b	65.0%	60.7% – 69.3%
qwen3.6-27b	58.4%	52.3% – 64.6%
gemma-4-12b	51.7%	44.8% – 58.5%
qwen3.5-9b	47.5%	43.1% – 51.9%
gemma-4-26b-a4b	39.9%	34.9% – 44.9%
gemma-4-e4b	33.4%	29.1% – 37.8%

gemma-4-31b (Top performer overall - recommended if you have the VRAM)
qwen/qwen3.6-27b (Excellent for chat/triage; usable on 24GB+ GPU like a 3090, 4090 or 5090)
qwen/qwen3.5-9b / gemma-4-12b (Mid-tier; 9b usable on 8GB GPU like a 3070)
qwen/qwen3.5-4b / gemma-4-e4b (Low-tier; usable on most modern systems, but likely to struggle with complex logic)

Performance varies considerably with model size: the 31b/27b models are markedly more reliable for end-to-end recovery. Models in the 9b-12b range are often usable but can be inconsistent. Models below 9b (like 4b or e4b) often struggle to follow the full multi-step workflow and typically require the "staged approach" described below. Anything below 4b parameters tends to get stuck in a loop.

This is also a use case where MoE models seem to work well in chat mode (where you ask the model for commands to run and tips for using the tool) but struggle with tool use (where the LLM runs the commands for you).
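To make the chat versus tool-use difference concrete, the averages from the two tables above can be subtracted per model. This is only an illustrative sketch using the published averages; the function name `chat_minus_docker` is hypothetical and the calculation ignores the ±1σ ranges.

```python
# Average "% ceiling" scores copied from the two tables above.
CHAT = {
    "qwen3.6-27b": 70.9,
    "gemma-4-31b": 67.5,
    "gemma-4-26b-a4b": 57.6,
    "gemma-4-12b": 54.2,
    "qwen3.5-9b": 52.4,
    "gemma-4-e4b": 30.8,
}
DOCKER = {
    "gemma-4-31b": 65.0,
    "qwen3.6-27b": 58.4,
    "gemma-4-12b": 51.7,
    "qwen3.5-9b": 47.5,
    "gemma-4-26b-a4b": 39.9,
    "gemma-4-e4b": 33.4,
}


def chat_minus_docker():
    """Return (model, gap) pairs, largest chat-over-Docker gap first."""
    gaps = {model: round(CHAT[model] - DOCKER[model], 1) for model in CHAT}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for model, gap in chat_minus_docker():
    print(f"{model:18s} {gap:+.1f}")
```

On these numbers the largest drop is gemma-4-26b-a4b (17.7 points lower in Docker mode than in chat mode), while gemma-4-e4b is the only model that scores slightly higher in Docker mode.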

Local LLM Settings to watch:

Set the context length to at least 20,000 tokens regardless of model; more is generally better for larger models. (You generally set this in LM Studio or Ollama.)
If you are using Cline, enabling "Use Compact Prompt" helps considerably: the default prompt includes about 10k tokens of material that isn't needed here, which fills up the limited context and confuses a local LLM. (This is particularly noticeable on less capable models and systems.)

Cloud Models (Claude, ChatGPT, etc)

Simply put, large cloud models should have no issue with this at all.

Tested with:

Claude Sonnet 4.6
Claude Haiku 4.5

Both work fine. Opus will have no problem at all but you shouldn't need that level of reasoning.

Benchmarking models with `skill_eval_harness.py`

If you want a repeatable way to estimate how well a model follows SKILL.md, use the local evaluation harness in utilities/skill_eval/. See utilities/skill_eval/README.md for full setup and options.

Typical one-off run:

python utilities/skill_eval/skill_eval_harness.py \
  --candidate-model qwen3.5-9b \
  --candidate-base-url http://127.0.0.1:1234/v1 \
  --judge-model qwen/qwen3.6-27b

For repeatability, use --suite-config (start from utilities/skill_eval/example_suite.json) and queue multiple candidate runs using the same judge/scenario set. You can also test the same candidate against multiple skill_roots in one batch.
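If you would rather script a few one-off runs than write a suite config, the command from the example above can be generated per candidate. This is a rough sketch that uses only the three flags shown on this page; the function name `build_harness_commands` is hypothetical, and the harness README remains the reference for the real options.

```python
import sys

# Path taken from this page; run the sketch from the repository root.
HARNESS = "utilities/skill_eval/skill_eval_harness.py"


def build_harness_commands(candidates, base_url, judge_model):
    """Build one harness argv list per candidate model, sharing one judge."""
    return [
        [
            sys.executable, HARNESS,
            "--candidate-model", candidate,
            "--candidate-base-url", base_url,
            "--judge-model", judge_model,
        ]
        for candidate in candidates
    ]


commands = build_harness_commands(
    ["qwen3.5-9b", "gemma-4-12b"],
    "http://127.0.0.1:1234/v1",
    "qwen/qwen3.6-27b",
)
# To actually run them one after another:
#   import subprocess
#   for cmd in commands:
#       subprocess.run(cmd, check=True)
```

Keeping the judge model and base URL fixed across the loop is what makes the resulting scores comparable between candidates.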

After a few passes, compare these fields from each results JSON:

meta.overall_score
meta.overall_score_percent.of_theoretical_max
meta.overall_score_percent.of_executed_turn_ceiling
per-scenario total_score and violation_tags

Those numbers make it easier to see whether a model is actually improving on script selection, safety sequencing, install handling, and split-workflow rules instead of relying on single-chat impressions.
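The fields listed above can be pulled out of several results files with a few lines of Python. This is a sketch under assumptions: the `meta` field names come from this page, but the top-level `scenarios` key holding the per-scenario entries is a guess, and both function names are hypothetical. Check one of your own results files and adjust the key names as needed.

```python
import json
from pathlib import Path


def summarize_result(result):
    """Extract the comparison fields from one parsed results JSON."""
    meta = result.get("meta", {})
    percent = meta.get("overall_score_percent", {})
    # ASSUMPTION: per-scenario entries live in a top-level "scenarios" list.
    scenarios = result.get("scenarios", [])
    tags = {tag for s in scenarios for tag in s.get("violation_tags", [])}
    return {
        "overall_score": meta.get("overall_score"),
        "of_theoretical_max": percent.get("of_theoretical_max"),
        "of_executed_turn_ceiling": percent.get("of_executed_turn_ceiling"),
        "scenario_scores": [s.get("total_score") for s in scenarios],
        "violation_tags": sorted(tags),
    }


def summarize_files(paths):
    """Map each results file path to its summary."""
    return {
        str(p): summarize_result(json.loads(Path(p).read_text(encoding="utf-8")))
        for p in paths
    }
```

Missing fields come back as `None` or empty lists rather than raising, so a partially completed run can still be compared against the others.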

Use with less capable agents (e.g. 9b and 4b models)

Some smaller models struggle when asked to do the full multi-step recovery workflow in one go, though they may work fine, so it's worth trying the full workflow first.

If they get stuck, run recovery as a sequence of short, explicit requests and only ask for one skill at a time.

Recommended pattern:

Keep each prompt narrow (one outcome only).
Wait for output, confirm it looks correct, then send the next prompt.
Explicitly name the skill you want used (install-btcrecover, build-password-tokenlist, locate-wallet-file).
Do not mix online brainstorming with offline secret-entry steps in the same prompt.
For first-run command building, ask the agent not to over-specify tuning flags (--threads, seed --typos / --big-typos) unless there is a clear case requiring them.

Example prompt sequence:

Main skill kickoff (triage only): "Use SKILL.md, run Step 1 triage only, and stop after you summarize what recovery path I should use."
Install skill only: "Use the install-btcrecover skill only. Detect my OS, check if BTCRecover is already runnable, and then give me only the exact next install commands."
Wallet file location skill only (if needed): "Use the locate-wallet-file skill only. Help me scan these folders and return candidate wallet paths with matched fingerprint type, without printing file contents."
Password/tokenlist skill only: "Use the build-password-tokenlist skill only. Help me create a tokenlist from my remembered fragments and propose conservative typo flags for a first run."
Command build only (main skill step): "Return to SKILL.md and do only Step 6: build the exact btcrecover.py/seedrecover.py command with placeholders, keep defaults, and do not add --threads or seed --typos / --big-typos unless I explicitly ask for expansion, then stop."
Execution only (offline machine): "Now do only the run/monitor step with the command we already built; do not redesign the tokenlist unless the run fails quickly."

This staged approach usually improves reliability with lower-capability models and makes it easier for you to verify each step before continuing.

Claude Code (Anthropic's terminal coding agent)

Claude Code automatically discovers project-level instructions from a CLAUDE.md file in the working directory, and discovers reusable "skills" from a .claude/skills/ folder.

Recommended setup:

Project-scoped (preferred for one-off use): From inside your local BTCRecover checkout, just start Claude Code with claude — it will pick up SKILL.md from the project root because the file is referenced from AGENTS.md / README.md. You can also explicitly tell Claude "Follow SKILL.md in this repo".
User-scoped (so the skill is available in any directory): copy SKILL.md to ~/.claude/skills/btcrecover-recovery/SKILL.md (create the directory if it doesn't exist), and copy each sub-skill under the repository's skills/ directory to its own folder under ~/.claude/skills/ (e.g. ~/.claude/skills/install-btcrecover/SKILL.md, ~/.claude/skills/build-password-tokenlist/SKILL.md, ~/.claude/skills/locate-wallet-file/SKILL.md). Claude Code will then offer the BTCRecover recovery skill from any project and the sub-skills will be discoverable by name when the main skill delegates to them.
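The user-scoped copy described above can be scripted. The sketch below is one possible way to do it, not an official installer: the function name `install_user_skills` is hypothetical, and it assumes each sub-skill sits in its own folder under `skills/` with a `SKILL.md` inside.

```python
import shutil
from pathlib import Path


def install_user_skills(repo_root, skills_home=None):
    """Copy the main skill and each sub-skill into the user skills folder."""
    repo = Path(repo_root)
    home = Path(skills_home) if skills_home else Path.home() / ".claude" / "skills"

    # Main skill goes to <skills_home>/btcrecover-recovery/SKILL.md.
    main_dest = home / "btcrecover-recovery" / "SKILL.md"
    main_dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(repo / "SKILL.md", main_dest)
    copied = [main_dest]

    # ASSUMPTION: every sub-skill is a folder skills/<name>/ with a SKILL.md.
    for sub in sorted((repo / "skills").iterdir()):
        if (sub / "SKILL.md").is_file():
            shutil.copytree(sub, home / sub.name, dirs_exist_ok=True)
            copied.append(home / sub.name / "SKILL.md")
    return copied
```

Running `install_user_skills(".")` from the checkout writes into `~/.claude/skills/`; passing a second argument lets you try it against a scratch directory first.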

When recovery involves real secrets, run Claude Code on the offline machine (or on a separate machine from the wallet file — see Step 4 / 4a in SKILL.md).

GitHub Copilot (VS Code / JetBrains / Visual Studio)

GitHub Copilot picks up repo-specific instructions from .github/copilot-instructions.md and from AGENTS.md at the repo root.

Recommended setup:

Open your local BTCRecover checkout in VS Code (or your supported IDE) with GitHub Copilot Chat enabled.
AGENTS.md already points Copilot at SKILL.md — open Copilot Chat and ask it "Help me run a BTCRecover recovery using SKILL.md."
If you want the skill available in every repo you open, copy SKILL.md into your user-level Copilot custom instructions (Settings → Copilot → "Custom instructions") or into .github/copilot-instructions.md of the project you typically work in.

Treat Copilot Chat as an online agent (it talks to GitHub's servers). Follow the Step 4a split-workflow rules in SKILL.md: brainstorm passwords and build the command with Copilot online, then swap the mnemonic / wallet file in on your offline machine.

ChatGPT (OpenAI, web or desktop app)

ChatGPT doesn't read files from your disk automatically, so you load SKILL.md into the conversation instead.

Two good options:

Custom GPT (recommended if you'll use this more than once). Create a new GPT in Explore GPTs → Create. In the Instructions box, paste the full contents of SKILL.md. Optionally upload the BTCRecover repository (or just the docs/ folder) as a Knowledge file so the GPT can reference the linked documents. Use that GPT whenever you do a recovery.
One-off chat. Start a new conversation, paste the contents of SKILL.md as the first message, then describe your situation.

ChatGPT is a cloud agent. Do not paste your real seed phrase, real password, or wallet-file contents into the chat. Use it to draft your passwordlist / tokenlist and the suggested command (with placeholders), then run the command on your offline machine. For wallets supported by the extract scripts, you can run the extract script on the wallet-holding machine and paste the safe "data extract" back into the chat — that extract is designed to be safe to share.
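One way to keep the split workflow honest is to treat the drafted command as a template and substitute the real paths only on the offline machine. The sketch below is illustrative: the function name `fill_placeholders` and the placeholder spelling are hypothetical, and the example command is a stand-in for whatever command the agent actually drafted for you.

```python
# Example command as a cloud agent might draft it, with placeholders only.
# The specific arguments shown are illustrative, not a recommended command.
PLACEHOLDER_COMMAND = [
    "python", "btcrecover.py",
    "--wallet", "<WALLET_FILE>",
    "--tokenlist", "<TOKENLIST_FILE>",
]


def fill_placeholders(argv_template, values):
    """Swap <PLACEHOLDER> arguments for real values; run this offline only."""
    filled = [values.get(arg, arg) for arg in argv_template]
    leftover = [a for a in filled if a.startswith("<") and a.endswith(">")]
    if leftover:
        # Refuse to produce a command that still contains placeholders.
        raise ValueError(f"unfilled placeholders: {leftover}")
    return filled
```

The real file paths then exist only in the `values` dictionary on the offline machine and never appear in the online chat.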

Cline (VS Code AI agent)

Cline reads project-level instructions from a .clinerules file at the root of the workspace and a global rules file in your home directory.

Recommended setup:

Per-project: open your local BTCRecover checkout in VS Code with the Cline extension installed. Cline will see AGENTS.md and SKILL.md automatically. Start a task with "Follow SKILL.md to help me recover my wallet."
Global: copy SKILL.md to ~/.clinerules (macOS/Linux) or %USERPROFILE%\.clinerules (Windows) so the skill applies in every workspace. You can also keep both — project rules override global rules for the project you're in.

Cline can run locally against your own model (e.g. via Ollama or LM Studio), which is the safest option for recovery work. If you're pointing Cline at a hosted model, treat it like ChatGPT above and stick to the split-workflow.

Any other agent (generic instructions)

Most AI assistants will accept SKILL.md either as a "system prompt", "custom instructions", "project rules", or simply by pasting it as the first message in a fresh chat. The key requirements are:

The agent sees SKILL.md before you describe your situation.
The agent has access to a terminal where it can run git, python, btcrecover.py, and seedrecover.py (or it gives you commands to run yourself).
You can identify whether the agent is local or cloud-hosted, so you can apply the right safety rules from Step 4 / 4a of SKILL.md.

Contributing fixes back

If the AI agent had to fix a bug in BTCRecover or add a new feature during your recovery (for example to support a wallet variant that wasn't quite handled correctly), please consider sending that improvement back so other users benefit:

Preferred: open a pull request against https://github.com/3rdIteration/btcrecover/.
Or: email a short bug report with the fix attached (e.g. a git diff or the modified files, plus reproduction steps that contain no secrets) to steve@cryptoguide.tips.

Step 8 of SKILL.md asks the agent to prompt you about this once your funds are safe.
