AI in Focus: A new Claude Skill for Rails code audits

https://thoughtbot.com/blog/ai-in-focus-a-new-claude-skill-for-rails-code-audits

AI in Focus is our ongoing livestream where we get hands-on with AI in real product development. In this episode, Chad flew solo and picked up a thread from our previous episode, Claude Code skills for FDA-style documentation. We had touched on the idea of using Claude skills for code quality and auditing, and fellow thoughtbotter Jose Blanco ran with it, shipping an experimental Claude skill: rails-audit-thoughtbot.

In this stream, Chad installed Jose’s experimental Claude skill and put it through its paces auditing our internal operations platform, a 10+ year old Rails codebase with plenty of legacy surface area. You can watch the full replay on YouTube or read on for what we learned about auditing Rails apps using Claude skills.

What are Claude Skills?

Agent Skills are an open, file-based format maintained by Anthropic and open to contributions from the community. At its most basic, a skill is a SKILL.md file with YAML front matter and a markdown body of instructions. Anthropic has introduced skills across Claude, and Claude Code looks for them in your repo or on your local machine, usually under ~/.claude/skills/.

It’s a bit like human-authored tool calling: the short front matter stays lightweight, and the full body only loads when the skill is relevant or you invoke it explicitly.
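To make the format concrete, here is a hypothetical minimal skill, purely for illustration (the name, description, and body are made up, not taken from the rails-audit-thoughtbot repo), saved as ~/.claude/skills/hello-rails/SKILL.md:

```markdown
---
name: hello-rails
description: Illustrative example. The description is the lightweight part that
  tells Claude when this skill applies.
---

The markdown body holds the instructions Claude follows once the skill loads,
for example a checklist or a step-by-step procedure.
```

Only the front matter is considered up front; the body below the `---` delimiters stays out of context until the skill is actually used.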

What goes into rails-audit-thoughtbot?

Jose’s new experimental skill encodes thoughtbot-flavored Rails best practices. Let’s allow the code to speak for itself. Here’s the YAML front matter at the top of rails-audit-thoughtbot’s SKILL.md:

---
name: rails-audit-thoughtbot
description: Perform comprehensive code audits of Ruby on Rails applications based on thoughtbot best practices. Use this skill when the user requests a code audit, code review, quality assessment, or analysis of a Rails application. The skill analyzes the entire codebase focusing on testing practices (RSpec), security vulnerabilities, code design (skinny controllers, domain models, PORO with ActiveModel), Rails conventions, database optimization, and Ruby best practices. Outputs a detailed markdown audit report grouped by category (Testing, Security, Models, Controllers, Code Design, Views) with severity levels (Critical, High, Medium, Low) within each category.
---

As you can see, the long description is what Claude Code uses to decide when the skill should be applied; the rest of the file stays unloaded until the skill runs.

In developing this skill, Jose drew on our free books Ruby Science (code smells and fixes) and Testing Rails, as well as a security checklist (12 categories with detection patterns) and a report template, so the audit output is structured and comparable between runs. Following feedback from this livestream, he added references/rails_antipatterns.md, a reference drawn from Chad’s book Rails Antipatterns that extends the audit with external services, migrations, performance, and failure-handling patterns.

The skill is open-source and available at thoughtbot/rails-audit-thoughtbot. Feel free to clone or copy it into your skills directory and get ready to experiment.

Installing and running the skill

Chad started from a clean checkout of Hub: he created ~/.claude/skills, then cloned the rails-audit-thoughtbot skill repo into that directory. With a Claude Code session running in the terminal, and after some debugging, Chad was able to call the skill using its name as a slash command:

/rails-audit-thoughtbot

Tip from the stream: you do not have to audit the whole app on day one. The skill supports a targeted run inside a session, for example:

/rails-audit-thoughtbot audit controllers

or

Do a code review on my models

What the Hub run looked like

On a full-application pass, the skill took ~5 minutes to generate an audit report that summarized findings grouped by category (testing, security, models, controllers, code design, views) with assigned severities. Chad highlighted a few headline items:

  • Large model: a textbook god class in Person, which runs to hundreds of lines with many public methods, aligning with what the team already knew from past code reviews.
  • PORO: many service-style objects flagged for refactoring toward ActiveModel-backed domain objects, per the skill’s intentionally opinionated PORO guidance.
  • Missing model specs: the skill treated “no _spec file per model” as a gap, but Chad dug in and noted this was a bit of a false positive, as view-model-style objects had integration coverage instead.
  • Positives: strong testing culture; good use of calculators, presenters, and query objects; no :focus tags in RSpec (Chad recommends against committing :focus; use line-number runs instead, e.g. rspec ./spec_name.rb:34).
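Concretely, the PORO recommendation points at refactors like the following sketch, which extracts a calculation out of an oversized Person model into a small, focused object. All names and numbers here are hypothetical, not drawn from Hub; in a Rails app the extracted object could also `include ActiveModel::Model` for validations and form support:

```ruby
# Hypothetical sketch of a PORO extraction; names and numbers are illustrative.
class Person
  attr_reader :name, :billable_hours

  def initialize(name:, billable_hours:)
    @name = name
    @billable_hours = billable_hours
  end

  # Before the refactor, utilization math lived here,
  # alongside dozens of other concerns.
end

# After: the calculation lives in its own small object.
# In a Rails app you might `include ActiveModel::Model` here as well.
class UtilizationCalculator
  TARGET_HOURS = 32.0 # hypothetical weekly billable target

  def initialize(person)
    @person = person
  end

  def utilization
    (@person.billable_hours / TARGET_HOURS).round(2)
  end
end

person = Person.new(name: "Ada", billable_hours: 24)
UtilizationCalculator.new(person).utilization # => 0.75
```

The god class shrinks one responsibility at a time, and each extracted calculator is trivially unit-testable on its own.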

Where human judgment still matters

The audit was promising but not perfect. A couple of areas where the skill could be improved included:

  • Composition vs. wider design: recommendations leaned toward extracting small objects via composition, which aligns with Ruby Science. However, for real refactors, Chad reminded us to ask whether the leverage is alongside or above a hot spot before following the first suggested carve-out.
  • “Issues” that are really clean bills of health: for external API wrappers, the skill correctly said “no changes required,” yet still surfaced them under severity-report styling meant for problems. Chad noted that when onboarding to a new codebase, calling out “we looked here and it’s fine” is valuable, but the skill’s report template could be updated to separate clean passes from findings so readers are not misled.

Throughout, the audit was pretty consistent with how we begin client audits at thoughtbot: move fast for orientation, then validate each area with deep domain knowledge.

How a thoughtbot services audit goes deeper

The stream’s run was overwhelmingly static analysis using Claude Code, which finished in ~5 minutes. Compared with the 11–13 minutes Hub’s full test suite alone takes, this was a fair tradeoff for a first pass. Chad contrasted the skill with what we typically do on client Rails audits: run diagnostic tools like flog, flay, and Bullet (for N+1 queries), execute the test suite, and attach real coverage metrics, folding all of that into the written recommendations. Following this stream, Jose picked up on Chad’s recommendations and added agentic support for running SimpleCov and RubyCritic, which rolls up flog, flay, and other static analysis tools.
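If you want those coverage metrics locally, SimpleCov’s standard setup is a couple of lines at the very top of spec/spec_helper.rb. A minimal sketch (the vendor filter is an illustrative choice, not required); it must run before application code loads so coverage is tracked:

```ruby
# spec/spec_helper.rb — SimpleCov must start before the app code is required
require "simplecov"

SimpleCov.start "rails" do
  add_filter "/vendor/" # skip third-party code in the coverage report
end
```

Running the suite then writes an HTML report under coverage/ that the skill’s agentic pass can fold into its recommendations.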

Chad also imagined CI hooks down the road: not to block merges blindly, but to keep a living quality guide next to our style guide while maintaining expert human oversight.

Test it out yourself

If you use Claude Code and maintain a Rails app, clone rails-audit-thoughtbot, run a full or targeted audit, and see how the report reads on your codebase. We would love reactions on the YouTube replay as well as issues and PRs on GitHub.

rails-audit-thoughtbot is experimental. It is a starting point, not a finished product. We actively encourage open-source contributions: open an issue, or fork the repo and open a pull request with ways you think this Claude skill can work better for your team.

Get in touch

If your Claude Code skill audit surfaces more than your team has capacity for right now, get in touch with us. We have been auditing and reviewing Rails apps for a long time, and are excited to help prioritize what actually moves the needle for your business.

About thoughtbot

We've been helping engineering teams deliver exceptional products for over 20 years. Our designers, developers, and product managers work closely with teams to solve your toughest software challenges through collaborative design and development. Learn more about us.