AI in Focus is our ongoing livestream series for thoughtful product leaders who want to get practical with AI. In our first stream of 2025, Chad Pytel and Matheus Richard replace a locally running version of Llama with a local version of the DeepSeek R1 model in a real Rails app, using Ollama and the ruby-openai gem. You can watch the full replay on YouTube.
Why DeepSeek R1?
When DeepSeek R1 was released, it didn’t just make waves in the dev community; it nudged the stock market too. That’s because it’s a reasoning model with open weights, high performance, and relatively low hardware requirements.
In a world dominated by closed-source models, DeepSeek R1 represents a growing trend: more open, cheaper, and locally runnable models.
At thoughtbot, we believe that AI will follow the general trend of technology: getting faster, better, and more open over time. DeepSeek R1 is early but strong evidence of that shift.
Local-first AI development
In the stream, we walked through:
- How to run DeepSeek R1 locally using Ollama
- Why it’s important to be able to swap out models easily in development and test environments
- The difference between reasoning LLMs (like DeepSeek R1 and GPT-4) and traditional LLMs, and how R1’s `<think>` tags expose its internal reasoning steps in a way other models don’t
We saw for ourselves how this visibility into the LLM’s reasoning makes it easier to understand, debug, and improve prompts, which is a big deal for teams experimenting with building AI-powered features.
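To make that concrete, here’s a minimal Ruby sketch, not code from the app, of how you might separate R1’s `<think>` reasoning from its final answer once you have the raw completion text; the `split_reasoning` helper is our own illustrative name.

```ruby
# Minimal sketch: split a DeepSeek R1 completion into its visible reasoning
# and the final answer. Assumes the raw completion text is already in hand.
def split_reasoning(raw_completion)
  reasoning = raw_completion[/<think>(.*?)<\/think>/m, 1].to_s.strip
  answer = raw_completion.sub(/<think>.*?<\/think>/m, "").strip
  { reasoning: reasoning, answer: answer }
end

result = split_reasoning("<think>The user wants a short status update...</think>Project X is on track.")
result[:answer]    # => "Project X is on track."
result[:reasoning] # => "The user wants a short status update..."
```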
Testing out DeepSeek
In our previous AI in Focus livestreams, we developed a `SummaryGenerator` class that used a locally running version of the Llama open-source LLM to create project summaries in one of thoughtbot’s in-house Rails apps.
For this episode, we pulled the 32B version of DeepSeek R1 from the Ollama library, weighing in at around 20GB: large, but still runnable on a modern laptop.
Then, by changing one constant in our `Prompt` class, which is backed by the ruby-openai client, we were able to swap in and start using DeepSeek R1.
```ruby
class Prompt
  include ActiveModel::Model

  MODEL = "deepseek-r1:32b".freeze
  MAX_TOKEN_COUNT = 8000
  OLLAMA_HOST = "http://127.0.0.1:11434".freeze

  # ...
end
```
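For context, here’s a rough sketch of how a class like this can talk to a local Ollama server through the ruby-openai gem’s OpenAI-compatible interface; the prompt wording and the way our app wires this up differ, so treat it as illustrative rather than the exact code from the stream.

```ruby
require "openai"

# Illustrative only: point the ruby-openai client at Ollama's OpenAI-compatible
# endpoint and ask the local DeepSeek R1 model for a chat completion.
client = OpenAI::Client.new(
  uri_base: "http://127.0.0.1:11434/v1", # local Ollama server
  access_token: "ollama"                 # placeholder; Ollama doesn't check it
)

response = client.chat(
  parameters: {
    model: "deepseek-r1:32b",
    messages: [{ role: "user", content: "Summarise this project in two sentences: ..." }]
  }
)

puts response.dig("choices", 0, "message", "content")
```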
Our Rails app already had a background job that generated summaries of CRM opportunities using an LLM. By switching the model name, we immediately started generating those summaries using DeepSeek R1 instead of Llama.
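As a rough illustration of that job’s shape (the class, column, and method names here are hypothetical, not the app’s actual code):

```ruby
# Hypothetical sketch of the summary-generating background job; the real app's
# class and method names differ.
class OpportunitySummaryJob < ApplicationJob
  queue_as :default

  def perform(opportunity)
    summary = SummaryGenerator.new(opportunity).generate
    opportunity.update!(latest_summary: summary)
  end
end
```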
Responsible AI
As always, we’re building in alignment with our AI Ethics Guide. We think AI should be trustworthy, explainable, and designed to serve real human needs. Not just buzzwords.
With this in mind, our Rails app tracks every summary generated by the LLMs and stores the full prompt used (including additional per-record instructions), creating a full audit trail. This helps us with transparency and human control, two of our key AI Ethics principles. Bonus: thanks to DeepSeek R1’s reasoning visibility, we logged not only the summary output but also the model’s internal thought process, captured in the API response via its `<think>` tags.
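Persisting that trail could look something like the sketch below, assuming a `GeneratedSummary` model with prompt, content, and reasoning columns; all of these names are assumptions for illustration, not the app’s schema.

```ruby
# Hypothetical persistence step for the audit trail; the GeneratedSummary model,
# its columns, and this helper are illustrative assumptions.
def record_summary!(opportunity, full_prompt, raw_completion)
  parts = split_reasoning(raw_completion) # see the earlier sketch

  GeneratedSummary.create!(
    opportunity: opportunity,
    prompt: full_prompt,          # exact prompt sent, including per-record instructions
    content: parts[:answer],      # the summary shown to humans
    reasoning: parts[:reasoning]  # text captured from R1's <think> tags
  )
end
```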
So how did it do?
Granted, we didn’t run Llama and DeepSeek side by side (a suggestion from our audience for another live session), but we found that DeepSeek R1 performed as well as, if not better than, Llama at this summary generation task. It was accurate, with no hallucinations, and it correctly summarised the concurrent project duration, something Llama struggled with.
Faster, cheaper, and (maybe) greener?
One theme that came up in the livestream is how we expect AI to keep getting faster and cheaper. That’s exciting not just from a cost and speed perspective, but also because it connects to a bigger conversation happening in tech right now: AI’s energy usage.
A lot of people are asking, “Is AI too power-hungry to scale responsibly?” As these models become more efficient and accessible, we might see more sustainable AI patterns emerge alongside the product and cost benefits.
Internal tools first, public features second
At thoughtbot, we’ve also noticed a pattern across client projects: AI often starts with internal tooling, such as automating a workflow, assisting internal teams, or generating summaries and answers. These internal use cases sit in lower-risk, higher-trust environments, which makes them great for experimentation. Once clients see consistent value, they start exploring how to extend those capabilities into customer-facing features.
We also had some fun
- We asked DeepSeek R1 to translate “It’s raining cats and dogs” into Portuguese. It reasoned its way to “chovendo a potes” (though it might be more common in Portugal than Brazil).
- We revisited an old challenge: counting the number of R’s in “strawberry.” Even with visible reasoning, our locally running 32B version of R1 twisted itself in knots and concluded the answer was 2, a reminder to always verify LLM-generated output!
We hope you’ll join us for the next episode. Register here to stay in the loop and never miss a session of live coding on AI in Focus.
Exploring AI for your product?
If you’re considering how AI can enhance your product, whether it’s internal tooling or a customer-facing feature, get in touch with thoughtbot. We’d love to help you explore what’s possible and build something real.