How to Migrate a Codebase from Python to Go Using an AI Coding Assistant
by Dr. Phil Winder, CEO
I recently migrated a production Python codebase to Go using Claude Code as the primary coding agent. The codebase was Kodit, an MCP server and CLI tool for indexing code repositories. It had a well-structured Python backend with a FastAPI server, SQLAlchemy ORM, Pydantic models, and a search pipeline combining BM25 and vector embeddings.
Why migrate? The main reason was to incorporate Kodit as a library into Helix, to provide tighter integration. Another important reason was that I had originally assumed Kodit itself would manage and run the models it requires, but in practice most people delegated embedding and generation requests to an external API. That meant I didn't need any NVIDIA or HuggingFace dependencies, and I sorely missed the production speed and quality of Go. Go also offered simpler deployment (a single binary), stronger type safety, and better performance for the search-heavy workload. But these reasons are secondary. The real question is: can an AI coding assistant do the heavy lifting of a cross-language migration?
The answer is yes, with caveats. This article is a field report on the methodology, the automation, and, most importantly, what went wrong.
The Methodology
AI is only as good as the instructions it receives. Before generating a single line of Go, I invested the majority of my time structuring the migration project and writing two design documents.
The setup is a monorepo:
```
migration-project/
├── python-source/   # Original Python codebase
├── go-target/       # New Go implementation
├── CLAUDE.md        # Migration instructions and domain context
└── MIGRATION.md     # Checklist and progress tracking
```
CLAUDE.md is a design document that encodes everything the AI needs to know. Domain context, bounded contexts, a ubiquitous language glossary, Python-to-Go translation rules, coding standards, project structure, and workflow constraints. Think of it as the technical specification you would hand to a new team member before they touch the code.
MIGRATION.md is an ordered task list with dependency tracking. Each item maps a Python source file to its Go target, lists what must be migrated first, and includes verification checkboxes. This file serves a dual purpose: it gives the AI stateless resumption across sessions (it reads the file at the start of each run to determine what to do next), and it gives the human a dashboard of progress.
The upfront investment in these two files is the single most important factor in the success of this migration. Without them, the AI generates plausible but wrong code. The source repository is a vital encoding of months or years of domain knowledge. We need to preserve this knowledge in the design documents and hopefully leverage it to guide the AI.
Phase 1: Discovery
Before writing the design files, I ran a structured discovery process. This is where you teach the AI your domain.
I used a sequence of targeted prompts, each focused on a single concern:
- Codebase scan: total file count, directory structure, entry points, test setup, key dependencies
- Bounded contexts: identify logical groupings that map to domain boundaries
- Ubiquitous language: extract domain vocabulary from class names, method names, docstrings, and test names
- Pattern mapping: catalogue Python-specific patterns (decorators, ABCs, exception hierarchies, duck typing) with suggested Go equivalents
- Dependency mapping: map every external library to its Go counterpart
- Repository analysis: understand the data access layer in detail
- Event mapping: find domain events, publishers, and handlers
I tried both a one-shot prompt to generate the design files directly and a two-pass approach. The two-pass approach worked better: run each discovery prompt individually, dump findings into a notes file, then generate CLAUDE.md and MIGRATION.md from the accumulated notes. The one-shot attempt produced shallower results because the AI was trying to both discover and synthesise simultaneously.
A critical point: the discovery phase is where you teach the AI your domain. Skip this and you get generic, structurally correct code that misses the nuances of your system; it will likely repeat the same mistakes, and need the same fixes, that were already made and resolved in the source repository. Invest here and the code generation phase becomes dramatically more reliable.
Phase 2: Design Files
With discovery complete, I generated the design files. The key sections of CLAUDE.md were:
Translation rules. A table mapping Python patterns to their Go equivalents. Classes to structs, @dataclass to structs with NewFoo() constructors, try/except to if err != nil, list comprehensions to loops with append. Having these rules explicit prevents the AI from improvising different translations across files.
| Python | Go | Notes |
|---|---|---|
| class Foo(Base): | Embed Base or implement interface | Prefer composition |
| @dataclass | type Foo struct {} + NewFoo() | Constructor returns error for validation |
| raise ValueError() | return fmt.Errorf(...) | Wrap with context |
| try/except | if err != nil {} | Check immediately |
| Optional[T] | *T or T, bool return | |
| list comprehension | Loop with append | |
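To make the table concrete, here is a minimal sketch of what these rules produce. The types and fields are invented for illustration rather than taken from Kodit, but the shape is what the AI was instructed to generate:

```go
package domain

import "fmt"

// Index is a hypothetical domain type standing in for a Python
// @dataclass that validated its fields in __post_init__.
type Index struct {
	URI       string
	NumFiles  int
	LastError *string // Optional[str] becomes a pointer
}

// NewIndex follows the "@dataclass -> struct + NewFoo()" rule:
// construction validates and returns an error instead of raising ValueError.
func NewIndex(uri string, numFiles int) (*Index, error) {
	if uri == "" {
		return nil, fmt.Errorf("new index: uri must not be empty")
	}
	if numFiles < 0 {
		return nil, fmt.Errorf("new index: invalid file count %d", numFiles)
	}
	return &Index{URI: uri, NumFiles: numFiles}, nil
}

// IndexAll shows the caller side: try/except becomes an immediate
// error check, and a list comprehension becomes a loop with append.
func IndexAll(uris []string) ([]*Index, error) {
	indexes := make([]*Index, 0, len(uris))
	for _, uri := range uris {
		idx, err := NewIndex(uri, 0)
		if err != nil {
			return nil, fmt.Errorf("indexing %s: %w", uri, err)
		}
		indexes = append(indexes, idx)
	}
	return indexes, nil
}
```

Having the constructor return an error rather than panic is what keeps the raise/try/except rules consistent from file to file.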
Coding standards. I included object-oriented design principles: name classes for what they are, not what they do; prefer composition over inheritance; keep interfaces small; make classes immutable. These constraints steer the AI away from generating sprawling structs with dozens of methods.
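As a rough illustration of the style those constraints push towards (the names here are hypothetical, not Kodit's actual types): a single-method interface, an immutable value type, and composition instead of inheritance.

```go
package search

import "context"

// Searcher is a deliberately small, single-method interface,
// named for what implementations are, not for what they do internally.
type Searcher interface {
	Search(ctx context.Context, query string, limit int) ([]Result, error)
}

// Result is immutable: fields are unexported and set only at construction.
type Result struct {
	path  string
	score float64
}

func NewResult(path string, score float64) Result {
	return Result{path: path, score: score}
}

func (r Result) Path() string   { return r.path }
func (r Result) Score() float64 { return r.score }

// HybridSearcher composes two Searchers rather than inheriting from a base
// class; its own Search method would fuse the two ranked result lists.
type HybridSearcher struct {
	keyword Searcher
	vector  Searcher
}

func NewHybridSearcher(keyword, vector Searcher) *HybridSearcher {
	return &HybridSearcher{keyword: keyword, vector: vector}
}
```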
Project structure. The exact directory layout, layer rules (domain has no infrastructure imports), file naming conventions, and package naming. This section eliminates an entire class of structural decisions the AI would otherwise make inconsistently.
Migration workflow. Explicit instructions for maintaining MIGRATION.md: check the file before starting, mark items complete after finishing, update the session log. This turns the AI into a disciplined worker that tracks its own progress.
The design files are living documents. I updated CLAUDE.md several times during the migration as I discovered patterns the AI was handling poorly or conventions it was not following.
Phase 3: Automated Migration
I started with a manual pass. I ran a single migration task interactively, reviewed the output, corrected the design files where needed, and verified the workflow. This first pass is essential. It confirms the AI understands your patterns before you let it run unsupervised.
Once satisfied, I automated the loop with a bash script:
```bash
#!/bin/bash
set -e

PROMPT='Continue the Python-to-Go migration. Follow this workflow:
1. Read MIGRATION.md and report current phase and next task
2. For each task: read Python source, write Go test first,
   write implementation, verify with go build/test/lint
3. Mark checkbox complete in MIGRATION.md
4. Before stopping, update the session log
Start by reading MIGRATION.md and giving me the Session Start Report.'

ITERATION=0
while true; do
  ITERATION=$((ITERATION + 1))
  echo "=== Migration Loop - Iteration $ITERATION ==="

  claude --print "$PROMPT" || {
    echo "Claude exited with error. Retrying in 10s..."
    sleep 10
  }

  if ! git diff --quiet || [ -n "$(git ls-files --others --exclude-standard)" ]; then
    git add -A
    git commit -m "migration: automated session $(date '+%Y-%m-%d %H:%M:%S')"
  fi

  sleep 5
done
```
The script runs Claude Code in --print mode (non-interactive), auto-commits after each session, and loops. MIGRATION.md provides stateless resumption: each new session reads the file, finds the next unchecked task, and continues from where the previous session left off.
After roughly two hours of unattended execution, the AI had completed all tasks in the migration checklist. Thousands of lines of Go code, generated from Python, with tests.
Then the real work began.
What Went Wrong
This is the most valuable section. The migration script completed successfully. The tests passed. The code compiled. And the system did not work correctly. Here is what I found.
Dead Code Accumulation
During refactoring, the AI leaves orphaned code behind. Functions that are no longer called, types that are no longer used, entire files that serve no purpose. The problem is insidious because Go’s import system masks it: dead internal code is imported by other dead internal code, creating little islands that appear to be in use.
I discovered this halfway through the project when I refactored the code from the internal/ package to a public API. For days afterwards I was finding orphaned code. When I eventually asked Claude to check for dead code itself, it found more. This is a recurring problem with AI-generated refactoring and it needs tooling support, not just vigilance.
Architectural Drift
The AI placed everything in Go’s internal/ directory because that is the idiomatic default. But I needed a public Go API so that other projects could import Kodit as a library. I had not specified this in the initial design. The lesson is simple: be explicit about public API design upfront. The AI will follow the path of least resistance, which is usually the language’s default conventions.
Or in other words, this is another case of “don’t ask, don’t get.”
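Being explicit might look something like the following sketch of a public facade package. The package path, types, and methods are hypothetical; the point is that consumers such as Helix import only this package, while the implementation stays under internal/.

```go
// Package kodit sketches the public, importable surface of the library.
// The real implementation lives under internal/ and is never imported
// directly by external projects. All names here are illustrative.
package kodit

import (
	"context"
	"errors"
)

// ErrMissingBaseURL is returned when a Client is constructed without a target.
var ErrMissingBaseURL = errors.New("kodit: base URL is required")

// SearchResult is the public view of a search hit.
type SearchResult struct {
	Path    string
	Content string
	Score   float64
}

// Client is the single entry point external projects construct.
type Client struct {
	baseURL string // in the real library this would wrap internal services
}

// New keeps construction and validation in one place so the public
// surface stays small and stable.
func New(baseURL string) (*Client, error) {
	if baseURL == "" {
		return nil, ErrMissingBaseURL
	}
	return &Client{baseURL: baseURL}, nil
}

// Search would delegate to the internal search pipeline.
func (c *Client) Search(ctx context.Context, query string) ([]SearchResult, error) {
	return nil, errors.New("not implemented in this sketch")
}
```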
Context Window Limits
Some refactoring tasks are inherently large. When I asked Claude to add pagination to all list endpoints simultaneously, it missed half of them on the first attempt. On the second attempt it ran out of context window and started compacting, losing track of the remaining work. In one case it even created code that listed all records then paginated in memory.
The solution is to break cross-cutting concerns into smaller, explicit tasks with a plan. Claude Code has an effective planning tool, and it can run discrete sub-agents to execute that plan. Another way is to maintain a to-do list and ask the AI to check it off as it completes tasks.
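It also helps to state the desired shape of the change in the task itself. For the pagination work, that means spelling out that the limit and offset must reach the database query, roughly as in this hypothetical sketch, rather than loading every record and slicing in memory:

```go
package store

import "context"

// Page describes limit/offset pagination that is passed all the way
// down to the query (LIMIT/OFFSET in SQL), never applied in memory.
type Page struct {
	Limit  int
	Offset int
}

// Enrichment is a placeholder record type for this sketch.
type Enrichment struct {
	ID      int64
	Content string
}

// EnrichmentStore is a hypothetical store interface. The important detail
// is that List accepts the Page, so the database does the limiting.
// The anti-pattern the AI produced on one attempt was the opposite:
// fetch every row, then slice the result in Go.
type EnrichmentStore interface {
	List(ctx context.Context, page Page) (items []Enrichment, total int, err error)
}
```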
Phantom Features
This was the most frustrating failure. In an earlier version of the Python codebase, I had deprecated a snippets table by merging its data into a unified enrichments table. But remnants of the old schema remained in the code: references, comments, type hints. The AI saw these, recognised the domain importance of “snippet”, and rebuilt the entire feature using the old database design.
I only discovered the problem when I tested migrating real data from the Python version to the Go version. Searches returned zero results because data was being written to the wrong table. The fix required a substantial refactor touching a large number of files, because “snippet” is a core domain concept.
The lesson: clean up dead references before migration. If deprecated code or schema references exist anywhere in the codebase, the AI will find them and use them.
Missing Integration Tests
Unit tests passed. The code compiled. But the system did not work end-to-end. The AI never ran the application as a whole. It verified each component in isolation but never wired them together and hit the API.
When I eventually ran the full system and compared search results between the Python and Go versions side by side, I found the Go version was returning completely wrong results. The causes were a combination of incorrect truncation when converting embeddings to VectorChord format, an indexing error in the reciprocal rank fusion output, and the AI reading the wrong embedding from disk (it had added functionality I had not asked for).
Interestingly, during this debugging Claude noticed that the system was using L2 distance rather than cosine distance for vector search, which was probably affecting results in the original Python version too. A good example of how AI assistants can surface subtle issues that humans miss.
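For reference, reciprocal rank fusion is simple enough that an off-by-one in the rank is easy to introduce and hard to catch with unit tests on a single component. A minimal sketch of the fusion step (not Kodit's actual implementation) looks like this:

```go
package fusion

import "sort"

// rrfK is the conventional smoothing constant from the original RRF paper.
const rrfK = 60.0

// FuseRRF combines several ranked result lists (document IDs, best first)
// using reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank(d)).
// Ranks are 1-based; starting them at 0 is exactly the kind of indexing slip
// that per-component unit tests will not catch but that skews fused results.
func FuseRRF(rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			rank := float64(i + 1) // 1-based rank
			scores[id] += 1.0 / (rrfK + rank)
		}
	}

	fused := make([]string, 0, len(scores))
	for id := range scores {
		fused = append(fused, id)
	}
	sort.Slice(fused, func(a, b int) bool {
		if scores[fused[a]] != scores[fused[b]] {
			return scores[fused[a]] > scores[fused[b]]
		}
		return fused[a] < fused[b] // deterministic tie-break
	})
	return fused
}
```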
What I Would Do Differently
If I were to run this migration again, I would change the following:
- Mandate smoke tests from the start. Add a section in MIGRATION.md requiring the application to be started and the APIs to be manually (or automatically) tested after each major phase. Unit tests alone are not enough.
- Define the public Go API before generating any code. Decide which packages are exported, what the client interface looks like, and document this in CLAUDE.md. Do not let the AI default to internal/ for everything.
- Add configuration management rules. Specify that configuration should be set, defaulted, validated, and logged in a single place. Without this constraint, the AI scattered defaults and overrides across downstream code.
- Include an OpenAPI spec comparison step. I eventually added a test to verify parity between the Python and Go API specifications. This should have been there from the start (see the sketch after this list).
- Run a data migration test early. Testing with real data from the old system, not just synthetic test data, would have caught the phantom snippets problem weeks earlier.
- Prompt the AI to check for dead code after each refactoring session. Make this an explicit step in the workflow, not something you remember to do after the damage is done.
- Design the Go client API as a first-class concern. The AI tended to create massive structs and deep store-level accessors because the client was an afterthought. Copy well-established Go client library patterns and make them a priority in the design.
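For the OpenAPI comparison, the parity test can be as small as the following sketch: it loads the spec exported by each implementation (the file names here are placeholders) and fails when one side exposes a path the other does not.

```go
package api_test

import (
	"encoding/json"
	"os"
	"testing"
)

// openAPIPaths loads an OpenAPI JSON document and returns the set of its path keys.
func openAPIPaths(t *testing.T, filename string) map[string]bool {
	t.Helper()
	raw, err := os.ReadFile(filename)
	if err != nil {
		t.Fatalf("reading %s: %v", filename, err)
	}
	var spec struct {
		Paths map[string]json.RawMessage `json:"paths"`
	}
	if err := json.Unmarshal(raw, &spec); err != nil {
		t.Fatalf("parsing %s: %v", filename, err)
	}
	paths := make(map[string]bool, len(spec.Paths))
	for p := range spec.Paths {
		paths[p] = true
	}
	return paths
}

// TestOpenAPIParity compares the Python and Go specs. Both files would be
// exported as part of the build; the paths below are illustrative.
func TestOpenAPIParity(t *testing.T) {
	pythonPaths := openAPIPaths(t, "testdata/openapi-python.json")
	goPaths := openAPIPaths(t, "testdata/openapi-go.json")

	for p := range pythonPaths {
		if !goPaths[p] {
			t.Errorf("path %s exists in the Python spec but not in the Go spec", p)
		}
	}
	for p := range goPaths {
		if !pythonPaths[p] {
			t.Errorf("path %s exists in the Go spec but not in the Python spec", p)
		}
	}
}
```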
Conclusion
An AI coding assistant can generate thousands of lines of correct, idiomatic Go from a Python codebase. The translation rules, pattern tables, and structural conventions in the design files produce remarkably consistent output.
But you need to write design documents, review architecture, and build the right test harness. The critical insight is that the AI is a powerful but literal executor: fast, tireless, and incapable of questioning whether the instructions are complete. It will follow your design to the letter, which means the gaps in your design become the bugs in your system.
Treat the AI as a capable junior engineer. Give it clear instructions, verify its work at the boundaries, and invest your time where it cannot: in the design decisions that shape the whole system.