
Drupal (AI) Playground: Building a Module

March 30, 2026

Falling in the playground

Using the metaphor of a playground for my AI Drupal development environment now feels completely fitting, based on my experience building a module using AI. Good playgrounds have a variety of structures that challenge kids of different ages and confidence levels, helping them develop their physical and social skills.

For example, most kids don't just run into a playground and immediately climb to the top of the monkey bars as their first move; yes, some daredevils will go straight there, and the overconfident ones will cry for help when they get stuck. My playground experience with AI was learning how to fall, get up, and try again. My obstacle was building a module using Claude Code. Like a kid on a first climb up the monkey bars, I expected to reach the top effortlessly, but partway up I faced reality: my hands got sweaty, and I looked down.

Unrealistic expectations

I had glorious expectations for my experience building a fairly complex module with Claude Code. I assumed that a fully documented module specification plan would guide Claude in creating a working solution.

Personally, I am not very skilled at writing requirements, specifications, and documentation. At best, I excel at writing self-documenting code, which is somewhat of a cop-out. For me, having a complete plan in place before starting implementation feels like a refreshing change. Creating better plans for AI coding agents will help me become a better mentor to humans.

Prompting a comprehensive plan

I wrote my module specification using Claude Chat. In my previous post about experimenting with agent skills, I shared an example module specification template.

So I provided Claude with a very simple prompt:

Assist me in creating a module to display all entity and field labels, export them to CSV, enable editing by users, and then allow uploading and importing the CSV. Use the attached example module specification template.

Claude and I spent several days refining the plan into a comprehensive module spec. Claude did most of the work, with me occasionally contributing a design or implementation pattern, so it makes sense to let Claude summarize the plan and share it here.

The entity_labels module is a Drupal 10/11 contributed module that provides a Reports page for viewing, exporting, and importing entity type and field labels via CSV.

It has two primary tabs — Entities (bundle labels/descriptions/help text) and Fields (field labels across all bundles, including base fields) — each with drill-down routing, scoped CSV export, and CSV import for bulk-updating config translations. Optional support is included for the field_group and custom_field (4.x) contrib modules. The build follows strict Drupal coding standards with full DI, typed signatures, and PHPUnit test coverage.

-- Claude

As we all know, things never go exactly as planned.

Having a plan is not equal to a working solution

I expected my module spec to be a "one-and-done" prompt that generated working code. Claude Code was eager to start working on the module, and it created all the expected files. All interfaces and services were implemented, and the unit tests technically passed. However, when I manually tested the module, it did not work as expected and threw fatal errors.

Guess what: I got results similar to overloading a developer with too many features and expectations, where if one thing goes wrong, everything falls apart.

AI loves to pass tests and knows how to cheat

Tests are the best way to push an AI to verify its work, but AIs inherently want to succeed, and they will cheat to pass their tests. In Claude Code's case, the unit tests passed because Claude mocked the entity types being exported and imported. In reality, the entity export code was not loading the correct entity definition and, as a result, was not targeting the proper entity properties. For Drupal developers: Claude was attempting to update the 'node_type' entity but was using the 'node' entity definition. Nodes use the 'title' property for the entity's label, while node types use the 'name' property. If Claude had written kernel tests, they would have exercised actual entity types and identified the correct entity definitions to use.
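In code terms, the fix is to ask each entity definition for its label key rather than assuming one. A minimal sketch using Drupal's entity type manager (the comments show the keys these two core definitions declare):

```php
<?php

// Resolve the label property from the entity definition instead of
// hard-coding it: 'node' and 'node_type' declare different keys.
$manager = \Drupal::entityTypeManager();

// Content entity: the Node definition declares label = 'title'.
$node_label_key = $manager->getDefinition('node')->getKey('label');

// Config entity: the NodeType definition declares label = 'name'.
$type_label_key = $manager->getDefinition('node_type')->getKey('label');
```

Mocking the wrong definition in a unit test hides exactly this difference, which is why the mocked tests passed while the real export broke.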

The solution was straightforward: I had Claude convert all the unit tests into kernel tests, which eliminated the mocked entities, so Claude would now catch any entity-related issues because it was working with real entities. With the right test type in place, Claude immediately fixed the issue.
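A kernel test along these lines (the class name and module list are illustrative, not the module's actual tests) forces the code to work against a real node type instead of a mock:

```php
<?php

namespace Drupal\Tests\entity_labels\Kernel;

use Drupal\KernelTests\KernelTestBase;
use Drupal\node\Entity\NodeType;

/**
 * Exercises label handling against real entities instead of mocks.
 */
class EntityLabelsExportTest extends KernelTestBase {

  /**
   * {@inheritdoc}
   */
  protected static $modules = ['system', 'user', 'field', 'text', 'filter', 'node'];

  public function testNodeTypeLabel(): void {
    // A real node type, not a mock: its label lives in the 'name'
    // property, a detail the mocked 'node' definition concealed.
    NodeType::create(['type' => 'article', 'name' => 'Article'])->save();
    $this->assertSame('Article', NodeType::load('article')->label());
  }

}
```

Because the entity actually round-trips through config storage here, reading the wrong property fails loudly instead of silently passing.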

My key lesson was that AI will find a way to "succeed" even when it is on the wrong path.

A minor mistake can explode into a mess

I shared a working code example for Drupal-based CSV file upload with Claude Code, but I forgot to mention that it depends on the File module being installed. During a functional test, Claude attempted to test the file upload feature, but it failed because the File module's API was unavailable. Instead of adding the File module as a dependency to the Entity/Field Labels module, Claude chose to implement the upload using Symfony's upload functionality, and it worked. Still, this isn't the right solution, and once again, a simple prompt telling Claude to add the File module as a dependency and use Drupal's file upload API began to untangle the mess.
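The dependency fix itself is a one-line change in the module's .info.yml. A sketch of what that declaration looks like (name and description are illustrative, not copied from the actual module):

```yaml
# entity_labels.info.yml (illustrative): declaring the File module
# as a dependency guarantees Drupal's file upload API is available.
name: Entity/Field Labels
type: module
description: View, export, and import entity type and field labels via CSV.
core_version_requirement: ^10 || ^11
dependencies:
  - drupal:file
```

With the dependency declared, Drupal refuses to install the module without the File module, so the API can never silently go missing again.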

Untangling the mess

I initially expected too much from my "one-and-done" prompt. I adjusted my expectations to the reality that AI codes like a junior developer, and both need achievable goals they can complete before moving on to the next task. After this initial setback, I rolled back a few features, including base field support and multilingual capabilities, to finally get a working POC module. Exhausted and somewhat discouraged, I uploaded the module to Drupal.org and took a day off from my project.

My plan was to revisit the challenge of having AI build a module using a different approach.

Changing my approach and LLM

Of course, while having Claude Code build and fix the module with a huge spec that filled the context window, I ran out of tokens. James Abrahams (yautja_cetanu) mentioned that he was getting good results with Codex without hitting usage limits. I was hesitant to try something new when I had just gotten comfortable with Claude Code. At the same time, with AI code generators, you can simply ask the tool itself to help you onboard and start using it. Right now, I'm only using the $20-per-month version of Claude Code, and upgrading to the $100 tier seems like a big commitment. Spending another $20 on a Codex subscription seemed like a sensible alternative.

Letting OpenAI's Codex enter my Drupal (AI) playground

Adding OpenAI's Codex to my playground felt like hiring another consultant to take over after the first one failed. Codex immediately recognized my existing agent skills. To maintain a single source of truth, I renamed CLAUDE.md to AGENTS.md and then created a symlink from CLAUDE.md back to AGENTS.md. Codex's UI and UX closely resemble Claude Code's; I suppose Codex copying Claude Code's UI/UX is a form of flattery.
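The rename-plus-symlink step looks like this when run from the project root (it assumes CLAUDE.md already exists there):

```shell
# Make AGENTS.md the single source of truth for both agents,
# then point CLAUDE.md back at it so Claude Code still finds it.
mv CLAUDE.md AGENTS.md
ln -s AGENTS.md CLAUDE.md
```

Both tools now read the same instructions, and editing either path updates the one real file.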

One of my first priorities was to have Codex review Claude's tests, which led to improved test coverage and fixes. I then asked Codex to suggest improvements, and it immediately recommended adding multilingual support, base field support, and finally Drush support. This time, instead of executing a single massive "one-and-done" plan, I decided to let Codex write a plan for each feature and share it on Drupal.org in dedicated issues.

Nudging Codex to contrib back to Drupal

While exploring Kanopi Studios' CMS Cultivator skills, I found Matt Glaman's drupalorg-cli tool, which provides CLI commands to fetch information and code from a Drupal.org project's issue queue and Git repository. The main, very reasonable limitation of the drupalorg-cli tool is that it cannot create a new issue or a code fork for an issue via an API call. I generated the issue summary using Codex, manually created the issue, and then forked the code. In Codex, I prompted it to use drupalorg-cli with the issue URL to check out the issue fork, re-review the issue's details, and begin working on the code.

Codex's contribution back to Drupal

Below are the three features I asked Codex to plan and build: multilingual support, base field support, and Drush support.

I had Codex create a plan by copying and pasting the issue summary template from Drupal.org. Of course, in hindsight, I should have used Kanopi's Drupal.org Issue Helper skills. I'm cautious about adding new skills and accidentally consuming extra tokens. I might need an agent skill to help my agent find new Drupal skills.

Codex vs Claude Code

I can't say that one is better than the other, but I see a lot of value in using both.

I did notice Codex avoiding overly long methods with nested loops and if/then statements. For example, when I used Codex to add a language switcher, it did so using a dedicated ::buildLanguageSwitcher() method that isolated its changes, making the new code easier to review and understand.

I had to do substantial manual cleanup of the initial mess created by Claude Code due to an unrealistic "one-and-done" prompt. Most of my cleanup focused on improving code clarity and reusability. My refactoring was greatly aided by the fact that the module had working tests. I want to emphasize that, toward the end of my experience, I had to manually tweak the generated code less and less.

I am starting to see that some experienced coders are admitting they review less of their agent's code and have shifted to having agents review each other's code. I specifically found that having both Codex and Claude Code review each other's test coverage prevents an AI tendency to just make the tests pass without adding meaningful coverage. Even when using the same agent, I would clear the current session/context and ask the agent to re-review its own work.

Wrapping up Codex's contribution to Drupal

Every Drupal.org project needs a project page, and a project should also have a logo. Instead of simply converting the module's README.md to HTML, I decided to experiment with creating a dedicated agent skill, called drupalorg-project-page, to build a project page that leverages components from the Drupal.org Bluecheese theme. I also added another agent skill, called drupalorg-project-logo-prompt, to generate prompts for LLMs that support image generation, enabling the creation of logos based on specifications posted on Drupal.org. I'm a little skeptical of LLM-generated logos, but Google's Nano Banana did an okay job creating a logo. For the project page, the most important thing I asked Codex to do was to call out that the module was created via AI.

Never submit code you don't understand

Dries Buytaert, the creator of Drupal, recently shared this advice. I believe it should become one of the main rules for the Drupal community regarding AI-generated contributions to Drupal. My belief, especially when starting to work with AI-generated code, is that a developer should carefully review every line and make manual adjustments as needed. Even though it felt like I was pair-programming with Claude Code and Codex, I have to accept that they are just tools.

Coming back to reality

At some point during this post, I decided to treat Codex as a co-creator of the Entity/Field label module. The reality is that Codex and Claude Code are very powerful tools that helped me contribute to Drupal faster and better than I expected.

Expect the unexpected with AI

I could share a long list of takeaways from my experience of building a module using AI. I believe the key takeaway is that people should try experimenting with AI and be prepared for the unexpected.

Returning to my playground metaphor, there's a big difference between watching someone swing back and forth and actually getting on the swing, learning how to swing, falling off, brushing yourself off, and getting back on.

Without further ado, here is the Entity/Field Labels module

Even though this post is about my experience building the Entity/Field Labels module using AI, I’d like to share with you the final result, which I hope you find useful. Feel free to help me improve this module with or without your AI assistant.