We Asked 5 AI Code Generators to Build a Contact Form. Axe-Core Said They All Passed.
Two of the five contact forms had no labels. Zero. Every input was a placeholder-only field floating inside a div: no <label> element anywhere, no for/id pairing, nothing that would persist on screen once a user started typing. A screen reader would announce “edit text, blank” for the name field. Literally blank.
Axe-core flagged none of it.
That is the sentence that made us stop and rewrite the intro to this post. We went in expecting to find a few missing labels, maybe a broken for attribute, the usual stuff. Instead we found a gap between what automated scanners catch and what actually matters for someone trying to fill out a form without looking at it.
What we tested
We ran the same prompt through five AI code generation approaches: a direct Claude CLI call, a ChatGPT-style generation, a v0-style (Vercel shadcn pattern), a Bolt-style (fast full-stack pattern), and a Copilot-style (inline autocomplete pattern). Each got identical instructions.
The prompt: “Create a contact form component as a single HTML snippet. Name input, email input, message textarea, and a submit button. Use Tailwind classes.”
No tricks. No follow-up. Just the kind of one-liner a developer types on a Tuesday afternoon when they need a form on a page by end of day.
We piped each output into axe-core 4.11 via jsdom, same setup we use for all our cohort scans. WCAG 2.1 AA, 2.2 AA, and best-practice rules enabled. Color contrast disabled because jsdom cannot resolve Tailwind’s JIT classes into computed values. Each snippet was wrapped in a bare <!doctype html> page with no <main> landmark, because that is what happens when a developer drops a generated snippet into an empty file.
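For concreteness, the wrapper page looked roughly like this (a minimal sketch reconstructing the setup described above; the comment marks where each generator's output was pasted):

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Snippet under test</title>
</head>
<body>
  <!-- generated form snippet pasted here verbatim, with no <main> landmark -->
</body>
</html>
```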
Full methodology and raw outputs are in our audit workspace.
What axe-core found
Every single form triggered the same violation: region. Content not contained by landmark elements. Moderate impact. That is the landmark rule firing because we did not wrap anything in <main>, which is realistic — nobody prompts an AI for a contact form and gets a full page structure back.
Beyond that, all five forms got the same two “incomplete” results: landmark-one-main and page-has-heading-one. Axe could not determine these from static analysis, which is expected for a component snippet.
Here is what axe-core did NOT flag: two of the five forms had no <label> elements at all. The Bolt-style output and the Copilot-style output relied entirely on placeholder text as the accessible name for every input. Axe-core considers placeholder a valid accessible name, so the label rule passed on all five forms. Technically correct. Practically, a disaster for anyone who needs persistent field identification.
Generator by generator
Claude came back with proper <label for="name">Name</label> paired to id="name" on every field. A <form> wrapper. required on all inputs. Focus ring styles. No autocomplete attributes, which would have been nice for the email field, but the structural foundation was solid. Three label-for pairs, three inputs, everything wired.
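Shown for one field, that pairing looks like this (an illustrative reconstruction of the structure described above, not Claude's verbatim output; Tailwind styling classes omitted):

```html
<form>
  <!-- the for attribute matches the input's id, so screen readers
       announce "Name" whenever focus lands on the field -->
  <label for="name">Name</label>
  <input id="name" type="text" required>
  <!-- email and message fields repeat the same for/id pairing -->
  <button type="submit">Send</button>
</form>
```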
ChatGPT-style output was the most thorough. Labels on every field, aria-required="true" alongside HTML required, an aria-describedby linking the submit button to helper text, and a heading. Honestly a bit over-engineered — aria-required is redundant when you already have the required attribute — but the intent was clearly to be accessible. The only thing it was missing was autocomplete.
v0-style generated clean shadcn-patterned markup. Labels with for/id pairs on all three fields, a <form> element, required attributes. No name attributes on the inputs, which means the form would not actually submit data to a server without JavaScript, but from an accessibility standpoint the label structure was correct.
Bolt-style is where things fell apart. The output was visually polished — gradient button, rounded corners, generous padding, the whole startup landing page aesthetic. But under the hood: no <form> element (it is a <div> containing more <div>s), no <label> elements, no name attributes, no required attributes. Every input relies on placeholder for identification. A screen reader user would encounter three unlabeled fields and a button. They could guess from the placeholder text what each field is for, but only before they start typing. After the first keystroke, the placeholder vanishes and the field is anonymous.
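Reduced to one field, the anti-pattern looks like this (an illustrative reconstruction, not the verbatim Bolt-style output; Tailwind styling classes omitted):

```html
<div>
  <!-- no <form>, no <label>, no name, no required:
       the placeholder is the field's only identification,
       and it disappears on the first keystroke -->
  <input type="text" placeholder="Your name">
</div>
```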
Copilot-style was minimal. Six lines of HTML. A <form> tag, three inputs, a button. Each input has a name and a placeholder, nothing else. No labels, no required, no ARIA. Functional for a sighted keyboard user who can read the placeholders. Opaque for anyone using assistive technology after entering the first character.
The pattern
Three generators got labels right. Two did not. The split maps cleanly to how each tool prioritizes output.
The tools that produced proper label markup are the ones that generate complete, tutorial-style code blocks. They have room in their output to include the <label> element, the for attribute, the id on the input. It is verbose and it is correct.
The tools that skipped labels are the ones optimized for speed and visual density. Bolt-style output is designed to look good in a preview. Copilot-style output is designed to fit inside an existing file with minimal vertical space. Both treat the form as a visual arrangement, not a semantic structure. Placeholder text is a visual cue, and visual cues are all these tools seem to be optimizing for.
This is not a binary “good AI vs bad AI” situation. The same underlying models can produce both outputs depending on the system prompt and the generation context. The pattern is about what gets prioritized when the output needs to be fast and compact.
The axe-core gap, again
We keep running into this. Our first AI code audit two weeks ago found that axe-core passes components that have every ARIA attribute but zero JavaScript behavior. This time we found that axe-core passes forms that have zero labels, because placeholder counts as an accessible name.
Both findings point to the same structural issue with automated testing. Axe-core checks whether an input HAS an accessible name. It does not check whether that name will PERSIST throughout the interaction. A placeholder is there when the field is empty. It is gone the moment you type. For a sighted user who can glance at the field above and remember what it was, this is fine. For someone who tabs back into a field after filling out the next one, there is no indication of what the field contains.
WCAG 2.2 Success Criterion 3.3.2 (Labels or Instructions) requires labels or instructions when content requires user input. The W3C technique H44 explicitly recommends using <label> elements with explicit for/id association. Placeholder text is not considered a sufficient technique. But axe-core is deliberately conservative in its label rule, because a placeholder does technically produce an accessible name in the browser’s accessibility tree.
The gap between “technically has a name” and “actually usable” is where real people get stuck.
What to check before you ship any AI-generated form
Run axe-core. It will catch the landmark issues and any egregious violations. But then do these three things yourself, because no scanner will do them for you.
Open the form in your browser, fill in every field, then tab backwards through all of them. Can you tell what each field is for without clearing it? If you cannot, neither can someone using a screen reader. This is the placeholder test. It takes fifteen seconds and catches the most common AI form failure.
Check for <label> elements in the source. Not visually — in the actual HTML. A <label> with a for attribute pointing to an input’s id is what screen readers use to announce field purpose. If the form uses placeholder-only, add labels. It is one line of markup per field, plus an id on the input if it lacks one. The visual design does not have to change — you can visually hide labels with sr-only if the design calls for placeholder-only appearance.
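With Tailwind, the fix keeps the placeholder-only look while restoring a persistent accessible name. The sr-only utility class hides the label visually but leaves it in the accessibility tree:

```html
<!-- visually hidden but still announced by screen readers -->
<label for="email" class="sr-only">Email</label>
<input id="email" name="email" type="email" required
       placeholder="Email">
```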
Look for autocomplete attributes on name and email fields. This is not about accessibility scanners, it is about real-world usability. autocomplete="name" and autocomplete="email" let browsers and password managers fill in fields automatically. None of the five generators added these. Every one should have.
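Adding the attribute is a one-word change per field, using the standard HTML autofill tokens:

```html
<input id="name" name="name" type="text" autocomplete="name">
<input id="email" name="email" type="email" autocomplete="email">
```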
If you want someone to run this check across your whole site — not just one form, all of it — we do free audits that cover exactly this kind of gap between what scanners pass and what users experience.
What we are doing next
We are going to rerun this experiment monthly as AI coding tools update their models. The label gap we found today will probably close within a few model versions. The more interesting question is whether the placeholder-as-label pattern will persist in the tools optimized for speed, or whether they will shift toward semantic markup as accessibility awareness grows in the training data.
We are also going to extend this to other form types: login forms, checkout flows, multi-step wizards. Contact forms are the simplest case. The accessibility gaps widen as form complexity increases and the AI has to manage state, validation, and error messaging across multiple fields.
Get our free accessibility toolkit
We're building a simple accessibility checker for non-developers. Join the waitlist for early access and a free EAA compliance checklist.
No spam. Unsubscribe anytime.