2026-02-15
Names are a key aspect of one’s identity. Too many people have dealt with software systems where entering their name does not go quite right. A character is rejected. The order feels wrong. A name is truncated, reformatted, or “corrected” into something else entirely. While these may seem like small details in the software, they can have big consequences for the person whose name is misrepresented. People with accented characters in their names often find systems rejecting those names outright. Hyphenated names can be split incorrectly, creating mismatched records. Even simple truncation of long names can force repeated corrections on official documents or forms. For anyone affected, what seems like a small software quirk can quickly turn into a real headache.
This problem is not new. In 2010, Patrick McKenzie published a blog post titled “Falsehoods Programmers Believe About Names” that describes 40 incorrect assumptions software engineers commonly make about names. They cover many aspects of a name: how many names a person can have, what characters they can contain, when a name is assigned, and more. Here is a small selection of them:
1. People have exactly one canonical full name.
6. People’s names fit within a certain defined amount of space.
7. People’s names do not change.
9. People’s names are written in ASCII.
32. People’s names are assigned at birth.
Each of these assumptions has counterexamples that prove it wrong. For assumption 7, a name can change at marriage, among other occasions. For assumption 32, there are cultures, such as Iceland’s, where a name is not assigned at birth but within six months of it. See the blog post by Tony Rogers for more examples that disprove these assumptions.
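To make these falsehoods concrete, here is a small sketch of the kind of schema that bakes several of them in at once. The field names, length limit, and validation rule are hypothetical, but the pattern will look familiar to anyone who has struggled with a web form:

// Hypothetical schema embodying several of the falsehoods above.
interface NaiveUser {
  firstName: string; // falsehood 1: every name splits cleanly into "first" and "last"
  lastName: string;
}

const MAX_NAME_LENGTH = 50; // falsehood 6: names fit within a defined amount of space

function isAcceptedName(name: string): boolean {
  // falsehood 9: only plain ASCII letters are accepted, so "Zoë", "Nguyễn",
  // or "O’Brien" (with a typographic apostrophe) are all rejected
  return name.length <= MAX_NAME_LENGTH && /^[A-Za-z' -]+$/.test(name);
}

Every line of this sketch looks harmless on its own, which is exactly how these assumptions keep slipping into production systems.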
In an age where GenAI is at nearly every programmer's fingertips, one might expect these assumptions to be a thing of the past. When we ask an AI to create an API for names, it should avoid most, if not all, of these pitfalls. We can test this with a prompt such as the following:
Design a general-purpose API for storing and manipulating human names.
It gives an answer containing the following data model:
{
  "id": "uuid",
  "given_names": ["John", "Michael"],
  "family_names": ["Doe"],
  "prefix": "Dr.",
  "suffix": "Jr.",
  "preferred_name": "Johnny",
  "nicknames": ["Jack"],
  "display_name": "Dr. John Michael Doe Jr.",
  "locale": "en-US",
  "metadata": {
    "gender_hint": "male",
    "transliterations": {
      "jp": "ジョン・マイケル・ドー"
    }
  }
}
This gets a surprising number of things right. The model supports multiple given and family names, recognizes prefixes and suffixes, allows for nicknames and preferred names, and even accounts for locale-specific formatting and transliterations. Some assumptions still sneak in: fields like given_names and family_names imply a “first and last” structure, which does not apply universally.
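One way to soften that remaining assumption, in the spirit of McKenzie’s essay, is to treat the name as the person writes it as the source of truth and make any structured breakdown optional. A rough sketch of what that could look like, with illustrative field names that are not part of the AI’s answer:

// Structure-agnostic alternative: the full name string is authoritative;
// structured parts are optional annotations, not requirements.
interface PersonName {
  fullName: string;            // the name exactly as the person writes it
  parts?: {                    // optional breakdown, only when it actually applies
    givenNames?: string[];
    familyNames?: string[];
    prefix?: string;
    suffix?: string;
  };
  preferredName?: string;
  sortAs?: string;             // explicit collation key instead of guessing from a "last name"
  locale?: string;             // formatting hint only, e.g. "en-US" or "ja-JP"
}

The point of the sketch is not that this is the one correct design, but that the structured fields are opt-in: someone with a mononym or a name that does not split into given and family parts is not forced into a shape that does not fit.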
Even with a very short, simple prompt, the AI can avoid many common pitfalls identified in McKenzie’s essay. This demonstrates that when names are the explicit focus, generative AI can synthesize decades of lessons from software design and produce a thoughtful, flexible API.
So now we’re good, right? With AI handling names this well, it seems like the old mistakes that software engineers made might finally be behind us. Surely, AI can just get it right, every time.
Unfortunately, things get less rosy when names are only an implicit part of the prompt or the system is not focused on them. When the AI has to reason about names as just one piece of a larger design, its attention to detail can slip, and many of the same assumptions that humans make quietly return.
To show this, let’s ask the following prompt:
Design an API for an application where users can talk about books they have read.
Each user has their own account, with their name, personal details, and a history of books they have read and discussed.
The API should support typical operations such as user registration, profile management, posting comments, and viewing discussions.
Return the API design with data models and example endpoints.
This gives us, among other things, the following user data model:
{
  "id": "uuid",
  "username": "string",
  "email": "string",
  "displayName": "string",
  "bio": "string",
  "avatarUrl": "string",
  "createdAt": "datetime",
  "updatedAt": "datetime"
}
The resulting data model differs quite a bit from the previous example. The name representation is much simpler: only the displayName and username fields serve that purpose. Notice that there is no separate handling for multiple given or family names, prefixes, suffixes, nicknames, or locale-specific formatting. This is not always a bad thing; having no separate given and family name fields prevents cultural and locale-specific assumptions about names from creeping into the representation. Nonetheless, it is clear that when names are not the main focus, the model abstracts a lot more.
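If names did matter in this application, a reviewing engineer could ask for the earlier ideas to be folded back in, for example by nesting an optional, richer name object next to the bare displayName rather than replacing it. One possible shape, reusing the PersonName sketch from earlier (again, the field names are illustrative, not something the model produced):

// Hypothetical revision of the user record: keep username/displayName for the
// UI, but let users optionally record their name in a less lossy form.
interface UserProfile {
  id: string;                  // uuid
  username: string;
  email: string;
  displayName: string;         // what other readers see next to comments
  name?: PersonName;           // optional richer name object (see the sketch above)
  bio?: string;
  avatarUrl?: string;
  createdAt: string;           // ISO 8601 datetime
  updatedAt: string;
}

Keeping the richer object optional means the common case stays simple, while the API no longer forces every user’s name through a single flat string.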
While these are very basic examples, they highlight how much the choice of prompt can shape the structure of the code, even with the myriad examples and domain knowledge an AI model has at its fingertips. Of course, any potential design issues could be resolved by further prompting or refinement, but this also underscores why human judgment remains important. A knowledgeable software engineer in the loop can push back, ask the right questions, and name the issues as they arise, ensuring the software is built with the right abstractions.