I have a suffix (the Roman numeral Ⅳ), and it causes all sorts of problems. Some sites will have me "prove" that I'm me by asking questions about "my" credit history, and very often I'll get my father's. Half the time I've already supplied an SSN… which makes it even more appalling that they can't get this right.
I've also been issued a driver's license for a "4TH". I have no idea how TSA would ever spot a fake. (Since they don't flag me! But I'm also in a demographic that tends to get passes in the security theatre…)
My wife is from Myanmar, where most people only have one name (no first, last, etc.), and she has experienced endless frustration since immigrating here to the US. It's been quite difficult for her and even limiting in some ways, and she's broken down in tears more than once.
Some places just put a few letters into each field (like say for the name Jessica, first: Jes, middle: si, last: ca, or something like that). The DMV did that, and then listed her name on her license as <last>\<first> <middle initial>. Others have insisted on putting "nosurname" in the last name field. The immigration people put "FNU" as first name, and her given name in the last. At some places she's put her name twice, once in the first, and once in the last.
Not anything close to what she's had to endure, but my name doesn't fit the standard mold either. I prefer to use my middle name and really dislike using my first name (shared with my dad, who I don't have a good relationship with). I'm endlessly having to explain every single stinking time I interact with pretty much anyone.
Anyway, the takeaway is, please please please (!!!) don't make assumptions about people's names! Ideally just one field labeled "name", and let the user interpret that as they see fit. If you need to collect a legal name then you need to validate it anyway. If you really must do first name / last name then at least make the last optional and also include a field for "what should we call you" or "nick name" or something.
> I prefer to use my middle name and really dislike using my first name (shared with my dad, who I don't have a good relationship with). I'm endlessly having to explain every single stinking time I interact with pretty much anyone.
This sounds dramatically overstated. Going by your middle name is normal and doesn't require much in the way of explanations. For example, my father goes by his middle name. This has led to "problems" all of one time -- when he worked for the military, they insisted on the first name. So, during that period, he used the first name.
It really depends on where. Some places you'll have no trouble and others you will. Whether or not you'll have problems depends on what you want to do.
I've had the opposite problem in Japan. Virtually everywhere insists that I use the name on my (Canadian) passport as my official name. It's listed as "Lastname Middlename Firstname". Some government offices can handle 3 names (Yay). Some government offices understand the order for Canadian names and since they often can only handle 2 names, will record my name as "Lastname Firstname" (in Japan, family name comes first). Other offices don't understand the order and assume that my single given name has a space in it. Since they can't handle spaces in their software they list my name as "Lastname Middlename". No amount of explanation will dissuade them as they have to do it the way they've been told.
So now I've got 3 official versions of my name in Japanese official databases (4 if you count the version that is truncated because my name has too many characters). Luckily none of their systems talk to each other ;-) -- though I had one heck of a time getting my "My Number" (similar to social security number) registered because of the confusion. I feel sorry for any Portuguese people (who often have a lot of names) who live in Japan ;-)
The situation in the US is not nearly so bad, but I've definitely heard of problems before.
I know it's a minor annoyance, but it's something that jades me every single day and gets really old after a while. I feel companies and websites should do a better job of being respectful. A "what should we call you" field is simple and easy, and would make the service much more friendly to some people.
So do you dutifully put your real first name into all forms? I know people who always go by shortened versions of their name (e.g. Rob instead of Robert). They just put in their preferred name as their first name for everything (uni, bills, etc.)
You can't blame companies for using the name they give you. Are you worried about being accused of fraud or something?
It's not something Burmese women are used to doing, and if you think about it it's kind of a weird holdover from a much more patriarchal time. But, if she'd known how difficult it would be I think she would have anyway.
Tons of immigrants at Ellis Island did not have last names, and were assigned them on the spot, or made them up on the spot. Sometimes with humorous results.
I imagine it would be extremely difficult to operate in the US without a last name. She doesn't have to adopt yours obviously, but she could pick one.
For example Osama Bin Laden. Bin is not his middle name, and Laden is not his last name. It means Osama son of Laden. But his relatives in the US use "BinLaden" as if it were a last name.
Quite the opposite, IMNSHO: "I don't like people using + in their emails, ban that! Everyone should have an address [a-z]{3,}@[a-z]{3,}\.[a-z]{2,4} - anything else is HERESY! You gotta make accommodations, else you'll break my overly restrictive assumptions."
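For anyone who wants to see how little that pattern actually lets through, here is a quick sketch in Python. The sample addresses are made up for illustration, and `strict_accepts` is just a hypothetical helper name.

```python
import re

# The deliberately over-strict pattern quoted in the comment above.
STRICT = re.compile(r"[a-z]{3,}@[a-z]{3,}\.[a-z]{2,4}")


def strict_accepts(address: str) -> bool:
    """Return True only if the whole address matches the strict pattern."""
    return STRICT.fullmatch(address) is not None


# Made-up sample addresses; all of them are plausible real-world shapes.
samples = [
    "alice@example.com",        # fits the pattern
    "alice+lists@example.com",  # plus-addressing
    "al@example.com",           # two-letter local part
    "alice@mail.example.com",   # subdomain
    "alice@example.museum",     # TLD longer than four letters
]

results = {address: strict_accepts(address) for address in samples}

for address, accepted in results.items():
    print(f"{address}: {'accepted' if accepted else 'rejected'}")
```

Only the first address passes; the other four are rejected even though nothing is wrong with them, which is the same failure mode as a name form that insists on exactly one first and one last name.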
You mean like when you move to a country and are oppressed into having to write addresses on envelopes in the way the local postal service expects it?
If email systems were a national thing, and I'd move somewhere where they didn't accept "+" in email addresses for, say, official government communications, then .. yeah, I'd have to get a new email address.
Personally I find it much more annoying if people mispronounce my name (which happens in a personal face-to-face setting) than if I had to write my name down in a certain way on a form, to make it work with the system. It's just an identifier for the system. I wouldn't get the same SSN or phone number either--which is just about as silly to expect.
It would be nice if they had standard ways of dealing with names that do not fit in such a system, so that at least the variant to make it work would be the same everywhere else. But to demand it to be taken verbatim and work correctly in the system, that's like moving to another country and demanding you keep the same land line number.
If there was a specific postal service for each building, each with their own set of requirements, you mean. And each, of course, insists that theirs is the One True Way. Yeah, that's a nice world to live in: change yourself ten ways from Sunday to fit into various, mutually incompatible and completely arbitrary rules; all so that the original developer making the rules up on the spot would have an easier job.
Presumably, if she wanted to change her name to avoid these problems she'd have done that independently. Most people would not want to change their name because of badly programmed systems.
Also, many people (and many cultures) consider adopting your husband's surname to be weird.
I have the suffix III, which often gets written iii and sometimes gets attached to my last name, so they just say it with an accented "ee" sound on the end. There have even been other variations, but those have all been one-off exceptions.
Even at the government level, identity isn't solved. I have two government issued IDs, both current, that contradict each other and once barred me from getting on a flight. I also have to visit the IRS in person regularly to confirm my identity. I had multiple issues with standardized tests when I was younger, including one instance where they both wanted to delete my score because they thought I couldn't fill out my name correctly and they also wanted to give me an award (which ultimately made me a National Merit Scholar) - the best part is because of the confusion on my name they had a hard time figuring out whether or not they were supposed to give it to me (as in they thought I both did and did not exist in their system due to the naming error).
The above are just a handful of examples, there are many more and those aren't even necessarily the most extreme. The only thing I know for certain is my children will not share my name, or have any weird shit attached to it.
I have a hyphen in my last name that caused the California DMV to make one of my last names a middle name, and the Social Security Administration can’t verify my name on their website also due to it.
I feel like it’s time software needs to level up, ok 30 years ago sure mistakes were made, but now if you live on planet earth you have to know how names work after how many thousands of years our current systems have been in place.
Part of the issue is people writing software making decisions about 'how names work', and there being multiple interpretations of it. I've wanted - many times - to build in just 'name' in systems, vs "first name", "last name", "middle name", "suffix", etc. Because... inevitably, clients have to support someone that doesn't fit that mold. The end user probably has dealt with it dozens of times already, but it's still bad for them, and usually unnecessary. MOST of the time, we only ever take "first" and "last" and concat them on the screen anyway, then keep them separated for someone to sort via Excel...
It's actually interesting that frameworks provided by platforms such as .NET, Java, etc., don't include an abstraction for the representation of names.
Such abstractions exist for dates, times, calendars, currencies, calculations with money, and so forth, but not names.
On the one hand I can understand it, because names are so complicated, and how would you sit down and come up with something good enough to represent all of them?
On the other hand they're prevalent in such a high percentage of line of business and consumer facing apps that it's almost ridiculous that every single developer on the face of the earth at one time or another has to come up with their own half-baked implementation.
It's especially ridiculous when you consider that so many of these home-rolled implementations, if not all of them, are rife with terrible flaws that constantly cause frustration and inconvenience to a small but significant number of users.
This is a solved problem from a modeling standpoint. The HL7 Reference Information Model allows any entity (such as a person) to have multiple names. Each name can be tagged with a type (legal, maiden, alias, etc) and validity date range. A name can contain multiple parts in any order, optionally tagged as prefix / suffix / family / given. Names can also be explicitly marked as null if unknown or not assigned. There are open source RIM implementations in several languages.
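A rough sketch of how that shape might look in code, purely as illustration. It is not taken from the HL7 specification: the class names, field names, and the sample person below are all invented here, and a real RIM implementation will differ.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class PartType(Enum):
    PREFIX = "prefix"
    GIVEN = "given"
    FAMILY = "family"
    SUFFIX = "suffix"


@dataclass(frozen=True)
class NamePart:
    text: str
    type: Optional[PartType] = None  # tagging a part is optional


@dataclass
class Name:
    parts: List[NamePart]               # kept in the order the person gave them
    use: str = "legal"                  # e.g. "legal", "maiden", "alias"
    valid_from: Optional[date] = None   # None means "no known start"
    valid_to: Optional[date] = None     # None means "still in use"
    null_reason: Optional[str] = None   # set when the name is unknown or not assigned

    def display(self) -> str:
        return " ".join(part.text for part in self.parts)

    def is_valid_on(self, day: date) -> bool:
        after_start = self.valid_from is None or self.valid_from <= day
        before_end = self.valid_to is None or day <= self.valid_to
        return after_start and before_end


@dataclass
class Person:
    names: List[Name] = field(default_factory=list)

    def names_of_use(self, use: str) -> List[Name]:
        return [name for name in self.names if name.use == use]


# Hypothetical person with a single-part legal name and a later alias.
jessica = Person(names=[
    Name(parts=[NamePart("Jessica")], use="legal"),
    Name(parts=[NamePart("Jess")], use="alias", valid_from=date(2015, 1, 1)),
])
```

The point of the sketch is that a one-part name needs no special casing: it is just a `Name` with a single untagged part, and nothing forces it into a first/last split.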
Interesting; I didn't think I'd see an HL7 reference in this thread. I work somewhat with FHIR, which also has a HumanName[1] data type, and I think it handles most of the cases in this thread.
For those not familiar: FHIR is a standard that covers health and patient data. IMO, it's a pretty good model. (HL7 is the organization, and there are a few other standards under it.)
I'm less familiar with RIM; could you link to its definition of a name? (The best I could find suggested that it was nothing more than an unconstrained piece of text.)
Unfortunately due to the way the HL7 web site is structured there's no way to give a direct link. Go to the Normative Edition, Foundation, Data Types, Basic Types.
The FHIR data model is a little simpler to allow for easier implementation. In the vast majority of real world healthcare use cases it works well. But from a modeling standpoint if you need to cover odd edge cases it sometimes helps to look back at the old RIM.
What's the purpose of structured modelling of names? Why does it matter which parts are family vs given vs whatever else? Lots of the W3C International Examples they give use a `<text>` field. Why not just use that?
Is this a current use case, or did that just make sense 100 years ago? (As in these assumptions: "tracking actual relationships in data is Hard, and family name correlates strongly with real-world relationships") I mean, it sort of works even today, but neither of the assumptions are as strong any more.
A dedicated, well-maintained name abstraction is certainly something that needs to happen. More than interesting, it's a bit bizarre that this hasn't been done yet (AFAIK.)
In terms of developer-facing complexity, this could be a laughably simple thing to use- just a type that supports equality, perhaps ordering, and conversion to string. Only the constructor would need to be complex. :)
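To give a feel for how small the developer-facing surface could be, here is a minimal sketch. `PersonName` is a hypothetical class, not an existing library type. It covers equality and string conversion only; ordering is left out on purpose, since a sensible sort order needs culture-specific knowledge this sketch does not have.

```python
import unicodedata


class PersonName:
    """Hypothetical opaque name value: keeps what the person typed."""

    def __init__(self, text: str):
        cleaned = text.strip()
        if not cleaned:
            raise ValueError("a name must contain at least one visible character")
        self._text = cleaned

    def _canonical(self) -> str:
        # NFC makes composed and decomposed forms of the same characters compare equal.
        return unicodedata.normalize("NFC", self._text)

    def __eq__(self, other):
        if not isinstance(other, PersonName):
            return NotImplemented
        return self._canonical() == other._canonical()

    def __hash__(self):
        return hash(self._canonical())

    def __str__(self):
        # Display exactly what was entered (minus surrounding whitespace).
        return self._text


composed = PersonName("Jos\u00e9")      # "é" as one code point
decomposed = PersonName("Jose\u0301")   # "e" followed by a combining accent
same_person_name = composed == decomposed
```

Here `same_person_name` is `True` even though the two strings differ byte for byte, which is about the only kind of "cleverness" that seems safe to apply to a name without asking its owner.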
I guess the reason this hasn't been done is simply that the implementation would never be "correct"- there is no formal specification of human names out there, and there would always be cases where some poor individual with an unusual name falls afoul of the system. Strictly fewer cases than we have today, where everyone rolls their own name system, but still some; it's not a solvable problem the way timezone conversion is.
But, on the other hand, that's the way the world is- messy. Developers are going to have to learn better best practices for communally doing our best in cases where there is no perfect answer, because there's only going to be more of such cases as tech continues to eat the world.
I used ML techniques to help smooth over some of the difficult parts (there are many difficult parts). The hardest cases are ambiguous names, for instance delineating Hispanic vs. Puerto Rican naming conventions (they're different). The fundamental approach involved pushing all ambiguity up to the end user, so they always have the option to correct the system.
I’m pretty sure it’s solvable, the main problem is that we break up names into first and last to identify the parts and we do bad data quality checks. Let’s say we did just have one field and a service that was trained on each country’s variations that could return the parts of the name you wanted. So some database and detection system to understand the pattern. It’s definitely possible since we humans do read and understand names just fine in our own locales.
I'm not convinced it is solvable. I don't think the general case is reliably solved even by humans.
E.g. is Carlos the same person as Karl?
Well, that depends. Was one of them localized, or are these the actual given names. Just this weekend I was on an offsite and saw a Spanish book about Karl Marx, or Carlos Marx as they had written it in the title, in the library of the house we stayed in.
Clearly in this instance the names are the same, but that requires knowing that Carlos Marx maps to Karl Marx and that Karl Marx is a famous name; otherwise you can't assume the name was translated.
There were many other names on books in that library. I don't know which one of them - if any - also maps to someone known under another name, because that requires me to know which person they are about.
Is Curt, Curth, Kurt the same person? My uncle had all three on different documents, and delighted in telling people about it.
What about countries like the UK, where there is no legal requirement to notify anyone of a change of name, and where the legal way of formally changing your name - a "deed poll" - is just a document structured in a certain way where you assert that you are known under a certain name? My ex is known under at least three different name combinations, all of which are present on different sets of legal documents.
Some subsets of the issue are solvable, but for example there is no way of taking a full name and returning the "name this person prefers to be known by" because the name does not contain that information. You can make a pretty good guess.
But you'll fail dramatically for people from different countries. And don't think for a second you can guess correctly based on where a name is from - many names are used in different countries, and often as different elements (e.g. firstname one country, lastname another; feminine name one place, masculine another), and many people have names that combine different nationalities (e.g. my son has a name that combined an English firstname, a Nigerian middle name and a Norwegian last name).
The only reliable solution is to not assume any one single string can be used as a generic name - you need to ask what to use within a given context and within whatever constraints you have.
I've in the past written stuff that generated an index of names; this was sorted. While you can certainly sort on a free-form text field, the culture here is that name indexes are generally sorted by family name. So, to do that, you have to have some understanding of what the family name is, which a free-form text field does not give you.
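A small sketch of the difference, assuming each person supplies a separate "file me under" value alongside the free-form name. The `Entry` type and its field names are hypothetical, and the sample names are only illustrative.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    display: str  # full name exactly as the person wrote it
    sort_as: str  # what the person says the name should be filed under


entries = [
    Entry("Karl Marx", "Marx"),
    Entry("Sverre Haraldson Bjerkeli", "Bjerkeli"),
    Entry("Eiríkur", "Eiríkur"),  # a single name is filed under itself
]

# Sorting the free-form text alone: predictable, but not a family-name index.
by_text = [e.display for e in sorted(entries, key=lambda e: e.display.casefold())]

# Sorting on the supplied key gives the index readers here would expect.
# casefold() is a crude stand-in; it is not locale-aware collation.
by_sort_key = [e.display for e in sorted(entries, key=lambda e: e.sort_as.casefold())]

print(by_text)      # ['Eiríkur', 'Karl Marx', 'Sverre Haraldson Bjerkeli']
print(by_sort_key)  # ['Sverre Haraldson Bjerkeli', 'Eiríkur', 'Karl Marx']
```

The sort key is something the person supplies, not something derived from the display string, so no parsing of the name is involved.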
But a lot of software has no need for the breakdown, and would be better served by a free-form field.
Then at least be upfront with the user about why you're asking, because they might well answer differently (e.g. include a different number of parts of their last name) if they know your purpose is to sort the name than if they think it's being used for a different purpose. They might even give totally different names.
That's the most important part of the comment above: The concept of a name is so overloaded that unless you ask about the string to use for the specific purposes you intend to use it, then there is very little you can do with it.
Which is great until the client asks for a friendly name (given name) identifier. GitHub uses name, we use first/last name. So we just shove the GitHub name in the first name spot and ask people to organize what looks right.
Our partners suggested string.split(' '), which produced interesting results against the sample list of github users.
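A quick illustration of why that goes sideways, using made-up sample names. Treating the first and last pieces as "first name" and "last name" is an assumption about what the partners had in mind, and `naive_first_last` is a hypothetical helper.

```python
def naive_first_last(full_name: str):
    """Split on single spaces and treat the outer pieces as first/last."""
    parts = full_name.split(" ")
    return parts[0], parts[-1]


single = "Jessica".split(" ")
double_space = "Mary  Ann Smith".split(" ")   # note the two spaces
three_part = "Ludwig van Beethoven".split(" ")

print(single)        # ['Jessica']
print(double_space)  # ['Mary', '', 'Ann', 'Smith']
print(three_part)    # ['Ludwig', 'van', 'Beethoven']

print(naive_first_last("Jessica"))               # ('Jessica', 'Jessica')
print(naive_first_last("Ludwig van Beethoven"))  # ('Ludwig', 'Beethoven')
```

A single name ends up duplicated into both fields, a stray double space produces an empty piece, and anything in the middle is silently dropped.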
Use two fields: name and display name. Anything else will break for someone. In most cases where something asks for my name it's really not even necessary and certainly not necessary to split it into first/last/display/friendly etc.
This is not sufficient if you’re going to localize to languages other than English. In some languages, proper names get declined like other nouns and thus change spelling in different contexts.
Are there examples where splitting it in First/Last name (or any other split) would help with that? It seems like that would always either be a problem or something to design around while localizing.
It’s fundamentally language specific, so any comprehensive solution is going to need to interface with the localization system. Really, names should be keyed by (user, language, tag) triples, where the localization defines the acceptable tags based on language requirements. For example, a single person may need their name stored as:
en:disp Eric
is:disp:nf Eiríkur
is:disp:þf Eirík
is:disp:þgf Eiríki
is:disp:ef Eiríks
Designing a UI to collect this information is left as an exercise for the reader.
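The storage side, at least, is simple. A minimal sketch of the (user, language, tag) keying, reusing the values listed above; the user id `"u1"`, the `lookup` helper, and its fallback rule are assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

NameKey = Tuple[str, str, str]  # (user id, language code, tag)

# "u1" is a hypothetical user id; the values are the ones listed above.
names: Dict[NameKey, str] = {
    ("u1", "en", "disp"): "Eric",
    ("u1", "is", "disp:nf"): "Eiríkur",
    ("u1", "is", "disp:þf"): "Eirík",
    ("u1", "is", "disp:þgf"): "Eiríki",
    ("u1", "is", "disp:ef"): "Eiríks",
}


def lookup(user: str, language: str, tag: str,
           fallback_language: str = "en",
           fallback_tag: str = "disp") -> Optional[str]:
    """Return the stored form, falling back to one default form if it is missing."""
    exact = names.get((user, language, tag))
    if exact is not None:
        return exact
    return names.get((user, fallback_language, fallback_tag))


print(lookup("u1", "is", "disp:ef"))  # Eiríks
print(lookup("u1", "de", "disp"))     # Eric (no German form stored, so the fallback is used)
```

The fallback is the design choice that matters: a user who never supplies the inflected forms still gets addressed by something they typed themselves.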
Since asking a user to enter every possible variant is unwieldy and without doing so, the problem is unsolvable, I’ll stick with my original suggestion of using a single field. Yes, some names will break, but far fewer than if first and last names are required.
Eh, I understand it. They want to put some name in the identity control in the header. Putting the full name in is guaranteed to go wrong. We might start asking for a nickname for that purpose.
Oh yeah that touches a nerve. I hate it when I have filled in my name and email somewhere, maybe I forgot doing it, and I receive a newsletter, which opens with and greets me with just my first name.
No. Just because I signed up for your mailinglist doesn't mean we're on a first name basis! I hardly even remember your business exists, do you remember me? Anything else about me except my email and my full name, from which you deduced my first name? Then let's not pretend we're buddies.
This of course depends on how "friendly" I'm willing to be with said business. Which differs if it's an Etsy store, ordering food online, my bank, insurance, etc. I especially hate it when the news letter is in fact 99% ads and promo babble, but has this 1% of useful info that I want to be kept up to date on. We're not close, I'm letting you spam my inbox, call me "your grace" or something.
Can you actually go wrong with just using someone's full name, and erring on the side of being a tad too formal? Is this just a problem with marketing companies that want to "connect" and become "buddies"?
Which isn’t really a problem you can solve. If names change in the context that they’re used, it will always be broken, so why break more names by trying to be clever?
A single field breaks fewer names than forced first and last names, though, and is a simpler implementation too. Plus, as long as you accept any input (besides blank, I guess), then the only way it will be broken is during display and at least the user sees the exact name they typed in, exactly how they typed it in.
A single field also breaks the expectation that people do not get called by their full names in every interaction. This is a very common expectation, and violating it makes you sound subtly more like an evil robot.
Is this a lesser offense than mangling a name that doesn't cleanly split into first/last? At the individual scale, probably.
The impact, in aggregate, on UX/sales/utility? Could definitely go either way depending on your userbase.
Actually I think it should be ok, because you’d enter the name in the nominative case and then when you write it on the screen you’d decline it based on the language you’re displaying it in, which would be the same for every name regardless of its origin.
Tried that once. Horrible, terrible, no good idea. (The only rule you can be sure of is "there are countless exceptions, and exceptions from the exceptions", everything else is a minefield in a quicksand) Asking for "how should we address you" is far easier, even if a few users fill in "Your Galactic Imperial Majesty".
Usually the name would change according to the full rules of noun inflection in whatever language. In Latin, a noun has 6 cases, of which vocative (indicating direct address to the noun) is one.
Because it’s extremely diverse between cultures and countries how names work. Here in Germany it’s typical to have several first names and it’s legal to use any of it, even though it might be just the name of your godfather/godmother.
The idea that a person's name should be parsed and managed by software is amusing to me. How about just getting rid of concepts like "last name" and "first name" (which already embed a lot of cultural assumptions), and only ask for a "full name"? In some countries people don't have both first and last names. In some countries the last name customarily comes before the first name. In some countries the structure of names is more complicated and the son's name includes a copy of his father's name. I don't think software will really handle all these oddities correctly, given that just a single parent can undermine all the system's rules by choosing an unconventional name for their children.
For what it's worth, in Singapore, where there are significant Indian, Chinese, Malay ethnicities but also highly westernized, the government identity card provides just a single full name. Parents can choose their children's names in accordance with their culture—or not. You can put your first name before your last name, after it, or surrounding it. Or include your father's name if needed.
By ignoring the structure in data acquisition phase, you just postpone decisions about structure to data processing phase, now without necessary information about the structure (which could be obtained in data acquisition phase).
For example, such basic functionality like changing sort order between given name and surname would be much more complicated.
Besides, the whole story is about the problem that name including title was in one field and later processing (title removal) misbehaves due to insufficient information.
It sounds like all you really need to handle names reliably is to ask for the entire name in one field, then have another field for their preferred name (which could be the first name, or the middle name, or a diminutive). And if you need to do something more formal with a title (like Mrs Lastname), potentially have that as a third field.
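As a sketch, that three-field idea might look like the following. The `Contact` class and its field names are hypothetical, and falling back to the full name when nothing else was given is one possible choice, not the only one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    full_name: str                        # whatever the person typed, stored verbatim
    preferred_name: Optional[str] = None  # "what should we call you"
    formal_address: Optional[str] = None  # e.g. "Mrs Lastname", supplied by the person

    def greeting_name(self, formal: bool = False) -> str:
        if formal and self.formal_address:
            return self.formal_address
        # Nothing is derived from full_name: use the preferred name if given,
        # otherwise fall back to the full name exactly as entered.
        return self.preferred_name or self.full_name


rob = Contact("Robert Smith", preferred_name="Rob")
jessica = Contact("Jessica")

print(rob.greeting_name())      # Rob
print(jessica.greeting_name())  # Jessica
```

Nothing here ever tries to guess a first name out of `full_name`, so a single-name user and a middle-name user are handled by the same code path.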
Sometimes the dumb solution is better than trying to be clever, and it saves some trouble with localisation.
> Sometimes the dumb solution is better than trying to be clever
It's astonishing how often this turns out to be true, which has been probably the single most important lesson of my career. I think it's that clever solutions tend to depend on more assumptions, which rarely have P(true) = 1.
Also hard to decide which is the surname to use with some names/cultures.
My native country, Norway, went through an assimilation period of standardising surnames a few hundred years ago. Before that your name often was in 3 parts:
First name(s) - father's name - farm/manor/village.
So names were something like "Ivar Ragnarsson of Torp" or "Sverre Haraldson Bjerkeli". (With the -son bit to say whether a son or daughter).
With assimilation into standard more Continental Christian Danish society and most likely standard registration for tax - people dropped either the farm name or the father's name in their names. And froze the father's name in the surname in future generations. And changed the -son to a more Danish -sen for all genders. So, since the 1700s people have just 2 parts to their names. Unlike Iceland which has kept the naming tradition.
However,... what is common again today is to have 2 surnames. One from each parent. Unhyphenated. Similar to the Spanish convention (first-name - father's surname - mother's surname) but not as standardised, and mostly opposite order with father's surname at the end being the official family surname. And that makes internationalised computer systems so complicated.
My children have both our surnames, both by choice and necessity so either of us can get through passport control with them. (mother's surname - father's surname). But they had to have their surnames hyphenated to be able to register their births and British passports. Which still angers me today as my family convention of the latter surname being the main one is now mostly ignored.
People are free to choose what they want but you mainly keep only the "main" surname from each parent.
I know in England there was a tendency of people keeping both names of powerful families[1][2], then as double-barreled surnames. Which then sometimes went a bit nuts a few generations later if they married into other double-barreled families [3].
I think it was when I visited Stowe School, the seat of the Dukes of Buckingham and looked at the family tree, that I even saw some surnames repeated if they married into other families which shared one of their multi-barreled surnames...
I don't know about the OP, but in Quebec this is fairly common. Usually you have 2 surnames until you turn 18 and then you choose one of them. I really like the idea of this, personally.
Myanmar is another one, almost everyone has just one given name unless they are specifically following the western or some other style (and that's rare in my experience). The last name field should always be optional at a minimum.
> "First" and "last" is a wrong representation anyway, because it assumes people always write their names in that order.
I do not see this as a problem. If I know English well enough to fill in an English-labeled form on an English webpage, I would also have enough cultural knowledge to translate first name to given name and last name to surname.
> But this has it own issues, like assuming people have either a given name or a surname in the first place.
This is only a problem if the form validates that both fields must be non-NULL. Problem is not with the split itself but with the validation code.
Worked on some business software previously, and customers insisted on first/last or first/middle/last, despite the fairly obvious issues. They also demanded address fields in a US style despite needing to support international addresses (I still have no idea how their staff handled that).
People want to follow the conventions they know, even apparently if they're told it will cause issues.
Yet in reality, nobody actually needs a name that's sorted by surname: that's a holdover from paper phone books. We have search, and we have stable sorting algos. Every requirement "sort by surname" I've ever seen turned out to mean "sort the names in a predictable way, btw this is the way we always did that, because we always did that."
(Yes, familiarity is a part of UX; but do note that this one specifically is a historical, not intrinsic, motivation)
We used to have problems with non-ASCII chars in names, we fixed that with UTF, we had problems with currencies and numbers, we made libraries that understand locales and even directions of writing, time zones same thing. So it’s time for us to resolve names now with standard libraries that have been thought through like the above.
While I'd probably run in to edge cases, it would be nice to actually point to a standard and say "the libraries all support standard XYZ. that's built in - doing it any other way is going to mean problems ABC and cost $$".
Credit card forms handle this perfectly. There is just a name field. It seems odd that this is perfectly acceptable for financial companies that in essence loan billions a year but it's not ok for anyone else.
I have a similar situation, it's so frustrating. I couldn't buy insurance from Progressive because Lexis Nexis thinks I'm my father and it was too much of a hassle to resolve. Simply because I'm a Jr and we lived at the same address (obviously) for about 18 years.
I still have problems with this because I have the same name as my dad. The worst was when my bank account was locked because his boss at the time had been taking taxes off their cheques but not paying the government. For some reason they called my bank and locked my account thinking I was him (at the time they thought he wasn't paying taxes, it was all worked out eventually) despite our birth dates being nearly 30 years apart.
It happened without any explanation. I had to go to the bank and ask why I couldn't use my debit card. They explained the government had a hold on my account. I ended up having to prove I was not my dad with a bunch of pieces of ID. It took like half a day to get access to my account restored. I was 20 or something at the time. There was no way I could be mistaken for someone in his 40's.
>There was no way I could be mistaken for someone in his 40's.
But when designing automated systems, you can't add in a condition of "unless person doesn't look their age". If one can't use names as a unique identifier, then that leaves a number issued at birth by a central government. But as I understand, even SSNs aren't unique and get re-used.
As an aside though, cultures that use the same name for multiple people in the same family confuse me. To me, the purpose of a name is to identify, so what is the point of naming someone the same? Perhaps, historically, it was a way to establish credibility before the time of credit reports and phones.
SSNs are not (yet) reused. There's ~900 million potential SSNs, we've run through half, and are using ~5.5 million a year, which gives us at least another half a century before we have to start reusing.
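The back-of-envelope arithmetic, using only the rough figures quoted above (they are the commenter's estimates, not official numbers):

```python
# Rough figures from the comment above; none of these are official numbers.
total_ssns = 900_000_000     # "~900 million potential SSNs"
used_fraction = 0.5          # "we've run through half"
issued_per_year = 5_500_000  # "~5.5 million a year"

remaining = total_ssns * (1 - used_fraction)
years_left = remaining / issued_per_year

print(f"{remaining:,.0f} numbers left, about {years_left:.0f} years at the current rate")
```

That works out to roughly 82 years, which is consistent with "at least another half a century" as long as the issuing rate does not grow much.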
In some Irish families (mostly older people now), people had a Gaelic derived name and a legal English name, because the authorities banished Gaelic names. When you do genealogy, it’s very difficult to track people in certain circumstances as spellings and references change.
That happens in areas of Brazil that had German immigrants. They had their German-style names, but at one point, Brazil forced them all to adopt Portuguese-style names.
Normally when I discover things like that which require some interaction with bureaucracy at some level, I find out on a Friday afternoon at 4pm, have to wait until Monday, then things go unresolved for days because people are 'backed up' on Monday...
I actually wrote an API for handling personal names, because software mangling people's names irked my pedanticism. The fundamental takeaway is that names are ridiculously complex, equivalent to any other part of natural language. For every rule you could contrive there are exceptions, and many more legitimately ambiguous cases.
You shouldn't ever use first/last name fields, because they force users to adapt around your system (many names don't follow this structure). A long unstructured text field is best, because it can accommodate nearly anyone whose name can be spelled with Unicode. Finally, always check your interpretation of a name with the person in question, seeing as they're the end authority.
I also have a roman numeral suffix and experience the exact same issues all the time, it's insane! I understand what my parents were going for with the suffix but that tradition will end with me if I have kids.
The credit agencies also started attributing debt to my name from someone with a very similar name and social. Of course they don't like paying their loans and so every once in a while my credit tanks while I go through the process to get it cleared up. As far as I've been able to find there is no long term solution to this, I just have to deal with it every couple years.
My first name differs from my sibling's by two letters (and same last name), and I often get credit history "verification" questions about him. I mean, they're two very different names, and whatever BS company it is that's behind this is complete crap.
The people I bought my house from have the same last name as me. So every time I do one of those, it asks me questions about them. I know none of their information as I never met them.
My wife has the same maiden name as her mom, with a middle initial that is one letter apart.
This flags more corrective correlation, so despite having a different last name for 19 years, insurance companies and Bank of America get them confused. The insurance is worse because you cannot appeal insurance data.
I have a suffix as well (II), but I stopped using it years ago. I have one old credit card that has my full name including suffix on it that I have to remember to include the suffix with, but that's about it.
Maybe it would help to spell out II as "The Second" or IV as "The Fourth"? People with hyphenated or apostrophed last names could even spell out their punctuation too. But then how do you standardize on the spelling of punctuation?
FORTH solved the punctuation spelling problem by systematically documenting how every word (including punctuation) is pronounced, so you could unambiguously speak FORTH programs over the telephone.
I had a friend in college named jane smith, capitalized as so—that is, not capitalized. She always entered it this way and it always appeared to be correct until the next nightly database cleaning job ran and gave her initial capitals. Eventually the registrar told her that nothing could be done but that a note was made in her file to ensure her diploma was printed properly. Of course it wasn’t, but they did re-print it free of charge without complaint.
I also have a numeral. Sometimes as a child, when I went to the local hospital they would initially pull up my father's information instead of mine. I'm not sure how they thought that could be correct, given I was a child in the 2000s and my father was born in the 70s.
Amr Eladawy has personally contacted GDS but been ignored. Sadly, this is one of those situations where getting a lawyer to send a letter to them probably would work. A bug that affects a group of minorities with a mention of interfering with the US Constitution right to travel with a hint of possible class action will tend to fix this type of bug. I don't like bug-report-via-lawyer but it does work.
The GDS systems have worked like this forever, and likely this won't ever change. The GDS can claim the issue is on the travel agent side, who should have written the lastname as AMRMR.
Having said that it clearly sucks and is super confusing and annoying to affected people, especially given the horrendous prices some airlines demand to change the name on the booking etc.
It seems like a small thing, and I'm hypothesizing, but I'm pretty sure no one sane will risk changing this at this point, deploying to prod and hoping that nothing breaks. There's hundreds of airlines and thousands of travel agents involved.
The rule about having a title after the name without any separator is clearly a bad design, but now you have decades of systems (and hacks) built on top of it, you can't just change it like that overnight. In the same way as you can't change how certain broken web APIs work because there's too many websites that rely on that broken behavior.
Yet another, more local and fixable issue is the airline app which doesn't allow 1-letter last names. This is the same as the apps which helpfully validate email address and reject many valid emails. The best regex for validating emails is /@/. Anything else is probably broken.
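To make the /@/ point concrete, here is a minimal sketch in Python. `looks_like_email` is a hypothetical helper, and `STRICT_EMAIL` is an invented example of the kind of "helpful" pattern that rejects real addresses; neither comes from any airline's code.

```python
import re

# Deliberately minimal: the only structural claim is "contains an @".
MINIMAL_EMAIL = re.compile(r"@")

# A typical over-strict pattern (hypothetical, shown for contrast only).
STRICT_EMAIL = re.compile(r"^[a-z0-9._]+@[a-z0-9]+\.[a-z]{2,3}$")


def looks_like_email(value: str) -> bool:
    """Accept anything with an @; real verification means sending a mail."""
    return MINIMAL_EMAIL.search(value) is not None


samples = [
    "user+tag@example.com",    # plus addressing
    "o'brien@example.co.uk",   # apostrophe, multi-label domain
    "someone@example.museum",  # TLD longer than three letters
    "not-an-address",          # no @ at all
]

for s in samples:
    print(s, looks_like_email(s), STRICT_EMAIL.match(s) is not None)
```

The strict pattern turns away all three valid-looking addresses, while the minimal check only rejects the string with no @ in it. Whether the mailbox actually exists can only be confirmed by sending a message to it.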
--
Another big issue in the industry is that PNRs are 6 characters long and sequentially generated over time, which is a security problem, as demonstrated by CCC a few years ago. If you know someone's name and that they're flying on a given date with a given airline, you can try to brute-force-guess their PNR and get their personal data or even change their reservation. Again, this is so rooted into the systems that it can't be changed without a massive industry collaboration - probably a years-long project with $$$ cost. For now the GDS have anti-bruteforce mechanisms in place, but it's not a good enough solution IMO against a determined attacker.
> given the horrendous prices some airlines demand to change the name
In practice airlines are fairly lenient concerning misspellings, out of order names, missing diacritics, etc. The name change fee applies mostly to changing the person who is flying.
I have a one letter first name and Air New Zealand refused to book me with only one letter in my name and then equally refused to change it when it caused problems later matching my frequent flier mile account on another airline without charging the ridiculous name change fee, which the airline agent at the desk said the computer system would not allow her to waive.
I'm not talking about computer systems. It is trivial to write code to handle an arbitrary UTF-8 string. I mean for his own sake, it's probably best to have a real name.
Do you even realize how rude and offensive this suggestion sounds? For most people their name is quite a large part of their identity, it's not on you (or developers of the protocols in question for that matter) to decide what does or does not constitute a "real name".
The mere fact that you're reading about these cases here should be good enough of an indication that they're indeed not seeing it as a waste of time.
But how much time would then be wasted writing/typing out all of those extra letters in their name? Then, dealing with updating every account they have. Your sarcastic response wasn't even well thought out. Developer's time is expensive to waste in a gov't office to address a bug left from their parents.
I was not being sarcastic whatsoever. I know two people who have changed their name because they didn't like the name they were born with. My suggestion is 100% dead serious.
Right. The GDS systems doing anything other than pointing the finger at a third party (like, say, fixing the bug) might be seen as admitting responsibility. In fact, this is a good way to ensure that the bug never gets fixed.
If Sabre has enough money to send Bill Clinton and Mark Ronson to Hawaii for a glorified frat party [0], it has enough money to tear through all but the best-funded legal challenges. (You'd think I'm joking. Nope.)
Just because nutjobs like to try to invoke constitutional rights doesn't mean those rights don't exist and can't be helpfully invoked. The stereotypical nutjob-invoking-constitutional-rights move is to respond to every police encounter by asking, "am I being detained?", but sometimes you're better off asking that question.
He kind of has a point, though -- I agree the right to travel does not specify a specific mode of travel and should cover planes, but then pretty much everything that goes on w.r.t. planes and cars is unconstitutional.
(Consider that while the government definitely has the right to tax you for road use, requiring you provide an ID to prove you paid a tax is like requiring you to carry an ID proving you paid your federal income tax, or requiring an ID to exercise free speech.)
"Am I being detained" is level 1 nutjob stuff, fairly normal. Incorrectly invoking a "right to travel" is advanced nutjob. It will immediately make a lawyer think you're a sovereign citizen since that's a cornerstone of their legal philosophy.
It's from the "Privileges and Immunities Clause" of the US Constitution and has shown up in many court cases. What may seem "nutjob" to you might have a whole different meaning to the lawyer you have hired to protect your company. Which is why you really want to make sure the letters that you get from lawyers are run by a lawyer even if they seem a bit off.
To be fair, we don't know if the problem is being ignored, or just isn't in the current sprint.
Think about how this would go on your team. Customer service sends in a ticket to your Product owner, complaining that a customer has a problem, and called in not only reporting it, but telling you exactly what their expected fix was. Even if the team acknowledged the problem and wanted to fix it, would it really be bumped to the #1 priority, and would the team really just take the fix from the customer instead of designing their own solution?
Or would it be triaged and prioritized, taking into account all other product and customers needs at the same time?
I'm not arguing that airline customer service is great. But a post on stackexchange venting about a customer service team's lack of communication gives us zero insights into what is really going on.
The post said that it had been working fine for 6 years, and this is a new, recent bug. Or, more specifically, that the problem's workaround is no longer valid recently because of the new bug that requires longer names.
These PNRs are stored on mainframes running software written in COBOL which has not been touched for years. The actual PNR is a flat text document with semi-structured lines as explained in the article.
We never make changes to the mainframes, we just wrap the functionality. The structure of a line cannot change as the industry relies on all airlines using the same structure. If a new "field" needs to be added to the PNR we add it as an "RM" (remark) entry.
I dunno about you, but I always liked the practice of having an on-call rotation (even if you weren't actually on call) so someone would be able to pick up minor bug fixes without them getting in the forever backlog.
My entire "legal" name is "Aaron." (And now you know what my HN name means.) I was born with the usual three names. When I was thirty, years before 9/11 and DHS, I got a court order and dropped the middle and last. Because, OK?
Never a problem, often a conversation. Sometimes I'd have to spend a little effort to be sure it said "Aaron" on my driver license. Other areas like credit cards and employment I've been more flexible, going by "A Aaron" or "Aaron Aaron," etc.
SS used to call me "Aaron." At some point they renamed me to "UNK Aaron."
All fine.
Coming on two years ago, as part of a "pivot," I got my CDL, Commercial Driver License.
Up to this point my DL said "Aaron." During training my learners permit said "Aaron."
When it came time to take the CDL road test, the CDL office, separate in some way from the DMV, would not schedule my test.
"Because there must be something in the first and last name fields on the driver license."
Without investigating, I'm quite sure that they populate some CDL office record from some DMV record, and their software was written assuming that there must be two names and assuming that the DMV records would all have at least two names. But not three or more, because the programmer or requirements writer had personal experience with people without a middle name.
So I had to go back to the DMV and get my driver license changed to something else.
It took a while, phone calls, "can't you just ..." etc.
The clerk finally agreed to list me as "Unknown Aaron." Which, note, is not my legal name, just what the DMV agreed to call me. So my legal name and "wallet" name are not the same.
Now the CDL office recognized me as "Unknown Aaron." Took my test, got my permanent CDL. Which says "Unknown Aaron."
Hired on with a company which knows me as "Unknown Aaron," because they have to use my CDL name because the feds know me that way now.
Which means my health and other insurance knows me that way.
I was tempted to go with a single name when I was forced to legally change my full name to meet ID requirements for a driver's license.
My birth certificate name is patterned like Andrew Bruce-Carlos Denis-Edward Fatherson. After my parents split, I had many variations. E.g. school knowing me as Andrew Charles Motherson, my bank as Andrew Bruce Motherson, etc.
Every official document was something different, so I had an official change to get the most common variants in a close-enough form that still fits on most forms, i.e. Andrew Bruce Carlos Motherson.
It still feels like a missed opportunity to go as simple as possible, and shed some family baggage.
Assuming what is and isn't valid for user inputs is a dangerous game because there are always exceptions.
I ran into a similar issue with many online retailers when I was living in the inner city of Mannheim, Germany because a lot of online systems make assumptions on how a valid address looks.
Addresses in Mannheim's inner city follow the format "Char Number, Number". "A1,1" is a valid address if you want to send a letter to the district court. A1 being the city block the court is located at and 1 being the house number within that block.
I didn't get to do a lot of online shopping for years when I lived there.
Simple workaround would be putting in a human-readable long string. Anything like, "Daniel's residence, A1,1". Post people know how to read.
Upd: reaction to this suggestion shows that some people don't understand how the post office operates. They go to great lengths to understand where to deliver the mail/parcel. In most cases, addresses like "big yellow house with a red door overlooking the cliff near the lighthouse" would work. So the only challenge here is to get past whatever dumb rule the service developer imposed on the address format. Likely it is just a filter by string length.
Post people know how to read, but I think now most mail sorting/routing is done with computers and OCR. I sometimes get mail addressed to people who used to live at my address but have long since moved. I tried writing “return to sender, addressee not at this address” or similar things but the mail kept coming back to me. I finally went into the post office and they said that the machines would just rescan the address and send it right back to my address for delivery. So I think relying on postal employees to see/interpret things on address labels is no longer a viable approach in many places.
No, not really. Sorting is always manual when automation fails.
In your case, automation actually didn't fail, it just didn't recognize your additional instruction. Probably, you could have just patched the address with an easily removable piece of tape and that would definitely trigger human attention, and delivery would go where it should.
My parents (living in rural Norway) once had a postcard delivered where the address given was simply their first names - no last name, no street, no town, no nothing.
Having a database in which every citizen's domicile is registered does have its occasional advantages.
Similarly, there was a story going round a few years back about mail being delivered in Iceland where instead of an address there was a map to the house to be delivered: http://i.imgur.com/1GVjLKF.jpg
I live in Japan. I once had a package delivered from overseas where their printers couldn’t print CJK fonts and thus the whole address resulted in just small empty boxes.
The post office inferred my address from the post code + my name and delivered it correctly. There wasn’t even a (noticeable) delay.
> Probably, you could have just patched the address with an easily removable piece of tape and that would definitely trigger a human attention, and delivery would go where it should
Fair point. And that was the advice given to me by the postal service.
Cross through the wrong address. Every year or two somebody from the management agency tidies the noticeboard for the building I live in and removes my hand written sign explaining how this works. Then, next September/October when lots of people move in (some fraction of the occupants are students) the noticeboard gets envelopes pinned to it with undeliverable mail. I write a fresh sign.
The sign is a flowchart, it says first, is this mail for a different address? If so, either redeliver it (duh) or write "Misdelivered" in bold letters and put it into any postbox.
If not, but you don't recognise the recipient, strike through the whole address in black pen and write clearly "Not at this address" then put it into the postbox.
This won't stop you getting more mail by the way, I still get letters labelled "Urgent" with the name of the previous owner years after I bought this place. But it does stop literally the same mail coming back since the OCR will reject the crossed out address -- it's just that the sender may not have any effective process for what to do when they get the mail back undeliverable.
Define 'accuracy' though. That's what this whole thread is about. When what you're comparing against is itself wrong, what does accuracy mean? Also, "big yellow house with a red door" etc. is in fact accurate for that place.
That's the whole issue: a random coder decides "and this is my idea what's acceptable: Google Maps/whatever finds it from the input string; worksforme, done!" without second thought or even authority to make such decision; this, an operational decision, gradually becomes doctrine, even dogma.
A friend of mine living there works around it by using „Quadrat A1 1“ (translated: square A1 1 - since most blocks are roughly a square in Mannheim) and it seems to work okay.
But the naming in Mannheim causes a lot of issues, I remember early navigation systems having a hard time with the format. IIRC a TomTom even crashed when trying to announce the street.
It's always annoyed me that the blog post above didn't contain examples, and I'm grateful that someone posted a later post that does contain examples:
This entire thread is a testament to Falsehoods Programmers Believe About Names. Shouldn't everyone by now have access to a list of "edge case" names to test their software against so these kinds of things don't get deployed? It's hard to believe it's 2019 and we are still struggling with things like names, addresses and dates.
The edge cases in this example are trivial to generate: length 1, or empty string. ("Ends with the letters MR" isn't on this list, and is awfully specific.) If you do nothing but have a text field that can contain anything at all the user chooses to type, that will work!
Add in "user X's name changes to Y on date Z" (hard to put in a list of edge case names) and you've covered 98% of these issues.
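A rough sketch of what such an edge-case list plus an "accept anything the user types" field could look like. The names are all taken from cases mentioned in this thread; `accept_name` is a hypothetical helper, and requiring at least one non-whitespace character is my own assumption, not something anyone above asked for.

```python
# Edge cases collected from the comments in this thread.
EDGE_CASE_NAMES = [
    "A",             # single-letter name
    "Aaron",         # one name only, no first/last split
    "jane smith",    # intentionally lowercase
    "O'Brien",       # apostrophe
    "Marc-Antoine",  # hyphenated given name
    "Amr",           # ends in "MR", which GDS systems read as a title
    "Müller",        # umlaut
    "Szilási",       # acute accent that changes the name
]


def accept_name(raw: str) -> str:
    """Hypothetical helper: keep the name exactly as the user typed it.

    The only processing is trimming surrounding whitespace and requiring
    at least one character (an assumption made for this sketch).
    """
    name = raw.strip()
    if not name:
        raise ValueError("name is required")
    return name


for candidate in EDGE_CASE_NAMES:
    # No re-capitalizing, no splitting, no stripping of punctuation.
    assert accept_name(candidate) == candidate
```

The "name changes to Y on date Z" case is not covered here; that is a data-model question (keeping a history of names) and not something a validator can settle.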
As a person with one name and a programming background, I've always put my actual name in the last name field, assuming that that would usually be the most significant sorting field.
Also don't assume these requirements come from developers. I've built form fields countless times and explained these rules to countless stakeholders and they always push back with "no, street address is a number" or "no, phone numbers must always conform to this format" and "no, everybody must have two names because I need to be able to sort the list by surname".
And then the system launches and the complaints start to roll in to support.
Also don't assume people can't have the same first and last name (Norwegian Air...). When their website failed to book my ticket, a customer rep told me to put NAMEMR as my first name. This then led to my ticket showing NAME MR MR. Surprisingly nobody batted an eye on international travel day, but it annoys me that their website product team decided to take it on themselves to take what I assume were Norwegian naming norms and improve on lax GDS constraints.
I have an apostrophe in my name and it causes allll sorts of issues like this. (Think Bobby Tables). To the point where I’m pretty convinced that the internet is going to wipe out apostrophes from people’s actual names. In fact I just omitted it on my most recent driver’s license.
Ouch. I have a ü in my last name from family in a different country way back somewhere. It doesn't crash systems as much as an apostrophe would, but it's very good at showing encoding issues between systems.
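For anyone who hasn't run into the Bobby Tables problem: an apostrophe usually breaks things when the SQL is built by string concatenation. A minimal sketch with Python's built-in `sqlite3` module; the table and column names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

name = "O'Brien"

# Broken approach: splicing the name into the SQL text. The apostrophe
# closes the string literal early and the statement no longer parses.
try:
    conn.execute("INSERT INTO people (name) VALUES ('" + name + "')")
    concat_ok = True
except sqlite3.Error:
    concat_ok = False

# Parameterized query: the driver passes the value separately from the
# SQL text, so the apostrophe is just another character.
conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM people").fetchone()[0]

print(concat_ok, stored)
```

The concatenated insert fails and the parameterized one stores the name unchanged, so there is no technical reason for a system to make people drop the apostrophe.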
It's not as big an issue as it used to be, at least. Before I've had online transactions failing because of a mismatch between my name (with ü), and the name on the card (with u). The systems seem more forgiving now, having handled that case or something. I also remember being a bit scared traveling to Japan many years ago, as we were told it was SOoo important that the names and everything matched to gain entry. And then the name on my ticket was completely mangled. But no one cared.
The Japanese are quite used to mojibake [1], so they would've understood immediately that the mismatch between your ticket and passport was caused by encoding issues.
Interestingly, I've had problems in Korea (Gimpo Airport) because my name contains an "ö", and the canonical spelling in the passport for this is "oe". This was cause for much confusion among the airport staff.
I would have thought that people from CJK-countries were more understanding of encoding-to-latin weirdness than most, but apparently not.
I think their understanding would be focused on the encoding for their language and a relatively narrow set of problems. I've encountered name issues in CJK countries that keep names in native encoding due to an assumption that full names fit within a couple of characters with no need for any spaces or punctuation. Some systems might be designed to be "accommodating" and take even up to 8 or 10 characters! There was one train system where my name had at least four different iterations through the tickets I collected, with different ordering of first and last names and truncating.
In defense of the Korean airport staff, they might have been more accommodating if the "ö" was completely and obviously broken, like "£‡�". Spelling it as "oe" makes it look like there are no encoding issues, in which case strict checking makes more sense.
It's much easier to identify mojibake (they tend to be extremely obvious in CJK encodings) than to remember canonical spellings and other variations in a whole bunch of different languages. Airport staff probably know that "oe" and "œ" are interchangeable, but that's about it.
Diacritics are usually stripped in air travel. In Hungarian we have many letters with diacritics, but it is never a problem that the passport has them and the system doesn't.
Not in all cases. In Germany and Finland (maybe all EU passports???) ä is spelled ae, ö is spelled oe in the machine readable part (umlauts shown in the "human-readable" part). This is important to know when you need a visa.
For Germans this is not a big problem because it has been like this forever if the umlaut is not available for technical reasons. For Finns this is a problem, because this "transcription" is completely unknown in Finnish. For a couple of weeks now it has been possible to get an electronic visa for Russia on the internet. Reportedly many Finns with an ä in their name (that's not uncommon) dropped the dots when applying for their visa, because an ä is not accepted. At the border they were not allowed to enter, because the machine-readable part of the passport has ae instead.
There is an ICAO recommendation. However, it is not unambiguous and of course it's not legally binding. So in the end every country decides what they do. (Possibly there are more multinational agreements e. g. inside EU, but I doubt there is anything truly worldwide.)
For German names, this is a problem. I have an ü in my name and this is transcribed as a "ue" in my passport. Transcribing it as u would produce a different name (which AFAIK actually exists).
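A small sketch of why "just drop the dots" and "spell it ae/oe/ue" give different results. The mapping below only covers the letters mentioned in this thread and follows the German convention described above; it is an illustrative assumption, not a copy of the ICAO table. Both function names are hypothetical.

```python
import unicodedata

# Assumed mapping, limited to the letters discussed in this thread.
UMLAUT_EXPANSION = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
}


def expand_umlauts(name: str) -> str:
    """Spell umlauts out the way the comments describe for passports."""
    composed = unicodedata.normalize("NFC", name)  # ensure single code points
    return "".join(UMLAUT_EXPANSION.get(ch, ch) for ch in composed)


def strip_diacritics(name: str) -> str:
    """Drop combining marks, i.e. what people do when they 'drop the dots'."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for original in ["Müller", "Hämäläinen"]:
    print(original, expand_umlauts(original), strip_diacritics(original))
```

The two conventions produce different strings for the same person ("Mueller" vs "Muller"), which is exactly the mismatch a strict comparison at a border will trip over.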
In Hungarian the diacritics are also important, for example Szilasi and Szilási are different and are pronounced differently. Still, it won't be an issue when flying or other stuff.
German is more complicated though with all the substitution rules.
Not to mention Germans who actually have an ue in their name, still pronounced as ü, but written as ue only, never as ü. Or someone may be called Gross, but it would be incorrect to write it as Groß, while someone else's name may be Groß with the acceptable alternative spelling Gross when ß is unavailable.
I too have an apostrophe in my name and experience the same thing. I've had people put it into their system as a comma, dash, space, and all sorts of weirdness despite my calling it out specifically.
My experience has actually improved substantially in the last 10 years or so, and most of the government systems I encounter these days actually handle it properly (as well as handling suffix properly too). That said, I've somewhat recently started having trouble checking in for flights again -- I flew last month and it took the ticketing agent >20 minutes to find my reservation on both the outbound and return flights, even despite my providing the 'confirmation code' / itinerary email (we were checking bags & flying with infants, else I'd have done online check-in).
It can be really frustrating -- though I'm hopeful it will continue improving and hopefully be a smoother experience by the time my kids are adults.
> I've had people put it into their system as a comma, dash, space, and all sorts of weirdness despite my calling it out specifically.
Ugh, yes. And it's insane how many people seem to just NOT KNOW what an apostrophe is.
> checking in for flights
Yea, airlines seem to be one of the worst offenders. I have Precheck but Spirit in particular is never able to match the name on my ticket to the name in the gov't database so I never get it. Just one more reason to avoid flying them I guess.
Out of curiosity, why not just omit the apostrophe for airline reservations then? I understand wanting your full, real name in many circumstances, but who cares about what the boarding pass says as long as you get to fly? I doubt the people checking ID would care about the missing apostrophe.
I often did do this when I used to fly more often domestically, but it tends to cause other issues -- the primary one being a "frequent-flyer/mileage account name mismatch" which means that I have to undertake some manual process to collect my miles. I've lost out on countless 'airline miles' as a result via forgetting to do the manual process within N days after the trip.
Similarly, automated check-in kiosks are then usually unable to find the reservation via credit-card or passport scan -- meaning you're back to looking up the reservation code, and even that often fails, as if the apostrophe just flat-out causes issues with the query/lookup or something.
It can be very frustrating, and I'm increasingly often impressed (and vocalize the same) when I spell my name and the agent enters it correctly AND the system flawlessly handles it, too! The DMV systems in my state are one such example where I used to have issues but, in recent years, the problem appears to have been wholly addressed/handled.
Practically speaking, that makes sense. Philosophically, it's abhorrent. Blaming the user is bad behavior in general. Expecting a person to alter their name to conform to a poor software implementation is just wrong.
Well, people with names that are not written with Latin script are coerced into whatever Latin transliteration their government uses when issuing passports. Bonus points for altering the transliteration rules from time to time.
But ID checking is also done electronically at some checkpoints. If your ticket doesn't match your passport, your visa, your visa waiver, etc. you are going to be in trouble.
That being said last time I went to the US the person booking the ticket swapped my first name and last name. Only the person at the baggage dropoff noticed it, and after much deliberation they suggested to leave it that way. I went through with no issues apart from not being able to register the mileage.
I have a hyphen in my first name that also causes problems. I love it when I put my name in and the web site says "invalid first name." Thanks mom and dad...
What is worse is people who "fix" my name by moving the second half of my first name and making it part of my last name. I'm an adult. I know what my name is.
In Quebec, composite first names (prénoms composés) like Marc-Antoine are pretty common, so there was nothing weird about my parents giving me such a name. And frankly, most webforms I had to fill out while living there accepted my name just fine.
However, now that I've moved to the United States, it's been a bit of an annoyance.
I have a double first name, so I have a space in my first name. Many people / systems seem to think I accidentally put my middle name in the first name field and helpfully move or drop the second part. Putting a hyphen in (which is not really supposed to be there) typically fixes it, so I'm variously known with a hyphenated and non-hyphenated name. But it rarely causes issues.
Airlines make it worse, because they strip both characters during sanity checking, so my name comes out as 'Lon', which has caused me problems a couple of times as the name on my passport did not match the name on the ticket.
What these things all reinforce is that a lot of programmers take text encoding as a given, and don’t realize all the potential places for errors to sneak in.
Could be a fun way to hunt for buffer overflows on internal shipping services. Just fill out the sender name field to just "óóóóóóóóóóóóóóóóóóóóóóó" and let it expand. If the parcel arrives, not vulnerable. If the packet doesn't arrive, you've found a vulnerability... somewhere...
I wouldn't say an accent "causes" UTF-8 encoding issues. If acute accents are a problem, then UTF-8 handling has completely failed.
It is amazing to me where I see failed encoding like that. For instance, many SEC filings and job ads for tech companies. I mean, I feel like I'm expected to spell things correctly on my resume and emails at work...
latin1 is the default for text, including HTML, if you don't specify in protocols such as HTTP (modulo some stupidity from the WHATWG where it might be Win-1252 instead) and Windows-1252 is the default encoding in Windows in the USA (at least, prior to the Unicode APIs being added. The old APIs probably still exist though…). So these codecs pop up a lot in places where people who don't know what they're doing end up touching text.
The WHATWG HTML spec requires UTF-8 for conforming documents and scripts [WHATWG 4.2.5.4]. In both HTML specs, charset declarations, if provided, must be UTF-8 [4.2.5].
If the transport, content-type, lack of charset declaration, and sniffing fail to determine an encoding, both specs use defaults based on the configured locale; for English that's windows-1252 [WHATWG: 12.2.3.2 W3C: 8.2.2.2]. latin1/ISO-8859-1 is prohibited. [WHATWG: 12.2.3.3 W3C: 8.2.2.3].
I ran across some code once for descrambling data that had been incorrectly processed like that, which I found common in legal documents. It's an interesting problem, because strictly speaking, it's lossy, but you can use probabilities to figure out something plausible. You can decode/encode one thing as another, or you can decode/encode multiple times...
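For what it's worth, the decode/encode round trip can be sketched in a few lines of Python. This is only an illustration, assuming the damage was UTF-8 bytes misread as Windows-1252; the helper names `mangle`, `repair_once` and `repair` are made up for the example and are not from the code I ran across:

```python
def mangle(text: str) -> str:
    """Simulate the bug: UTF-8 bytes misread as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_once(text: str) -> str:
    """Undo one round of misreading; return the input unchanged if it does not fit."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def repair(text: str, max_rounds: int = 3) -> str:
    """Keep undoing rounds until nothing changes (text may be mangled more than once)."""
    for _ in range(max_rounds):
        fixed = repair_once(text)
        if fixed == text:
            break
        text = fixed
    return text


print(mangle("Müller"))                # MÃ¼ller
print(repair(mangle(mangle("Léon"))))  # Léon
```

It is a heuristic: a string that genuinely contains something like "Ã©" would be "repaired" too, which is where the probabilities come in.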
Any chance you have a link? I’ve had to implement solutions to this myself and it’s very tedious. If someone has built a more complete solution I would love to just use that instead.
That might be what I'm remembering; then again, I don't really do Python, so maybe it was something else. I doubt it was anything better than the link above, regardless.
You could try inputting your name as [Latin Small Letter E][Combining Acute Accent]:
e◌́
=>
é
Which should keep the `e` intact, while the combining acute accent (0xCC 0x81) may "only" get converted to a `Ì` which may be stripped. 0x81 is undefined in Windows-1252, so I have no idea what would happen to that, but probably be stripped as well, keeping just Leon.
Unless someone decides to NFC-normalize the text along the way. And it's generally agreed that text should be normalized with NFC, although there is often a fierce debate about who should do it ("not me").
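A small Python sketch of what NFC does to that input, using the standard `unicodedata` module (the name "Léon" is just the example from this thread):

```python
import unicodedata

decomposed = "Le\u0301on"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # 5 4
print(composed == "L\u00e9on")         # True
print(decomposed.encode("utf-8"))      # b'Le\xcc\x81on'
print(composed.encode("utf-8"))        # b'L\xc3\xa9on'
```

After normalization the plain `e` is gone again and the accent is back inside a single two-byte sequence, so the trick only survives if nothing along the way normalizes.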
Reminds me of the times when Amazon failed to reproduce the ü in my last name on their shipping labels. They consistently printed the UTF-8 encoded character interpreted as an 8-bit ASCII sequence. That bug was present for a couple of years.
I did some research a few weeks back about why I have an apostrophe in my name. When the British conquered the Irish they started keeping records of the citizenry. The Ó used in Irish names to track descent was eschewed for O' in British record keeping.
Just because the engineers know that there's a tricky problem with input validation doesn't mean the business people want to take the time to solve it, unfortunately.
As a general rule, the earlier an industry automated their systems, the worse the implementation.
In the earlier decades of computing, the software industry really just didn't know what it was doing too much. (Not that we're perfect now, but we were even worse back then.)
And replacing an implementation is a huge undertaking, and a lot of industries just don't bother.
This leads to a paradoxical situation where industries that most obviously need automation are the ones that have the worst automation. Before others, they pushed to get it done, and they got locked in to something primitive and/or outmoded.
I mean, it doesn't hurt that hardware now is so fast and scalable that performance and efficiency can take a back seat to usability and clarity. Guarantee this was done because storing the title as part of the first name field saved a few bytes over having a separate field, and that really mattered in the 70s when these systems were designed.
Yeah, it is a mixture of causes. Sometimes hardware constraints really did force software into tough choices.
The C programming language might be a good example. From a modern perspective, requiring forward declarations seems like pointless busywork. At the time it was written, decisions like that made it possible to have a one-pass compiler, which was an important efficiency gain. You could reduce I/O and maybe save RAM.
Based on the top answer, this appears to be an error in the CLI "helpfully" re-interpreting Amr as { First name: A, Title: Mr }, not an issue on the data storage side.
> it doesn't hurt that hardware now is so fast and scalable that performance and efficiency can take a back seat to usability and clarity.
Agree. Unfortunately, the users of even the new usable systems may be loath to take up these changes. Why? I guess inertia and priorities. My mind goes back to QWERTY v/s Dvorak and how it panned out.
The top answer on StackExchange alludes to this, too.
> My mind goes back to QWERTY v/s Dvorak and how it panned out.
Sometimes the new system isn't better so there's no reason to switch... AFAIK, Dvorak is not better than QWERTY (despite claims to the contrary).
I worked in IT at a hotel company when they made the cutover from a 3270 based text system to an early 90's modern Windows GUI -- agents hated it, even with command shortcuts it was slower than the old interface. Training new agents was faster, but experienced agents were much slower.
Yes, but in many cases the rules were not fixed either. Agile is a mindset and philosophy. Whilst not a panacea, I feel it does help reduce inertia in organisations that adopt it.
> Yes, but in many cases the rules were not fixed also
What does this have to do with Agile? You either have adequate resources (people and money) and the desire to fix problems, or you don't. Absent that, issues and bugs linger for years even in organizations that employ Agile. I saw it with my own eyes.
Agile doesn't help you here. It might only take a day to fix this issue in your code base. But then it will take months to get it tested and approved. And then it has to be rolled out into the organization and periodically analyzed during the rollout until it's done. Given how large airlines are and how many people interface with an airline booking system Agile's "contribution" to development timing is meaningless here. You could spend a day making a fix and a couple years watching that fix slowly percolate into the organization, followed by another couple of years waiting for all the travel agents to stop calling and complaining about how this impacts the workflow they've had for the last 20 years.
Look up two-digit date fields, packed-decimal, zoned-decimal, bitfields, and other encoded-field datatypes. All were explicitly created to save on data storage, when the main transmission medium was punch cards.
The rationalisation is not conjecture. It's fact. Bits and baud were expensive.
And that's before getting to bitshifted storage of software and similar tricks.
SABRE dates from the 1950s. Which was a long time ago in Internet Years.
The computer it was based on, the IBM 7090, had memory storage of 32,768 words of 36 bits, about 144 KB in today's 8-bit bytes. It operated at around 100 kFLOPS. A modern AMD64 CPU manages around 4-64 FLOPs per cycle, or in the neighbourhood of 4-250 gigaFLOPS, up to about 2.5 million times faster.
I've programmed on punched cards. System/360, FORTRAN77. Carter administration.
Whether history exists or not, is not the question. Whether byte misers from long ago caused this name-mangling issue, that's the conjecture part. At least one aspect of the guy's complaint was of recent origin, and ironically that was from arbitrarily insisting the name be longer than it is.
The SE answer links to the relevant documentation. Spaces are indeed optional. It's not explicitly documented that a non-whitespaced 'MR' is read as a title, but it seems likely.
I was prevented from doing an advance online check-in with Emirates (while overseas) because - I was told later - my son and I have the same (first/last) name and their system couldn't handle it. Subsequently, the flight was overbooked and we got bumped (which would have been avoided had I been able to use their online advance check-in).
I still can't get my head around how their online check-in system was set up such that this could happen.
This is strange as airlines usually allow you to enter the PNR (booking reference) and any of the names in the booking. The implementation is usually that they look up your booking from the PNR then effectively regex to see if your name is inside one of the name lines in the record. I can't think of a reason why this wouldn't work unless it was a connecting journey and the carrier didn't have the correct permissions on the record.
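Roughly what I mean, as a Python sketch. Everything here is hypothetical (the `BOOKINGS` store, the "SURNAME/GIVEN TITLE" line format, the function names); it is not any airline's actual code, just the look-up-then-match idea:

```python
import re
from typing import List

# Hypothetical booking store: PNR -> name lines in "SURNAME/GIVEN TITLE" form.
BOOKINGS = {
    "ABC123": ["SMITH/JOHN MR", "SMITH/JOHN MSTR"],
}


def squash(name: str) -> str:
    """Uppercase and drop everything except A-Z so spacing and punctuation are ignored."""
    return re.sub(r"[^A-Z]", "", name.upper())


def matching_passengers(pnr: str, surname: str, given: str) -> List[str]:
    """Return every name line in the booking that contains the supplied names."""
    wanted_surname, wanted_given = squash(surname), squash(given)
    hits = []
    for line in BOOKINGS.get(pnr.upper(), []):
        line_surname, _, rest = line.partition("/")
        if squash(line_surname) == wanted_surname and wanted_given in squash(rest):
            hits.append(line)
    return hits


print(matching_passengers("abc123", "Smith", "John"))
# ['SMITH/JOHN MR', 'SMITH/JOHN MSTR']
```

With a father and son of the same name in one booking the match returns two lines, so any code that assumes exactly one hit has nothing left to tell them apart.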
Well, that's certainly plausible. Though I was told - after escalating the issue - that their system had an "issue with our names being the same" causing some bug in their online system.
Normally our middle initials are carried forward to the carrier's booking system and that differentiates my son and me when we fly together... but in Emirates' case it seems they weren't.
So it probably means that a "dynamically" typed system is used. Is there a database where if you INSERT the string 'TRUE' you get an actual boolean stored? IIRC MySQL used to not actually coerce types into the type the column is declared as.
A simpler case is Hispanic surnames. Suppose I'm writer Miguel de Cervantes Saavedra. From airline tickets to name tags in networking/speaking engagement programs, I show up as Miguel Saavedra. But I'm Cervantes.
I'm not Cervantes and I suppose I'm a little patriarchal, but I'm very proud of my dad's achievements and wear his name (whenever they let me) with pride.
My name is John O'Brien. I can't tell you the number of times using my name breaks web forms.
One time at work, I insisted on getting access to a system I manage. When they added my name, the whole system came down for a day in production. I told them never mind, they deleted my name from the list of registered users, and the system came back up.
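A minimal Python sketch of the usual way an apostrophe breaks things, and the usual fix. It uses the standard `sqlite3` module with an in-memory database; the table and column names are made up, and I have no idea what the system that went down actually ran on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

name = "John O'Brien"

# Building SQL by string concatenation: the apostrophe ends the literal early.
try:
    conn.execute("INSERT INTO users (name) VALUES ('" + name + "')")
    concatenation_failed = False
except sqlite3.OperationalError:
    concatenation_failed = True

# Parameterised query: the value travels separately from the SQL text.
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM users").fetchall()

print(concatenation_failed)  # True
print(stored)                # [("John O'Brien",)]
```

The second form needs no escaping or "sanitising" of the name at all, which is the whole point.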
Congress should pass the Little Bobby Tables Act, that outlaws SQL injection bugs, legalizes the use of full Unicode in all names, and requires all software to be updated to fully support its requirements.
Then you could give your kids cool names with colorful emojis, and even invisible &ZWNBSP; zero-width no-break space word joiners and &SHY; soft hyphens!
Legislators tend to do a very bad job of creating technical standards. There is no constitutionally-enumerated power for Congress to regulate which character-encoding standard is used. On the other hand, updating all gov't and military systems to unicode-only would force the change pretty quickly, I'd guess.
You also can't outlaw a vulnerability. It's an honest development mistake.
I assume you're joking about emojis and other esoteric characters?
> There is no constitutionally-enumerated power for Congress to regulate which character-encoding standard is used.
If you squint, it's kind of covered by the weights and measures clause, in the same way that the clause that allows Congress to establish armies covers Congress establishing the Air Force.
You don't really need to squint. A combination of the weights and measures clause, the interstate commerce clause, and the federal postal service is more than enough for Congress to have jurisdiction over how names are to be encoded and processed.
> You also can't outlaw a vulnerability. It's an honest development mistake.
You can, however, threaten executives with personally-served jail time if they lead a company in a sufficiently irresponsible manner to allow such a mistake to happen, and can't somehow demonstrate that the measures they implemented were reasonable (despite the prima facie evidence that they weren't, because they failed).
Not saying this should be done in this case, but AFAIK some of the Sarbanes-Oxley Act requirements are being taken very seriously because they have similar provisions.
Where they have a reasonable way to avoid it, I'm all for it to be honest. Engineers go to jail if they build a shoddy bridge.
Key is to require realistic things. Not "humans must not make mistakes", but "process needs to be in place to catch human mistakes". This is not something individual contributors can usually influence.
But in cases like e.g. the VW emission faking scandal, I do think that if (and that's a big if) it can be proven that the developer must have reasonably known what he was doing and that it was part of an illegal scheme, the dev should also be punished: This changes the game from "do it and keep my job, or refuse/whistleblow and lose my job" to "do it and potentially go to jail, or refuse/whistleblow and lose my job" making the second, societally preferable outcome more likely.
I wouldn't count a SQLI as an honest mistake, at least not for newer systems. Any developer not knowing about this has no right to have his code run in production. A couple of times I had quite a hard time convincing people to fix their SQL code (in a framework, no less). It would have been easier if it had been outlawed, I am sure.
My official full name is too long: I have a second and a third first name, which are all not so short. This is not uncommon here in Germany and therefore usually not a problem. But a lot of foreign IT systems that say "put it exactly like in your official document" don't accept it or cut it off. Which once made my third name "Al". It has also happened that a system took my second name as my last name....
The German state isn't much better with foreign names. I have a lot of friends who are from Eastern Europe. Everybody has a weird story about how their current passport name came about.
Why isn't there some ISO name parsing standard that can understand a weird title of a prince of a city state and give you an appropriate output of it?
This is actually a research topic, so the information is right there.
The standard could have a fallback field, where you just put your UTF-8 string in there and then the system knows it's something that doesn't fit in all the other categories.
Better than not trying at least, the current situation just sucks.
A single string field, for how they want to be addressed. If the system requires a legal name then two string fields: the legal name, and the preferred name to be addressed by.
I'm hardly a professional, but what's the big issue with just having a string field for the name? Why separate first, last, middle, suffixes, etc.? Seems entirely pointless. When you want to search for someone you can just search for a substring instead of a particular field.
First, Legal Name, as required to appear on legal forms in their area. (And oftentimes if they don't fill this in correctly it'll then be on them, and not on you.)
Second, Preferred Name, so that you can throw that into greetings and whatever else you feel the need to personalise.
Even a “legal name” can be different things in the same legal jurisdiction, or not be a concept that really exists at all. I’m in the USA and my birth certificate doesn’t have distinct first, middle, and last name fields. It’s just a single line with my whole name typed out.
That's sort of my point. The Legal Name field is a single field, like it apparently is on your birth certificate. It's just a byte-stuffer identifier filled with something that can be used by the business for any legal forms that come up.
The individual parts inside don't, and shouldn't, matter.
Also, long-standing common law precedent in the US is that people can call themselves whatever they like for any reason at any time as long as it's not with intent to deceive or defraud. This is the basic principle behind, for example, businesses run using stage or pen names (see: "Prince").
Even stranger is that birth certificates are the foundation for all identity in the United States, yet there are >14,000 forms and very poor standards around them.
Well, the number of middle initials/names that I used in the past is dependent on the constraints of the particular system, so I have no possible way to be consistent going forward. I would rather not use any at all, but Social Security and credit cards have one, while my credit union has the other, the DMV and my birth certificate have both, and I use neither when it doesn't seem important. It's definitely unreasonable to use my full legal name for everything, so I think at least three would be needed.
So long as the business gets something that they can use as a legal identifier, they don't need to care about what is inside it.
The whole problem of people not having middle or family names goes away when it's treated as a single field. You don't care about tokenizing the field, because you only ever reproduce that field verbatim.
If the user believes a middle name identifier isn't necessary, then it probably isn't. It's only one aspect of the identifiable information used to identify the legal entity of that person anyway. If you have a name, address, possibly CC information, and email, that's probably more than enough to claim they're a customer for whatever legal purpose the business needs.
The problem here is that you might need more than one "name" like string. One goes into the address of an envelope, but if the letter starts with "Dear" or "Hello", then what follows is usually a different name.
And you can't extract any piece of the "envelope" name to give you the "hello" name: For a person named John Smith, in some cultures it's normal to say "hello John", in others it's "hello Mr. Smith", in still others it's "hello Smith".
And of course, the full name could be "John Smith" or "Smith John" depending on culture. And then there are cultures that have more than one "word" to a name, but there is no family name.
One solution to the "hello name" would be to ask for it separately. Every time you're tempted to parse or rearrange a name, just don't, and ask the user instead, and make a new column or field to put it in. Now that's a paradigm that might stand a chance of winning database. Although ironically your correct rendering of everyone's name will doubtless create mismatches with all the officially sanctioned incorrect renderings by other inferior mortals out there.
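As a rough Python sketch of "ask, don't parse": every field below is something the user typed, and nothing is derived from anything else. The class and field names (`PersonName`, `legal_name`, `preferred_name`, `sort_key`) are invented for the example:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonName:
    legal_name: str                       # reproduced verbatim on legal forms, never parsed
    preferred_name: Optional[str] = None  # what the person asked to be called
    sort_key: Optional[str] = None        # supplied by the user, not derived

    def greeting(self) -> str:
        """Use the preferred name if one was given, otherwise the full name as entered."""
        return f"Hello {self.preferred_name or self.legal_name}"

    def envelope_line(self) -> str:
        """The address line is the legal name exactly as typed."""
        return self.legal_name


ichiro = PersonName("鈴木一朗", preferred_name="Ichiro", sort_key="SUZUKI")
mononym = PersonName("Sukarno")

print(ichiro.greeting())         # Hello Ichiro
print(mononym.greeting())        # Hello Sukarno
print(ichiro.envelope_line())    # 鈴木一朗
```

A mononym needs no special casing here, because nothing ever tries to split the string.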
Seems like even in the most varied case there wouldn't be more than about three: full legal name, brief-but-formal name such as Mrs. Smith or Dr. Jones or Lawrence, and informal name such as Larry.
- First initial and last name are used for some things, such as usernames.
- First name and last initial are used for some things, such as public displays where you don't want to have the full name
- Last name is used to sort lists
- Names are entered differently due to error or constraints in different places, so you want a way to search the parts that stay the same. Sure, you can imagine everybody just had a string field, but it can't happen all at once. And on a global scale, there are definitely people who will enter their name differently anyway.
How do these usernames / public displays work for people without two names? When you sort by "last name" do you really mean "surname"? Are you going to add more fields for searching and sorting keys, so when 鈴木一朗 comes along you can search by "イチロー" or "SUZUKI" or 4 or 5 other reasonable search terms, and where is he going to be sorted?
It sounds like you're just kicking the problem downstream a tiny bit, and making it slightly worse by adding another point which enforces an invalid data model.
Your last sentence is certainly true but I don't see how separate fields makes it any better (or worse).
There are people who only have one name, so anything that relies on having two names is essentially broken.
You commonly see companies with user names that started as first character of first name + last name, and then you see later ones that don't follow the pattern when they realize they can't make it work.
One point in favor of the separate fields is that users in general suck at computer input.[0] So we have a user entering her name into our website. And then we need to compare that against some database into which she entered her name five years ago. If we have one freeform field, then we have to deal with small inconsistencies that may or may not be significant.
Say the user has three given names and a family name. For whatever reason, she puts double spaces surrounding the two middle names, but she didn't last time. Or jams them together, maybe with another punctuation. Are those semantically meaningful differences? Do the spaces indicate we have two separate people? Would a shortened form for one of the names be meaningful? Or if she misspells the abbreviation for "Marquise"; is that part of her name now? Etc.
If we're accepting that names can have infinitely varied forms, and that whatever the user enters right now is canonical, then questions like that become very difficult. Normalization for comparison is a real requirement in some cases, and separate name fields give us a leg up on that.
I'm not saying this is insurmountable, or that we should make it the user's problem. Just that the single field doesn't quite solve everything in one fell swoop.
---
[0]:I don't say this to blame the users -- computers overall could almost certainly do more to not leak the details of storage into the front end where users have to deal with it.
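To be fair to the single-field camp, some of this normalization can be done on one string. A minimal Python sketch, assuming we only want a key for comparison and keep the name as entered for display (the function name `comparison_key` is made up):

```python
import unicodedata


def comparison_key(name: str) -> str:
    """Build a key for matching only; the stored name itself stays untouched."""
    nfc = unicodedata.normalize("NFC", name)  # composed and decomposed accents compare equal
    collapsed = " ".join(nfc.split())         # trim and collapse runs of whitespace
    return collapsed.casefold()               # case-insensitive comparison


print(comparison_key("Mary  Ann   Louise Smith") == comparison_key("mary ann louise smith"))  # True
print(comparison_key("Le\u0301on") == comparison_key("L\u00e9on"))                            # True
print(comparison_key("MaryAnn Smith") == comparison_key("Mary Ann Smith"))                    # False
```

The last case is the hard part: whether "MaryAnn" and "Mary Ann" are the same person is exactly the kind of question that whitespace folding cannot answer.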
> Say the user has three given names and a family name. For whatever reason, she puts double spaces surrounding the two middle names, but she didn't last time. Or jams them together, maybe with another punctuation. Are those semantically meaningful differences? Do the spaces indicate we have two separate people? Would a shortened form for one of the names be meaningful? Or if she misspells the abbreviation for "Marquise"; is that part of her name now? Etc.
Names can be extraordinarily long or just a single char. There are a lot of stories where serious issues are caused by software systems and the way they handle names.
To answer your question: I don't know if there's a good single spec to follow.
Since hyphenating my last name, I can no longer check into AA flights without human intervention.. minor inconvenience I guess, but annoying.
In general it seems we make bad assumptions about names - like that everyone has a "first name" and a "last name" to enter into two separate database fields.
some (cof cof united) airlines put my first surname as my middle name. i don't know why. and guess what? i can't check-in because there's no middle name on my passport. every time this happens, i call the airline, they say it's fine. but i cannot, for the life of me, check-in without an employee helping me.
I’m baffled by the amount of trouble something as simple as changing a last name can cause. A meaningful percentage of people do it, and yet it’s handled incredibly poorly by many systems.
I work with people from India who don't have a Western first-name last-name structure and have gotten very used to seeing their full multipart names written in both the first and last field. It seems a common workaround for these folks with names that break the system - just write their full name in all name fields.
Nothing on the level of inconvenience that man experiences, but the other day I went to the Argos to buy some stuff and they have a new system where they ask you to input your first name as an identifier for your order.
So I entered mine, which is "stassa", and ... I got an error message saying my "memorable word is invalid". I got a screenshot and all:
The error message says "Sorry, your memorable word is invalid; please enter a different word".
So I entered "alice" and that was accepted. I guess they originally had a message prompting the user for a "memorable word" rather than their "first name". I didn't try it because I was in a hurry, but I bet their system is set up to accept only dictionary words, and then they realised that most people will just enter whatever comes to mind and often fail to match the dictionary that (I assume) the Argos developers have chosen as "standard".
So now they're asking people to enter their names... and that also fails because people often have names that are not in the dictionary. Like mine that just happens to be foreign (for the UK; it's a Greek name).
I just imagine the looks of confusion on people called Sandeep (not a British English dictionary word) or LaRonda (not one either) or Marek (also not) etc. "But, it says my name is invalid. What do I do?".
Automation, eh?
Edit: I wasn't buying what's partly showing on the left of the image, btw :)
To this day I can't create an e-mail account on outlook.com with my full name as local part because Microsoft has a profanity filter and my lastname (that is my username here) contains a forbidden word "nude". Even though I was using the French version where "nude" isn't even a word here.
The worst part is that Outlook only told me that the full email address was "invalid", and that's by trying every frigging combination that I figured out that the "nude" part was the problem. If I were to remove only the letter "u", everything goes fine and the address is created. But who wants to have an address with a butchered lastname that would just create more confusion when spelling it.
I tried to contact Microsoft support by chat or forum, but most of the people either didn't understand or replied they couldn't bypass this profanity filter.
You think that is bad, imagine having a name that has an apostrophe and an accented character. I think it's a few times a month a friend of mine complains about not being able to use their name.
While at school I went on a group booked international trip with most of the year, including someone unrelated but with the same surname.
In both directions we had to check-in together and claim to be brothers, because whatever booking system had somehow assumed we were, and for whatever reason it was more important (or perhaps easier and without consequence) (the staff told us) to go along with that booking mistake than to be honest.
Last year, I integrated a Zuora payment iframe into a popular sports streaming product. We had a customer who couldn’t sign up, because his surname contained the word resize, which is a command that the iframe sends to its host script to adjust the iframe’s height. It was some of the most rubbish code I’ve ever seen. Zuora continued to deny it was their fault, so I doubt it’s been fixed. I worked around it by uppercasing the cardholder’s name.
My last name causes lots of problems in China. Is there a space after the Mc or not? They’ve decided there must be because my passport has one, woe is me if I leave it out (I’ve been in banking hell a few times because of this). Also, your given name is your first name + middle name.
I don't understand why this isn't causing him serious problems travelling.
I'm also surprised this isn't a more widely recognised phenomenon, because whilst there are very few first names ending mr, there are quite a few more ending miss or ms.
> why this isn't causing him serious problems travelling
Because people in the business are used to idiosyncrasies of the software systems and know that the goal is to exchange money for services, not to appease the program.
But, as the author notes, a system now has programmatic restrictions.
I thought as a joke that the API is so old, that they didn’t treat women passengers with the same formality. Well... that might be true. In the documentation linked, I see no example of “MS” or “MISS”.
>Using a space, the parsing is unambiguous, however not all agents put a space, thus if instead the agent types:
>Then the command will be parsed as (NM, 1, ELADAWY, A, MR) to be "helpful".
And that's why doing some things for "convenience's sake" is BS. If a travel agent can't type a space between the name and the title then they shouldn't be a travel agent
Or if they don't double check it was entered correctly (which ok, in some terminals might have been a bit annoying)
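To make the difference concrete, here is a Python sketch of the two behaviours. It is a guess at the logic, not the GDS's real parser: the title list and both function names are assumptions for illustration.

```python
from typing import Optional, Tuple

# Assumed title list; longer titles come first so "MSTR" is not read as "...TR".
TITLES = ("MSTR", "MISS", "MRS", "MR", "MS", "DR")


def parse_greedy(given_field: str) -> Tuple[str, Optional[str]]:
    """Split a title off the end even when no space separates it (the "helpful" way)."""
    text = given_field.strip().upper()
    for title in TITLES:
        if text.endswith(title) and len(text) > len(title):
            return text[: -len(title)].strip(), title
    return text, None


def parse_strict(given_field: str) -> Tuple[str, Optional[str]]:
    """Only treat the last whitespace-separated token as a title."""
    parts = given_field.strip().upper().split()
    if len(parts) > 1 and parts[-1] in TITLES:
        return " ".join(parts[:-1]), parts[-1]
    return " ".join(parts), None


print(parse_greedy("AMR"))     # ('A', 'MR')
print(parse_strict("AMR"))     # ('AMR', None)
print(parse_strict("AMR MR"))  # ('AMR', 'MR')
```

The greedy version is in-band signalling: the same characters are both data and command, so some names cannot be entered at all.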
I have two last names; GDSs just concatenate them together. Weird, but it works.
Seems like the only way to get them to fix this is to make a fuss about how it's racist. You shouldn't have to, but that's what I would do. That'll get them to pay attention.
Are there any public test data sets of name corner cases? Given the popular "falsehoods programmers believe" lists, someone could create a public data set of unsanitized name inputs, expected decompositions, and expected round trip result. I think genealogy organizations have published de facto standards for name formatting.
I too have always wondered why my name can sometimes get mangled between what I type in and what arrived on my ticket. Fascinating insight into the airline industry.
Airlines I fly with a lot I know how to type in my name into their boxes in order to generate the right ticket name. If they only have a first name, last name box where do I put my middle name and/or Jr? Some force me to pick a Mr./Mrs/Miss/etc..., and then put it on the ticket...sigh.
I know twin brothers who have a long name (very common in some parts of India). So when they moved to the US apparently they were both issued the same SSN, because the SSN system registered the name backwards (last -> middle -> first), which exceeded the limit, and the first name, which was unique, got changed to FNU (First Name Unknown).
It took them years to figure out they both had the same SSN.
My wife has a hyphenated first name and no middle name, Like Mary-Claire Smith. It's very annoying for all kinds of systems, and usually results in some kind of compromise like separating the hyphenated name into first and middle, which isn't a huge deal, but does not match her documents and isn't really her name. Ugh.
One I don't see often in these is where one didn't refer to a given person by their name at all.
It doesn't matter if they're in the system as "Elizabeth Windsor", if you don't send them their account statement addressed to "Their Royal Highness" they'll switch to a company which does...
My middle name is my first name, and my given name is my middle name. So it's "Middle Given Surname". It's common where I'm from and on my official identification the given name is usually just underlined. I've had no end of issues since moving to the UK however.
What about doubling the 'mr' at the end of your name?
It should be (I say that without any serious confidence) possible to explain the problem in case someone wants to know why your ticket name doesn't exactly match your ID document.
He made his own font, and distributed it on floppy disks to members of the press, along with complete step-by-step instructions for how to refer to the artist by name in print, and how to install the font on Mac and PC!
>[P] Background: On June 7, 1993, mega-star Prince surprised fans and the entertainment industry when he announced that he was separating from his band New Power Generation and changing his name. At the time, the performer legally changed his name to the symbol [P], which has no verbal pronunciation or spelling. He did not reveal his reason for the change.
Or MrAmr which, assuming Amr is a man, would be technically not wrong when he enters his name as “Mr Amr” and would then be (correctly) corrected by the system dropping the first instance of “mr” in the string.
It seems the solution is for him to put mr in front of or after his name. Another solution to a part of a name getting filtered out is to write it many times. For example ammrr turns into amr if mr is removed.
In the US at least courts have ruled that although you have a right to travel such as by foot, your right to travel by car or plane etc is not an absolute right but is a privilege subject to regulation.
To travel on a plane your booking name must match your government id.
People have tried countless times over the last 70 years to update old systems written in COBOL, and it's simply not possible technologically.
Therefore if this gentleman wishes to travel he will need to change his legal name.
It sucks but this is the price we pay for freedom and safety.
It's not Amar though. Amr is a very common Egyptian name. And it literally sounds like ah-mr. There's no vowel you can put between m and r to make that sound.
If that means that they can no longer represent reality, then that's a bug. "We've gone 50 years without actually updating our systems to fix problems" is a less-than-impressive response.
Which is what this whole discussion is about. As well as understanding the root cause of such bugs (in-band signaling, etc), to start to fix them ideally.