smithvoice.com
 Y'herd thisun? 

“Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”
-Jamie Zawinski


from comp.lang.emacs

Simple person name regex

Tagged: Coding, CSharp, ASP.Net

"Do you have a good regex to test a person name part entry?"

Yep.  ^[A-Za-z]+((-[A-Za-z]+)|('[A-Za-z]+)|(\040[A-Za-z]+))*$

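// Needs: using System.Text.RegularExpressions; (MessageBox comes from System.Windows.Forms)
string value = textBox1.Text;
// Alphas, then zero or more groups of (hyphen | apostrophe | single space) followed by alphas.
string tester = @"^[A-Za-z]+((-[A-Za-z]+)|('[A-Za-z]+)|( [A-Za-z]+))*$";
bool isMatch = Regex.IsMatch(value, tester);
MessageBox.Show("ismatch: " + isMatch);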

If you look closely you'll see that the example uses a literal single space in the last OR part instead of the \040. They match the same character (\040 is the octal escape for the space), but spelling out \040 is better because you can see at a glance in your editor that you are looking for exactly that one character, so you're less likely to accidentally fatfinger a double space and not notice till data gets past it. Most times folks use \s, which matches ALL whitespace including tabs and formfeeds, but for this spec of person names you really do only want to allow the plain space. If you used \s then a user copy/pasting could slip in a tab, which you don't want but which \s would accept as a valid match.
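
To make the \040 versus \s difference concrete, here is a minimal sketch. The class name, the constant names and the \s variant of the pattern are made up for the comparison; only the \040 pattern comes from the post.

```csharp
using System.Text.RegularExpressions;

public static class SpaceVersusWhitespace
{
    // The pattern from the post, written with the explicit \040 (octal escape for the space).
    public const string SpaceOnly =
        @"^[A-Za-z]+((-[A-Za-z]+)|('[A-Za-z]+)|(\040[A-Za-z]+))*$";

    // Hypothetical looser variant using \s, shown only for comparison.
    public const string AnyWhitespace =
        @"^[A-Za-z]+((-[A-Za-z]+)|('[A-Za-z]+)|(\s[A-Za-z]+))*$";

    public static bool IsMatchSpaceOnly(string value)
    {
        return Regex.IsMatch(value, SpaceOnly);
    }

    public static bool IsMatchAnyWhitespace(string value)
    {
        return Regex.IsMatch(value, AnyWhitespace);
    }
}
```

Feeding both methods "von\tBraun" (a tab between the parts) shows the point: the \040 version rejects it, the \s version lets it through.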

In any case, the pattern allows alphas, optionally broken up by a hyphen, a single quote or a blank space, as long as each of those special characters stands alone between alphas.  They can't be at the front or the end of the string and they can't sit next to each other.  So O'Leary works but O''Leary or O'-Leary doesn't.  von Braun works but von 'Braun doesn't.
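
The examples above can be checked with a small helper. This is a sketch, not a finished validator: the class and method names are invented here, and the null check is an added assumption.

```csharp
using System.Text.RegularExpressions;

public static class PersonNamePart
{
    // Pattern from the post: runs of alphas joined by one hyphen, apostrophe or space at a time.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z]+((-[A-Za-z]+)|('[A-Za-z]+)|(\040[A-Za-z]+))*$");

    // Returns true when the value is a well-formed name part; null is treated as invalid.
    public static bool IsValid(string value)
    {
        return value != null && Pattern.IsMatch(value);
    }
}
```

With this, IsValid("O'Leary") and IsValid("von Braun") pass, while "O''Leary", "O'-Leary", "von 'Braun", "-Smith" and "Smith-" are all rejected. A name with several separators, such as "Smith-Jones O'Neil", still passes because each separator sits alone between alphas.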

You know not to fall into the trap of trying to use regex to match or autofix proper casing... code logic, or trusting the user to take care of casing, is best for that because of all the variations; DeFosses is valid but Denny isn't?  Good luck making a regex return quickly with all the ways people like their names.

Also, if you're doing a hiphop artist database then you'll have to extend the logic to include all kinds of extra crap... hope that's not your need :-).  

"great stuff thx.  actually we were looking for a clientside test of propercasing.  thinking about it does seem to make it more difficult"

Yeah there are a lot of problems with that route.  Unless you know exactly what browser the users are running and are positive that all the names in the data belong to 3rd-generation Americans of Western European stock born in Iowa in the 1930s, you're in for a full-time job of dealing with complaints over performance and results.  You're piling the variations in regex support across platforms and browsers on top of the same variations in JavaScript support, and topping it off with the nearly unlimited recursive switch cases you'll hit once the real-world names start getting added.

That's why doing the fast and simple obvious validation on the client but doing the real fun stuff on your known and controlled backend is best.  You have to do the serverside completely anyway including all the same checks you do on the client so it's not like you can avoid touching the server.  

In a perfect world where you are in the project from the get-go you push for adding a bool/bit sibling to each name column. With that you can have fun with logic to do standard formatting in code, but if the results don't appeal to the user they can choose to override with their own literal entry. The bit sets whether later code forces the automatic formatting or ignores the value in the co-field.
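
A rough sketch of that bit-plus-formatter idea follows. Everything in it is assumed for illustration: the names are invented, and the formatter is deliberately naive (it is not real casing logic, just enough to show why the override is needed).

```csharp
using System.Text;

public static class NameFormatting
{
    // Naive formatter (an assumption, not the author's logic): uppercases the first
    // letter and any letter that follows a non-letter, lowercases every other letter.
    public static string AutoFormat(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        bool upperNext = true;
        foreach (char c in raw)
        {
            if (char.IsLetter(c))
            {
                sb.Append(upperNext ? char.ToUpperInvariant(c) : char.ToLowerInvariant(c));
                upperNext = false;
            }
            else
            {
                sb.Append(c);      // keep hyphens, apostrophes and spaces as entered
                upperNext = true;  // the next letter starts a new name part
            }
        }
        return sb.ToString();
    }

    // keepLiteral stands in for the bool/bit sibling column: when set, the user's
    // own entry wins and the automatic formatting is skipped.
    public static string DisplayName(string stored, bool keepLiteral)
    {
        return keepLiteral ? stored : AutoFormat(stored);
    }
}
```

AutoFormat("o'leary") comes back as "O'Leary", which is fine, but "von Braun" becomes "Von Braun" and "DeFosses" becomes "Defosses". Those are exactly the cases where the user flips the bit and DisplayName hands back their literal entry untouched.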

Another way from the olden days is to add co-columns for each field, using your formatting logic automatically on the sibling field while letting the user's entries directly be added to the datastore as-is.  You see this a lot in older systems, if only to make indexing better by forcing entries to be all uppercased or all lowercased and stripped of punctuation.
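
For the co-column approach, the sibling value could be built along these lines. This is a sketch under assumptions: the names are hypothetical, and dropping spaces along with the punctuation is a choice made here, not something the post specifies.

```csharp
using System.Text;

public static class NameIndexKey
{
    // Builds the value for a hypothetical sibling (index) column:
    // letters only, all uppercased. The user's original entry is stored separately, as-is.
    public static string ToIndexKey(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw)
        {
            if (char.IsLetter(c))
            {
                sb.Append(char.ToUpperInvariant(c));
            }
        }
        return sb.ToString();
    }
}
```

So "O'Leary" and "o leary" both land on the key "OLEARY", which is what makes lookups and indexing forgiving while the displayed name stays whatever the user typed.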

"it's such a data heavy form, I'd hate to post it back just for those few properties.  Wait, I can expose the logic as a service and use Ajax to do the formatting and show the user a 'Did you mean this?' prompt.  DUH!!!!!  THANK YOU!"

You beat me to it :)



Since 1997 a place for my stuff, and if it helps you too then all the better smithvoice.com