Searching with regular expressions

I am a total regex newbie trying to use regex to help clean up an address file. The file (exported from Endicia) has company names mixed in with addresses in the same field.

All the addresses begin with digits, so I’d like to either select all the records that begin with letters [A-Z] or alternately all the records that begin with digits. Searching online I see the carat ^ used to match the beginning of a string in regex but that doesn’t seem to fly in Panorama.

Thanks in advance,
Barry

While it might be nice to learn to use regular expressions, you do not need to for this. Just look at the first letter of the field and select those which are between “1” and “9”, assuming that no address begins with 0. That should be a search on strings, not values.

Select striptonum(FieldName[1,1])≠""

How do I search/select a range of numbers?

More specifically, how do I correctly write:

Select records beginning with 1-9 in field X ?

To answer your regex question:

regexarray("Apple Computer, Inc. 1 Infinite Loop Cupertino, CA 95014","^[A-Z ,.]+")
       returns "Apple Computer, Inc."

and:

regexarray("Apple Computer, Inc. 1 Infinite Loop Cupertino, CA 95014","\d[A-Z ,0-9]+")
       returns "1 Infinite Loop Cupertino, CA 95014"

Select X[“1-9”,1]≠""

Making progress after stripping leading spaces that I didn’t realize were in the field. Sigh.

New questions:

  1. How best to strip out/delete the zip codes from records like this:

NEW YORK NY 10032-4434

I thought of striptoalpha but that also deletes the spaces.

  1. Easiest way to insert a dash to correctly format zip codes? Currently they look like this: 770694525

thanks!
b

arraylast(“NEW YORK NY 10032-4434”," ") returns “10032-4434”
“NEW YORK NY 10032-4434”[1,“≠-0-9, ,-”] returns “NEW YORK NY”

“770694525”[1,-5]+"-"+“770694525”[-4,-1] returns “77069-4525”

All fixed. Whew! Thanks everyone!

There never was an answer that I can see on how to search with Regex. There are examples of how to search a block of text, but what about searching a database?

How would I apply Regex 800|801|802 to find Denver zip codes in a field of zip codes?

More critically, how do I apply:

C:[^:]BPM[^:]:C.M:[^:](AAD|ACE)[^:]:M.R:[^:](V1|V2|V3|VM)[^:]:R.*A:[^:]C[^:]:A

to search a field containing thousands of records like:

C:BPM:C M:CHO:M R:NX:R A:C:A U:H:U O:S:O S:A:S
C:BPM:C M:CHO:M R:NX:R A:C:A U:H:U O:S:O S:A:S D:V:D
C:SPO:C M:CHO:M R:EP NX:R A:C:A U:H:U O:S:O S:F:S F:094:

I’ve been looking for ways to utilize regexarray( but haven’t found the right configuration so far.

I think you want the regexmatch operator for this. It returns a true or false result, so you can use it as the condition in a Select or Find.

Select Zip regexmatch "800|801|802"

Actually, I think that should be

Select Zip regexmatch "\b800|\b801|\b802"

so that it only matches if that’s the beginning of the zip code. The first one would match my zip code (61801).

Right you are. I was hurrying to put together a simple Regex example, and I had some doubts about my accuracy. It made the point though, helped me learn a touch more, showed the right way to do a search, and left an example for others down the road. Thanks, David.

\b matches from the beginning of a word, so this formula would match -801, or 801 with a space on it. I think better would be:

select Zip regexmatch "^(800|801|802)"

Or, you in this case you could do:

select Zip regexmatch "^80(0|1|2)"

or

select Zip regexmatch "^80[0-2]"

I’m just learning to swim in Regex and have immediately been thrown into an ocean of it. It’s worth mentioning that your movie, Introduction to Regular Expressions, in the Panorama X Video Training is a very helpful introduction.

I’m definitely not a Regex expert. But the basics are really not that complicated. I think most Regex books and tutorials try to explain too much way too fast, I tried to avoid that in the video you mentioned. There’s a lot you can do with just about a dozen simple regex elements.