Contains operator vs. diacriticals (was The nature of the separator appears to affect the result of a formulafill)

In this code snippet, mgFirstMods and mgFirstChars are fields, and mgFirstChars contains the characters “oü” in one record:

local alpha
alpha = "olϪonϪooϪorϪotϪouϪovϪow"
field mgFirstMods formulafill ?(alpha contains mgFirstChars,mgFirstChars,"Not found")

results in the “oü” value remaining unchanged. But this:

local alpha
alpha = "ol;on;oo;or;ot;ou;ov;ow"
field mgFirstMods formulafill ?(alpha contains mgFirstChars,mgFirstChars,"Not found")

results in the “oü” value being changed to “Not found”.

The separator in the first instance is chr(1002), used to avoid conflict with the contents of alpha (which is very much larger in the real world).

I get the same result on each of two computers. What’s happening here?

This has nothing to do with formulafill. The issue can be demonstrated with a very simple formula.

"ouϪ" contains "oü"

This formula should be false, but if you try it out in the Formula Workshop, it returns true.

I think this is because of the intricacies of Unicode and the fact that the contains operator is case-insensitive. Panorama uses an Apple API to check whether one string is contained in another, and it appears that this API gets confused by diacriticals if the Ϫ character is used as a separator. When I used Ώ instead, it worked OK.
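
To illustrate the kind of folding that could be involved, here is a minimal sketch in Python (outside Panorama, and not a reproduction of the Apple API). Unicode lets "ü" decompose into "u" plus a combining diaeresis, so a search that ignores diacriticals sees "oü" as "ou". The helper names strip_diacritics and contains_ignoring_diacritics are hypothetical, and the sketch does not explain why the Ϫ separator triggers the behavior when ";" does not.

```python
import unicodedata

def strip_diacritics(text):
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def contains_ignoring_diacritics(haystack, needle):
    # Assumption: this models a diacritic- and case-insensitive search.
    # It is NOT Panorama's actual contains operator.
    folded_haystack = strip_diacritics(haystack).casefold()
    folded_needle = strip_diacritics(needle).casefold()
    return folded_needle in folded_haystack

# Exact comparison: "oü" does not occur in "ouϪ".
exact = "oü" in "ouϪ"

# Folded comparison: "oü" becomes "ou", which does occur in "ouϪ".
folded = contains_ignoring_diacritics("ouϪ", "oü")
```

Here exact comes out False and folded comes out True, which matches the unexpected result seen in the Formula Workshop.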

If you really want to use the Ϫ character, you have options. One is to use matchexact with wildcards, which works fine (returns false):

"ouϪ" matchexact "*oü*"

I think the best solution is to use the arraycontains( function.

arraycontains("olϪonϪooϪorϪotϪouϪovϪow","*oü*","Ϫ")

This works fine, is probably faster, and will work even if you have data in the field that contains fewer than 2 characters.


When you run into a problem, it’s always a good idea to try to reduce it to the absolute minimum example. That usually makes it easier to figure out.

Well done, Holmes 🙂

Problem solved.

That’s fine for that case, but both of these give a result of true:

message arraycontains("sa,so,s-,st","s?",",")
message arraycontains("sa,so,s-,st","s*",",")

… so it seems that the arraycontains( function allows wildcards, although the documentation makes no mention of it.

The documentation does mention it, but only indirectly. The arraycontains( page says:

Note: This function is equivalent to:

arraysearch(thearray,thetext,1,thesep)>0

The documentation for arraysearch( mentions the wildcards.
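
As a rough model of why both calls above return true, here is a Python sketch. It assumes that ? matches exactly one character and * matches any run of characters, which is the usual convention; check the arraysearch( documentation for Panorama's exact rules. The function array_contains is a hypothetical stand-in, not Panorama's implementation, and Python's fnmatchcase also treats [ ] specially, which Panorama may not.

```python
from fnmatch import fnmatchcase

def array_contains(the_array, the_text, the_sep):
    """Hypothetical stand-in: true if any element matches the wildcard pattern."""
    # Split the text array on its separator and test each element in turn.
    return any(fnmatchcase(item, the_text) for item in the_array.split(the_sep))

# "s?" matches any two-character element starting with "s".
question_mark = array_contains("sa,so,s-,st", "s?", ",")

# "s*" matches any element starting with "s".
star = array_contains("sa,so,s-,st", "s*", ",")

# A pattern with no wildcard characters only matches an identical element.
literal = array_contains("sa,so,s-,st", "sx", ",")
```

Under these assumptions question_mark and star are both True while literal is False, so a field value that happens to contain ? or * acts as a pattern rather than as literal text.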

That’s tricky. I’ll take Jim’s earlier advice and use data arrays.

I’d previously avoided them because my arrays are created using arrayselectedbuild, so I have to create text arrays and then convert them to data arrays, but that’s not a big deal.

You should look into the dataarraybuild( function.

I should have known there’d be one.

Thanks again Jim.