In my experience the “old” Select Duplicates and the newer version which takes one to a window to enter formulae etc (which I haven’t done) always selects duplicates from the whole database. I want it to only, ever select duplicates in the currently selected set. If I want it to apply to all, I’ll have all records showing, but mostly I want to select duplicates in a subset of records, simply or by use of a formula if I need to. But if by formula, I would like it to be generic so I can make a procedure and just run it quickly on whatever cell I happen to be sitting on.
I read/scanned the “manual” on Select Duplicates Using a Formula, but didn’t find the answer. There is a reference to selecting duplicates based on … partial fields. I’m not familiar with “partial fields” but in context it seems to mean a field like firstname that is then used in other fields to build a name. I had hoped it meant a subset of the records.
It’s not possible to select duplicates from a subset of records. That’s simply not a feature Panorama offers. The only way to do that is to delete the unselected records, and then select duplicates.
Gary’s suggestion will not work. The formula used with select duplicates is not a boolean formula, it’s a formula that transforms one or more fields before using the resulting value to check for duplicates. For example you could use the formula to select duplicates based on full names, even if the first and last name are stored in separate fields. But the formula cannot be used to select particular records.
It does occur to me that there’s another possible way to do this - reverse the order. For example, suppose you want to select duplicates in California. Start by selecting duplicates, then use select within to choose the subset of duplicates in California.
My approach would be to use arrayselectedbuild to build a two-dimensional array in which the first column contains the field(s) which might be duplicated, and the second is the key field for that database (therefore guaranteed to be unique for each record — if necessary, create a temporary field and formulafill it with seq()). Then sort that array which will ensure that any duplicates in the first column will be in consecutive fields. Then use a combination of arrayfilter(, cache( and a regular expression to blank any entries except those in which the first array element is the same as in the row above. Then arraystrip to remove those blanked lines, leaving a smaller 2D array of those entries in which are surplus to requirements. The second column of that array is the list of keys of those duplicate fields, so that can be used to do a select, e.g., provided the key field is known to use regex-friendly characters, transform that second column into a |-separated array in which each element is sandwiched by ^ and $, then select something in the form of KeyField regexmatch “^1$|^4$|^93$”,to select those records (select records in which KeyField is 1, 4 or 93).
That all sounds more complicated than it actually is in practice. I‘ve done something very similar in the past which I could easily adapt. Unfortunately at the moment I‘m restoring my Mac from a Time Machine backup after a hard drive failure, so have no access to Panorama.
Also, it‘s not quite the same as Select Duplicates, because that would give you all instances of each duplicate, whereas this will give you every instance but one of each duplicate. Sometimes that might be useful, sometimes not. With a bit more thought about the arrayfilter function it ought to be possible to replicate Select Duplicates more closely though.
Can’t the person Select Duplicates first - can’t remember if it automatically sorts the field or if you have to sort it first - then do a SelectAdditional on those duplicates to eliminate the ones not wanted?
In other words, change the position of cart and the horse.
The problem with that is if a duplicate group from the full database has only one record within your target selection. That single record would be listed in your final selection, but you don’t want it.
I think you can do this by adding an extra field. Say you wanted to find duplicates within the current selection of Field A. Addfield “B”;make it similar type
Fill B with an appropriate fixed string|value, if selected, and either ““ or 0 if not selected. Appropriate in that for no record would that fixed string|value, added to its value, be found as a value within the current data. For text data use some unused exotic unicode character as the fixed string. For numeric data use something exotic like pi() or something larger than the range of your data for the fixed value, eg.