Sorry to have explained this poorly.
I’ve created a file containing only the invoice identifiers/numbers from the original data set (this is the information I am using to determine whether a record is a duplicate or not) and no other data. I’d be happy to provide that file (about 6 MB). It exhibits the same behavior.
Thus, we can focus on only the invoice identifiers. This column contains multiple identical records, with a total of nearly 220,000 records. Of those, there are nearly 20,000 unique records (I evaluated the data outside of Panorama). When using selectduplicates in a procedure, Panorama identifies fewer than 400 unique records.
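For anyone who wants to repeat that kind of check outside of Panorama, here is a minimal Python sketch. It is not necessarily how the counts above were produced. The file name `invoice_ids.txt` and the one-identifier-per-line layout are assumptions for illustration, and the function name is hypothetical.

```python
from collections import Counter


def summarize_identifiers(identifiers):
    """Return (total, unique, repeated) counts for an iterable of identifiers.

    total    = number of non-blank records
    unique   = number of distinct identifier values
    repeated = number of distinct values that occur more than once
    """
    # Strip whitespace/newlines and skip blank lines.
    cleaned = [s.strip() for s in identifiers if s.strip()]
    counts = Counter(cleaned)
    total = len(cleaned)
    unique = len(counts)
    repeated = sum(1 for n in counts.values() if n > 1)
    return total, unique, repeated


if __name__ == "__main__":
    # Assumption: a plain-text export with one invoice identifier per line.
    with open("invoice_ids.txt", encoding="utf-8") as fh:
        total, unique, repeated = summarize_identifiers(fh)
    print(f"total records:     {total}")
    print(f"unique values:     {unique}")
    print(f"values repeated:   {repeated}")
```

The comparison is exact after trimming whitespace, so identifiers that differ only in case or internal spacing are counted as different values.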
The procedure I am using to select the duplicates follows:
I have a subset of the data in a Panorama file which I can send you. It has the procedure code in it and, on my system, demonstrates the behavior I am trying to describe.