Faster way to transfer data from one DB to another?

staylor246 · November 30, 2017, 9:54pm

In my quest to migrate my Pan 6 databases to Pan X I have redesigned one database to incorporate two fields that were formerly in another database to consolidate efforts. Ultimately the “other” database will no longer be needed at all. The Pan X system is not ready for prime time, but during development I periodically import live data from several databases into their Pan X counterparts so I can work with current data in my development environment. This works well, but the step where I transfer data from the fields in the “other” database into the new fields I have created in the main database is terribly slow. The procedure uses lookup( for two fields, keyed on a key field that is common to both databases. Is there a faster way to do this data transfer? I have only had to do this on about 3 month’s worth of data each time I do it, but when it’s time to do the final update of 15 years of data it’s going to take a loooong time. Is there a better way?

gary · November 30, 2017, 11:04pm

Scott, if you are running this in the datasheet have you enclosed your code between noshow and showpage and ending with endnoshow? This speeds things up exponentially in most cases but I’m not sure how much it might help while using an enormous number of lookups. Just a thought.

staylor246 · December 1, 2017, 12:35am

Thanks for the thought, Gary. I have a form open when I run it, but it’s not the datasheet. My loop code is preceded by noshow, and the form display is static while it is running. I didn’t bother with showpage or endnoshow at the end. Would any of that make any difference? I was hoping there was some “bulk” command that might be a better thing to use here, along the lines of formulafill, but it doesn’t sound like it. The data in question are a date and a text array of tracking numbers for shipping, and there is no way to “calculate” them from other data. When I am ready to do this for real, there will be over 56,000 records to process. I"m only doing a couple hundred at a time in my tests so far.

admin · December 1, 2017, 12:55am

Other than Gary’s suggestion of noshow, I don’t see how any advice can be offered without knowing what your code actually is.

staylor246 · December 1, 2017, 12:58am

I’ll post it tomorrow, I just shut down for the day.

michael · December 1, 2017, 4:47am

Scott

You could try my Accelerated Lookup procedure, available from the Panorama Database Exchange or, if you don’t want to bother learning a new procedure, give Jim the authority to pass your databases on to me and I’ll try it out on them.

admin · December 1, 2017, 7:51am

I don’t have Scott’s database (unless he sent it a long time ago). I was just suggesting that he post the code he was using to transfer the data here.

staylor246 · December 1, 2017, 6:30pm

Here is the entirety of my code:

local lvIndex
lvIndex = 0

noshow
FirstRecord
Loop
    lvIndex = lvIndex + 1
    PayDate = lookup("Sales-Proto-Tek","Invoice No.",«Invoice No.»,«Payment Date»,"",0)
    TrackNo = lookup("Sales-Proto-Tek","Invoice No.",«Invoice No.»,Remarks,"",0)
    DownRecord
Until info("stopped")

message "Filled "+str(lvIndex)+" records."

This runs from the destination database. I first do a Select to narrow down the number of records it will process, typically about 500 records. I haven’t timed it, but it seems to take about half an hour for those records. That’s about 1% of the entire database.

I see that the post statement could accomplish this in the other direction by loading the 2 fields in one statement instead of two, would that be noticeably faster?

KJM · December 1, 2017, 7:10pm

I would not use a loop to do the lookup on a single record basis. I would use a formulafill statement to do the lookup in all records.

staylor246 · December 1, 2017, 7:21pm

I have never used formulafill and was under the impression it would put the same formula-derived value in each field. In my case every value is unique. Can you expand on how you would use formulafill in this case?

KJM · December 1, 2017, 7:34pm

Field PayDate
formulafill lookup("Sales-Proto-Tek","Invoice No.",«Invoice No.»,«Payment Date»,"",0)
Field TrackNo
formulafill lookup("Sales-Proto-Tek","Invoice No.",«Invoice No.»,Remarks,"",0)

dave · December 1, 2017, 7:48pm

I think you are confusing it with the Fill command, which evaluates the formula once, and then puts that value in every record. FormulaFill reevaluates in every record, using data from that record.

It sounds like you have a one to one relationship here, where a lookup would rarely, if ever, fail to find a match. If that is the case, FormulaFill with the SuperLookup( function, and the wrap option could make things a great deal faster. Both databases would need to be sorted by their key fields before doing that.

Field PayDate
FormulaFill superlookup("Sales-Proto-Tek",{«Invoice No.»=««Invoice No.»»},"«Payment Date»","wrap",true(),"default",0)
Field TrackNo
FormulaFill superlookup("Sales-Proto-Tek",{«Invoice No.»=««Invoice No.»»},"Remarks","wrap",true(),"default",0)

dave · December 1, 2017, 7:54pm

Post is a custom statement. The procedure it calls has about 100 lines of code, not counting comments.

staylor246 · December 1, 2017, 8:07pm

What, me confused? I will use Kurt’s approach, with your enhancements, and see how it goes. I am curious why either of these approaches give significantly faster performance, they still involve lookups one record at a time.

dave · December 1, 2017, 8:16pm

FormulaFill is simply faster than a loop. The compiled C or Objective C code can run much faster than Panorama’s interpreter language.

The advantage of superlookup( with the wrap option is that each successive lookup begins where the previous one left off. An ordinary lookup, begins each one with the first record. If the next record is the fifty thousandth, lookup( will find its match in the fifty thousandth place it looks, and superlookup( will find it in the second place it looks.

gary · December 1, 2017, 10:16pm

I think you could get some nice speed gains if you can break down your actions into separate groups of invoice numbers. I don’t know how the invoice numbers run but you may be able to select a reasonable range of invoice numbers in both databases and then in the formulafill you would use lookupselected( instead of lookup( so the searches are confined to a much smaller target than having each lookup going through the entire database. You would then use a loop to section each group for processing.

local lowInvoice,highInvoice
highInvoice=50,000
lowInvoice=1
loop
    setactivedatabase "Sales-Proto-Tek"
    select «Invoice No.» >= lowInvoice and «Invoice No.» < lowInvoice+5000
    setactivedatabase ""
    select «Invoice No.» >= lowInvoice and «Invoice No.» < lowInvoice+5000
    Field PayDate
    formulafill lookupselected("Sales-Proto-Tek","Invoice No.",«Invoice No.»,«Payment 	Date»,"")
    Field TrackNo
    formulafill lookupselected("Sales-Proto-Tek","Invoice No.",«Invoice No.»,Remarks,"")
    lowInvoice=lowInvoice+5000
until lowInvoice > highInvoice

If your Invoice No. field is text you will have to use the val( function as well for the select statements. This is just a roughly thrown together example and totally untested. Just offering another possible path to investigate. This might be similar to the approach Michael Kellock’s Accelerated Lookup procedure takes.

staylor246 · December 1, 2017, 10:52pm

These are all great (and new to me) ideas. I will try them on some data and report my findings. Bound to be faster than what I had. Thanks to all.

JohnB · December 2, 2017, 5:41am

If you are transferring multiple fields based on the same keyValues and keyData and are either transferring from the full database, or from the same selection of it, you can combine multiple field lookups between databases into one as a text array at the sending database, lookup just the combined field using any suitable form of lookup, then assigned the combined data to its proper destinations with formula fills at the receiving database. Thus trading slow cross-database lookups for faster formulafills within the same databases. At the price of more complex code to maintain and of temporarily jumbled data. Which should be bigger price than an inexperienced Panorama programmer would want to pay! But if you can keep it all straight in your head, and in your code, it can save significant time transferring multiple fields.

Say you want to transfer Field1, Field2 and Field3 from the sending database respectively to FieldA, FieldB and FieldC at the receiving database based on matching Field keyValue at the former to Field keyData at the latter. First setup the sending database with:
Field1
FormulaFill Field1+Sep+Field2+Sep+Field3 ; where Sep is any suitable separator for your data.

Then use FormulaFill lookup(, superlookup(, or any suitable variation thereof, to lookup the now combined Field1 into FieldA of the receiving database.

Then distribute the results in the receiving database:
FieldB
FormulaFill array(FieldA,2,Sep)
FieldC
FormulaFill array(FieldB,3,Sep)
FieldA
FormulaFill array(FieldA,1,Sep) ; saving the combined data field for last.

Finally return to the sending database to restore it with
Field1
FormulaFill Array(Field1,1,Sep)

Alternatively you could add temporary fields to each database to hold the combined text arrays and delete them both at the end.

This is the most straightforward when all the fields involved are text fields, but if you can convert the data, to text and back, accurately, it can be used for other datatypes, even multiple types at once. It can be done on just subsets of either or both databases with sufficient care to not leave jumbled data at the end. Fields can change over the course of a program so you have to be careful when in your program the transfer is done. I’ve made most of those goofs along my way but have saved significant time.

This is probably more complicated than you’ll want to use, but it is another strategy for speeding up the transfer of multiple data fields and some here may find it helpful.

gaseous1 · April 19, 2019, 7:28pm

I can attest to the speed of John B’s strategy, especially when using superlookup and sorting both databases identically.

My workflow requires using data from successive records in a calculation. This is a slow process when a loop is used.

It is much quicker to use arraybuild (or arrayselectedbuild) to collect the fields that are pertinent to the calculation. In my case, I stagger the resulting arrays using arrayinsert, arraytrim and arraydelete. I then use arraymerge to paste the individual arrays together. This merged array has data from each of the records I require, all in one “record.” (or “row”, if you prefer that term).

Then, I use newdatabase to create a temporary database and import the merged array into that database, where I can use formulafill to make my calculation(s).

Finally, in the original database, I use formulafill with a superlookup formula, to populate the field containing the result of the calculation.

On my 2015 MacBook 13" dual-care i7 3.1GHz 16Gb machine, the superlookup takes 29 seconds to fill 72000 records. The entire process completes in under 40 seconds. YMMV.