How much Memory is enough?

I recently hit a “Not enough App Memory” message when running a lookup in PanX on the same Mini. The message uses the term “memory”, yet I only have 8GB of RAM while the warning mentions amounts in the 48GB range. So I’m guessing processes are shifting work over to the SSD (swap space), and there wasn’t enough room on the SSD.

The raw effort was doing lookups for 82,000 records against another file of 82,000 records. A unique matching field had to be created in both files by concatenating several fields together. A sample “key” would be: CASHCASH04/01/85SAN JOSE HONDA/SONYSONY KV-4000 4"""" TVSONY KV-4000 4"""" TV.
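
In rough Python terms, the composite-key idea looks like this (a minimal sketch only; the field names are hypothetical stand-ins, and the real key formula concatenated different fields):

```python
# Build a composite "key" by concatenating several fields into one
# string, hoping the combination is unique across all 82,000 records.
# Field names here are hypothetical, not the actual database's.
def make_key(rec):
    return rec["method"] + rec["date"] + rec["vendor"] + rec["item"]

rec = {"method": "CASH", "date": "04/01/85",
       "vendor": "SAN JOSE HONDA/SONY", "item": 'SONY KV-4000 4" TV'}
print(make_key(rec))  # -> CASH04/01/85SAN JOSE HONDA/SONYSONY KV-4000 4" TV
```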

How much RAM and drive space would be required to do the 82,000 against 82,000 in one pass? I figure this depends on record size and such, so I’m wondering about the “behind-the-scenes” processing. The files themselves were about 22MB and 30MB. When both were open, Total Data Usage was 52MB, or 0.61%. Does running out of “memory” have something to do with the “Unlimited Undo” feature?

I was able to move forward by breaking the files into chunks, for example selecting 3000 records in FileA and doing lookupselected against 10000 records in FileB.
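
Conceptually, the chunking looked something like this (a Python sketch of the idea, not Panorama's lookupselected itself; note that chunking doesn't reduce the total comparisons, it just keeps each pass's working set small enough to dodge the memory error):

```python
def find_match(key, pool):
    # Linear scan of the "look into" file, like one lookup call.
    for rec in pool:
        if rec["key"] == key:
            return rec["missing_field"]
    return ""

def chunked_fill(file_a, file_b, batch_size=3000):
    # Fill FileA in batches of 3000, mirroring the selections above.
    for start in range(0, len(file_a), batch_size):
        for rec in file_a[start:start + batch_size]:
            rec["restored"] = find_match(rec["key"], file_b)
```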

If those numbers got much higher - say, selecting 5000 records in FileA - I’d get Apple’s “Application ran out of Memory” error with a Force Quit dialog. Sometimes even that was locked up and I had to restart the computer.

For comparison, last month I ran a SimplyFortran program that had to check an iteration of 3,600,000 numbers against another 3,600,000 numbers. The timing for one iteration of 3,600,000 was only a few hundredths of a second. Before I started, I multiplied that by 3,600,000 and it came to about 20 hours. I crossed my fingers, held my breath, turned in three circles, and pushed “Go”. The next day the process was done.
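
(Checking the arithmetic: at roughly 0.02 seconds per iteration, 0.02 s × 3,600,000 = 72,000 s, which is 20 hours.)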

So in general, with PanX, when you are doing a whole lot of things against a whole lot of things, what do you have to be aware of so you don’t get the “Ran out of App Memory” / Force Quit message? Note that only PanX and the Finder were running at the time.

A lot. Are you doing this with a lookup in a formulafill? I would recommend setting up a relation and using a join instead. It will probably be over a thousand times faster and use 50,000 times less memory. Another benefit is that you won’t have to set up a separate key field; you can set up multiple keys when you configure the relation.
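
To make the speed difference concrete, here is a rough Python sketch of the two shapes of work (illustrative only, not Panorama's internals): a lookup per record rescans the other file every time, while a join indexes it once.

```python
# Nested-loop style, like a lookup inside a formulafill: every
# record in A triggers a scan of B. 82,000 x 82,000 is roughly
# 6.7 billion comparisons.
def lookup_fill(a_records, b_records):
    for a in a_records:
        for b in b_records:              # full rescan, every time
            if b["key"] == a["key"]:
                a["restored"] = b["missing_field"]
                break

# Join style: build an index over B once, then do one dictionary
# probe per record in A. About 82,000 + 82,000 steps in total.
def join_fill(a_records, b_records):
    index = {b["key"]: b["missing_field"] for b in b_records}
    for a in a_records:
        a["restored"] = index.get(a["key"], "")
```

That O(n×m) versus O(n+m) gap is the kind of difference that makes “over a thousand times faster” plausible.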

In this real case, the content of one field in the 82,000 record database was accidentally deleted. Gone, kaput, no recovery. In addition, 111 new records were added, with no way to separate them from the older records (no timestamp field, etc.). And the database was sorted several times, with no way to restore its original order.

You have an older copy - from before the new records were added - and it has the missing content that was deleted. How do you get that missing content back into the field in the new database? The databases are in different orders, one has extra records, and there is no common “ID” field to match them.

So, in each database, I concatenated several fields into one temporarily added “key” field, with the hope and prayer that it would create a unique identifier. Then I used that unique identifier field in a lookup (actually lookupselected, because I couldn’t process all records at once) to bring over the missing content.

Maybe - in hindsight - I could have made the match field shorter and still had uniqueness. But sometimes things are like coming across a bleeding person: the first step is to stop the bleeding, and you act to do that, even though in retrospect there might have been more efficient ways of going about it.

After a few partial runs - making small enough selections in both databases so I didn’t lock up with “App ran out of memory” Force Quit messages - I remembered things like: lookups work best if the files are sorted on the matching field. And after a batch of matches was made, I could remove those records from the older file (a copy of the older file, of course) because they were no longer needed, making the “look into” pool smaller and smaller.
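
The “sort both files on the matching field” instinct maps onto a classic merge pass: with both sides sorted, one forward sweep through each file finds all the matches, so nothing gets rescanned. A sketch of the idea, assuming the concatenated keys really are unique:

```python
def merge_match(a_sorted, b_sorted):
    # Both lists are pre-sorted by "key"; each is traversed once.
    i = j = 0
    while i < len(a_sorted) and j < len(b_sorted):
        ka, kb = a_sorted[i]["key"], b_sorted[j]["key"]
        if ka == kb:
            a_sorted[i]["restored"] = b_sorted[j]["missing_field"]
            i += 1                       # keys unique, so advance A
        elif ka < kb:
            i += 1                       # no match for this A record
        else:
            j += 1                       # catch B up to A
```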

Waaaait a minute. Are you saying I could have created that unique field, then used it to Join the databases, getting the missing field that way instead of through lookups? I know you just said that. But I thought you were talking about a new situation where the records are already matched, like creating an invoice where the product and price are already matched with an item number.

But you were saying I could create my unique field and use that to Join. If that’s the case, for fun, I’ll try it out using the same artificial common field. It’s one thing to have the information from both files displayed on the screen, but the end result has to be an export combining the data from both files.

You must have “already matched” records if you were using lookup. If you can use lookup, you can use join – except MUCH faster and with MUCH less memory.

I started reading every word of the Relational Database Management link, but it’s a pretty big chew. I could skip over Clairvoyance and other parts for this exercise. I see that with matching keys - and I can use the same formula I used “the old way” for that - I can bring over the missing field with a Left Join.
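
In other words, a Left Join keeps every record from the damaged database and fills in the missing field wherever the key matches; the 111 new records simply come through with the field left empty. A bare-bones Python sketch of that behavior (nothing Panorama-specific):

```python
def left_join(damaged, old_copy):
    # Index the older copy by key, then walk the damaged file.
    index = {rec["key"]: rec["missing_field"] for rec in old_copy}
    for rec in damaged:
        # Unmatched records (e.g. the 111 new ones) get "".
        rec["restored"] = index.get(rec["key"], "")
    return damaged
```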

Maybe I’ll try a simple three or four field, 10 record, databases as a starter - just to get the mechanics down and understand the dialog windows.

This is like The Wrath of Khan (Star Trek movie reference), where Spock tells Kirk that he noticed that Khan, having been a prisoner on a planet for so long, was used to maneuvering in 2 dimensions rather than 3 (as in space). And Khan’s “vision” limitation allowed Kirk to get the better of him.

Had they been dueling with data, Khan would be using Lookups instead of Left Joins. :grinning:


I posted this report in another topic, but it’s worth repeating. Using the lookup approach, I had to break the 89,800 records into smaller batches and use other tricks to get the lookup to work on my 8GB machine. It took about 8 hours, because if my batch was too big, the system would choke and I’d have to restart.

After setting up a formula for keys and setting up an import (shown at 6 minutes 21 seconds into the “Relational Databases Part 3” training video), filling the missing data took 7 seconds!

I’m guessing that had I added an extra field to both databases and pre-calculated a key (derived from a formula that concatenated several fields with the goal/hope they’d create uniqueness), the import of the missing data in one field would have been even faster (PanX unburdened with calculating keys on the fly).
