Crashes, empty rows, and locked records

Here is some anecdotal information; I don’t expect any response necessarily. Crashes continue to be the biggest concern.

One user reported a crash this morning when running a procedure, followed by a series of crashes. I attempted the same procedure on my computer and got repeated crashes.

Then I looked and found one empty row on the server that was locked.

The procedure is used to transfer data from the firm’s billing system to a PanX database. The billing system outputs an invoice in the so-called LEDES format, a standard legal billing format that exports billing data to a pipe-delimited text file. Dragging and dropping one of these text files initiates the procedure, which extracts the data from the text file, then adds multiple records to the PanX database. I am highly confident that the crash occurs after the procedure starts adding records. When the procedure crashes, like today, it leaves behind one locked empty record, which I suspect causes repeated crashes when one tries the same procedure again without deleting or unlocking the record.
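For anyone following along, here is a rough Python sketch of the "extract the data from the text file" step, kept separate from any record adding. It is not the actual PanX procedure. It assumes a LEDES 1998B style layout (an identifier line first, then a header line, with each line ending in `[]`), and the sample field names and values are made up for illustration.

```python
def parse_ledes(text):
    """Parse pipe-delimited LEDES-style text into a list of dicts.

    Assumed layout: optional identifier line, then a header line,
    then one data row per line, each line ending in "[]".
    """
    # Drop blank lines, then strip the trailing "[]" terminator if present.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    lines = [ln[:-2] if ln.endswith("[]") else ln for ln in lines]

    # Skip a leading identifier line such as "LEDES1998B" (it has no pipes).
    if lines and "|" not in lines[0]:
        lines = lines[1:]
    if not lines:
        return []

    header = lines[0].split("|")
    rows = []
    for number, ln in enumerate(lines[1:], start=1):
        fields = ln.split("|")
        # Reject malformed rows up front, before anything is added anywhere.
        if len(fields) != len(header):
            raise ValueError(
                f"row {number}: expected {len(header)} fields, got {len(fields)}"
            )
        rows.append(dict(zip(header, fields)))
    return rows


# Illustrative sample only; a real LEDES file has many more fields.
SAMPLE = (
    "LEDES1998B[]\n"
    "INVOICE_NUMBER|CLIENT_ID|LINE_ITEM_TOTAL[]\n"
    "1001|ACME|250.00[]\n"
    "1001|ACME|75.50[]\n"
)
```

Calling `parse_ledes(SAMPLE)` gives one dict per line item, so a malformed file fails here rather than partway through adding records.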

What caused the first crash, I have no idea. I have studied the procedure carefully several times and cannot find any issues. I have tweaked the procedure to try to make sure that no write-to-disk activity or screen changes occur until the data-processing, record-adding steps are complete. I developed a hypothesis earlier that disk or screen drawing activities can interact poorly with ongoing data processing activities.
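The idea of finishing all data processing before touching disk or screen can be sketched in Python as "stage everything, then commit in one step". This is only an analogy: a plain list stands in for the database, the field names are hypothetical, and a Python exception is much gentler than a real application crash.

```python
def build_records(rows):
    """Pure data processing: no disk or screen activity happens here."""
    records = []
    for row in rows:
        records.append({
            "invoice": row["INVOICE_NUMBER"],
            # float() raises ValueError on bad input, before anything is committed.
            "amount": float(row["LINE_ITEM_TOTAL"]),
        })
    return records


def import_rows(rows, database):
    """Stage all records first, then add them to the target in a single step."""
    # If staging fails, the database (here just a list) is left untouched.
    staged = build_records(rows)
    # Only after staging succeeds is the target modified.
    database.extend(staged)
    return len(staged)
```

With this shape, a bad input row cannot leave a half-finished or empty record behind, because nothing is added until every row has been processed.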

The frustrating part is not having any tools to diagnose what is going wrong. We have in the past had many crash problems with this database, but it has been working without a crash for weeks until today. It is used a lot and has over 70,000 records. (We archive the oldest records after the first of the year to control the size.) So it has had a very good stretch of working without a problem. But going through this kind of turmoil even once a month is not acceptable.

The procedure has over 400 lines, so I ran portions of it and stopped, then added sections of the procedure and re-ran it until I got all the way through without any crashes. Since then, it has been working without a problem on my computer and the user’s computer. I am also not sure why it started working again; was it just deleting the locked empty record, or stepping through chunks of the procedure? I look forward to the day when crashes are rare.

I am sending this since it might trigger some ideas in Jim, but maybe not.

No immediate ideas, but I’ll let it percolate.

This is where the new logging tools are very useful. If you put zlog statements in the code, you could enable logging and find out exactly where the crash is happening. If you haven’t figured out how zlog works yet, I’m going to cover that very early in the classes – it may even be the very first class (because of its usefulness in tracking down other problems).
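To illustrate the general technique (this is not zlog, and it makes no claims about zlog's syntax), here is a minimal Python sketch of checkpoint logging. Each step writes a "start" line before it runs and a "done" line after, so the last line in the log names the step that was running when things went wrong. The function and step names are hypothetical.

```python
import logging


def make_step_logger(path):
    """Create a logger that writes each message to a file as it happens."""
    logger = logging.getLogger("import_trace")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()      # avoid duplicate handlers on repeated calls
    handler = logging.FileHandler(path, mode="w")
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False     # keep trace output out of the root logger
    return logger


def run_steps(steps, logger):
    """Run (name, function) pairs in order, logging before and after each."""
    for name, step in steps:
        logger.debug("start %s", name)
        step()
        logger.debug("done %s", name)
```

If the log ends with a "start" line that has no matching "done", that step is where to look. In a 400-line procedure, a dozen or so checkpoints placed this way would narrow the search much faster than running it in chunks by hand.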