Database frequent crashes finally resolved

CooperT · May 17, 2020, 3:40pm

Starting yesterday, a server-connected database, which had been quite stable, began crashing frequently. Certain actions would consistently lead to a crash, although these very same actions had been taken hundreds of times in the past. For example, the database opened to a form without touching any data. Choosing a Show List option would run a procedure that built an array from a subset of the data, then open a form with a text list where the array was displayed. That worked. The text list had a search box whose attached procedure recalculated the array when return was pressed. The building of the new array appeared to work, and the procedure finished, the new array was displayed, and perhaps 10 seconds later, Panorama quit. This happened consistently every time. Several other commonly used procedures also ended with crashes.
Solution: do not pull your hair out or set it on fire. After pursuing numerous other ideas, I detached and deleted the server database, returning the database to stand-alone status. Magic: everything worked with no crashes.
Then Upload the database to the server and where it continued to work perfectly so far, including all the steps that had been crashing.
I have heard Jim refer to data corruption from time to time as a possible culprit; I wonder if the server data had become corrupted. Why did it finish a procedure, then crash after a 10-15 seconds? Going through the steps to fix it was not such a big deal, but examining and reexamining every step of the code looking for potential causes was a big deal. I’m thinking: what mistake have I made? Is there some code that works with stand-along DBs but not once it is connected to the serve? Is my local data corrupted? Of course, there are some statements not available when the DB is connected, but those few times I have encountered therm have not caused a crash before and have been handled without a problem.
Any comments from anyone about what might have happened?

admin · May 17, 2020, 9:27pm

I’m not sure I would have thought of detaching and then re-uploading the database. I have no idea why that would stop this crashing problem.

A shared database works just the same on the client as a single user database does. The only difference is that for some operations, the client will communicate with the server during the operation. For example, when you edit a record, part of the process is that the client software will ask the server to lock the record.

In between such actions, there is no communication between the client and the server. If you are not performing an action on the client, the client is not communicating with the server.

I guess one sort of exception to this is the auto-unlock timeout. If you have an auto-unlock timeout set, then when the timeout expires, the client will unlock the record, and it does communicate with the server to do that. Any chance you had an auto-unlock set to 10-15 seconds?

The other exception is the Server Admin wizard. If that is open, and depending on how you have configured it, it will contact the server on a regular basis. Normally this only happens when the Server Admin wizard is open, but if you have Background Refresh enabled, it will do it all the time. I’ve never noticed crashes but I’ve noticed that occasionally with Background Refresh enabled it can cause errors in operations. Maybe I should remove that option or put a warning on it. I would not recommend leaving the Background Refresh option on full time while you are performing other operations.

FYI, all communication is initiated by the client. The server will never initiate a communication, in fact, it has no way to do so. The server only responds to communication from the client. So if the client is just sitting there doing nothing, the server is also going to be doing nothing. The two exceptions are auto-timeout and the server admin wizard, as noted above.

At some point I would like to figure out how to get the server to initiate a communication with a client, i.e. a push notification. For example, if you want to do a new generation, it would be nice if the server could send a notice to all users that had the database open to request that they close the database. Right now that is not possible, because the server has no way to initiate a communication. This is a hard problem, though obviously not impossible as many programs do this. But I’ve looked into this and it is a very major undertaking, so it will have to be saved for some future time.

admin · May 17, 2020, 10:37pm

Thinking about this further, auto-unlock seems more likely than Server Admin as a potential culprit. Auto-unlock could conceivable be triggered by your procedure, i.e., your procedure could be locking a record, then auto-unlock kicks in later. Of course I have no idea why auto-unlock would cause a crash. If Server Admin was the problem I would think it would happen totally randomly, not connected to running a procedure.

So do you have auto-unlock enabled? And is it set anywhere near the 10-15 second interval you were seeing the crashes? Though your description of the procedure doesn’t sound like it would lock a record.

CooperT · May 18, 2020, 1:35am

One more fact: following each crash, the server showed one locked record for that database, which I had to unlock upon restarting. So I spent a lot of time looking to see why the record would have been locked; I did not find any code that would have locked a record. In the example, no data was modified, added, or deleted.
Yes, the database was set to auto unlock records, probably set for 20 seconds. I did not specifically pay attention to whether Server admin was open, but I certainly had t open it many times to see the state of things.
I admit to having no understanding of what was happening, but it stopped upon discarding the server database and uploading a new one from the client.

admin · May 18, 2020, 8:50am

It sure sounds like maybe it crashed in the process of trying to unlock a record. I have no idea why, but that gives me an idea of what to watch out for. I was actually doing some testing on that code earlier today, it worked at that time for me.

I’ve got a substantial part of my logging code in place. I’ve been going thru the code running experiments to make sure the logging code is working and that I didn’t break the actual code in the process of all these changes. So far, I haven’t. This is why I was testing the auto-timeout code working. It’s very reassuring to see all the logging showing that things are working as they should. Though I did find and fix one bug that I saw in the logs already, that could cause it to fail to log off the server correctly when the last database closes. I think this is a bug you have encountered and I wasn’t even looking for it. So this logging project is already paying off. It’s a monster project though, really a lot of work. Which I expected, but even moreso than I expected. Hopefully the payoff will be quick and continue long into the future.

CooperT · May 18, 2020, 4:32pm

I went through my procedure that was crashing, or crashing occurred shortly afterwards. The only statement that even raises a question was this:

Find info("serverrecordid")=val(lvfileno)

I didn’t think this statement would contact the server. If not, then I am pretty sure that nothing in the procedure itself is contacting the server. Why, then is a record be locked after the crash?

admin · May 18, 2020, 9:08pm

You are correct, that statement will not contact the server or lock a record. With the information I have, I cannot answer the question as to why a record is locked.

Since you know exactly what procedure the problem occurred in, once you get the next beta, you will be able to turn on the logging and it will tell you exactly what communication is happening with the server. I guess I’m starting to sound like a broken record on this logging thing, which sucks, especially since I can’t deliver yet.

Just to give you an idea of what this will be, here is a log I recorded yesterday of opening and synchronizing a database, and then closing it. As you can see, it includes a lot of detail.

CLIENT OPEN DATABASE ==================================================
[CLIENTOPENSHAREDDATABASE] targetDatabase: Smallish Mailing List
[CLIENTOPENSHAREDDATABASE] Start connection.
[CLIENTOPENSHAREDDATABASE] Client logon.
CLIENT LOGON ==================================================
[CLIENTLOGON] realURL: Jim's Test Panorama X Server
[CLIENTLOGON] Existing sessionID: 
[CLIENTLOGON] Logon to Jim's Test Panorama X Server
[SERVERQUERY] REQUEST LOGON FROM Jim’s MacBook Pro
SERVER LOGON ==================================================
[_serverLOGON] macid: 784f43885521
[_serverLOGON] oldSessionNumber: 
[_serverLOGON] sessionNumber: 0
[_serverLOGON] sessionID: 2
[_serverLOGON] EnterpriseNextSessionID: 3
[_serverLOGON] EnterpriseSessionList: ID=2  ACCOUNT=7ab907c6a7762129125d SN=784f43885521 user="Jim Rea" computer="Jim’s MacBook Pro" logon="3672584758" 
[_serverLOGON] Session 2 started: Jim Rea (Jim’s MacBook Pro)
[CLIENTLOGON] === DICTIONARY xreply ============
[CLIENTLOGON] STATUS="OK"
[CLIENTLOGON] SESSIONID="2"
[CLIENTLOGON] LOOPBACK=""
[CLIENTLOGON] === END OF DICTIONARY xreply ============
[CLIENTLOGON] === DICTIONARY openSessions ============
[CLIENTLOGON] Mini Panorama X Server="5"
[CLIENTLOGON] Jim's Test Panorama X Server="2"
[CLIENTLOGON] === END OF DICTIONARY openSessions ============
[CLIENTLOGON] === DICTIONARY _EnterpriseURLs ============
[CLIENTLOGON] Jim's Test Panorama X Server="2"
[CLIENTLOGON] === END OF DICTIONARY _EnterpriseURLs ============
[CLIENTLOGON] Logon complete. sessionID: 2
[CLIENTOPENSHAREDDATABASE] hostServer: Jim's Test Panorama X Server
[CLIENTOPENSHAREDDATABASE] hostDatabaseName: Smallish Mailing List
[CLIENTOPENSHAREDDATABASE] hostDatabaseID: 1A50320C-371C-46D1-B116-6FFDFE72C666-98075-00022D09FC57C3C8+3672575426
[CLIENTOPENSHAREDDATABASE] sessionID: 2
[SERVERQUERY] REQUEST OPENSHAREDDATABASE FROM Jim’s MacBook Pro
SERVER OPEN SHARED DATABASE ==================================================
[_serverOPENSHAREDDATABASE] serverDatabaseID: 1A50320C-371C-46D1-B116-6FFDFE72C666-98075-00022D09FC57C3C8+3672575426
[_serverOPENSHAREDDATABASE] clientDatabaseID: 1A50320C-371C-46D1-B116-6FFDFE72C666-98075-00022D09FC57C3C8+3672575426
[_serverOPENSHAREDDATABASE] Open Shared Database:Smallish Mailing List
[CLIENTOPENSHAREDDATABASE] === DICTIONARY xreply ============
[CLIENTOPENSHAREDDATABASE] STATUS="OK"
[CLIENTOPENSHAREDDATABASE] LOOPBACK=""
[CLIENTOPENSHAREDDATABASE] === END OF DICTIONARY xreply ============
[CLIENTMODE] mode: CONNECTED
[CLIENTOPENSHAREDDATABASE] Connected - sessionFiles: Smallish Mailing List
CLIENT SYNC ==================================================
[CLIENTSYNC] hostDatabaseName: Smallish Mailing List on hostServer: Jim's Test Panorama X Server
[CLIENTSYNC] sessionID: 2
[CLIENTSYNC] lastSyncTS: 13
[SERVERQUERY] REQUEST SYNCHRONIZEDATA FROM Jim’s MacBook Pro
SERVER SYNCHRONIZE DATA ==================================================
[_serverSYNCHRONIZEDATA] actualDatabaseName: Smallish Mailing List
[_serverSYNCHRONIZEDATA] syncOption: RECENT
[_serverSYNCHRONIZEDATA] lastSyncTS: 13
[_serverSYNCHRONIZEDATA] New Data: 0 bytes
[_serverSYNCHRONIZEDATA] No new data to sync.
[CLIENTSYNC] === DICTIONARY xreply ============
[CLIENTSYNC] STATUS="OK"
[CLIENTSYNC] DELETEDRECORDS=""
[CLIENTSYNC] SYNCTIMESTAMP=1
[CLIENTSYNC] LOOPBACK=""
[CLIENTSYNC] === END OF DICTIONARY xreply ============
[CLIENTSYNC] tempFileID: 
[CLIENTSYNC] Client is already synchronized.
CLIENT CLOSE DATABASE ==================================================
[CLIENTCLOSESHAREDDATABASE] clientDatabaseName: Smallish Mailing List
[CLIENTCLOSESHAREDDATABASE] hostServer: Jim's Test Panorama X Server
[CLIENTCLOSESHAREDDATABASE] hostDatabaseName: Smallish Mailing List
[CLIENTCLOSESHAREDDATABASE] sessionID: 2
[SERVERQUERY] REQUEST CLOSESHAREDDATABASE FROM Jim’s MacBook Pro
SERVER CLOSE SHARED DATABASE ==================================================
[_serverCLOSESHAREDDATABASE] closing db:Smallish Mailing List in session:2
[_serverCLOSESHAREDDATABASE] closefile:Smallish Mailing List
[_serverCLOSESHAREDDATABASE] CLOSE THE FILE NOW
[CLIENTCLOSESHAREDDATABASE] === DICTIONARY xreply ============
[CLIENTCLOSESHAREDDATABASE] STATUS="OK"
[CLIENTCLOSESHAREDDATABASE] LOOPBACK=""
[CLIENTCLOSESHAREDDATABASE] === END OF DICTIONARY xreply ============
[CLIENTCLOSESHAREDDATABASE] BEFORE sessionFiles: Smallish Mailing List
[CLIENTCLOSESHAREDDATABASE] SEARCH: 1
[CLIENTCLOSESHAREDDATABASE] AFTER sessionFiles: 
[CLIENTMODE] mode: DISCONNECTED
CLIENT LOGOFF ==================================================
[CLIENTLOGOFF] sessionFiles: 
[CLIENTLOGOFF] Request logoff from server. realURL: Jim's Test Panorama X Server
[SERVERQUERY] REQUEST LOGOFF FROM Jim’s MacBook Pro
SERVER LOGOFF ==================================================
[_serverLOGOFF] macid: 784f43885521
[_serverLOGOFF] sessionNumber: 1
[_serverLOGOFF] sessionInfo: ID=2  ACCOUNT=7ab907c6a7762129125d SN=784f43885521 user="Jim Rea" computer="Jim’s MacBook Pro" logon="3672584758" 
[_serverLOGOFF] EnterpriseSessionList: 
[_serverLOGOFF] Session 2 ended.
[CLIENTLOGOFF] === DICTIONARY xreply ============
[CLIENTLOGOFF] STATUS="OK"
[CLIENTLOGOFF] INFO="Jim Rea logged off."
[CLIENTLOGOFF] LOOPBACK=""
[CLIENTLOGOFF] === END OF DICTIONARY xreply ============
[CLIENTLOGOFF] === DICTIONARY openSessions ============
[CLIENTLOGOFF] Mini Panorama X Server="5"
[CLIENTLOGOFF] === END OF DICTIONARY openSessions ============
[CLIENTLOGOFF] === DICTIONARY _EnterpriseURLs ============
[CLIENTLOGOFF] 
[CLIENTLOGOFF] === END OF DICTIONARY _EnterpriseURLs ============
[CLIENTLOGOFF] Logoff complete.

CooperT · May 18, 2020, 9:42pm

That looks good. Kinda like watching the console! When I get it, I will try to reproduce the failures I reported earlier with the logging turned on.

admin · May 18, 2020, 11:10pm

Yes, like the console, but because normally this puts out TOO MUCH information, it is designed to be configurable on the fly.

And I’ll be able to send you a link to configure the log to just show information I am interested in to look at a certain problem.