Create New DB from a variable?

JohnB · September 6, 2017, 8:53pm

Maybe I’m overlooking it, but I don’t see a way to, directly, create a new database from text already in a variable instead of from a text file on disk. Sometimes those text files come with problems. PanX handles them much better than Pan6, but doesn’t handle quite everything my text file creators can dream up. By loading the raw file to a variable I could validate it or rapidly locate any errors, which in some cases are most efficiently fixed at the source. Conceivably I do some pre-import processing with PanX’s text and array tools. Rather than re-open the original from disk to import, or saving a processed version to disk to then re-open, it should be possible, and faster, to pass the variable to some ‘create new db’ routine. Is it?

dave · September 6, 2017, 9:09pm

From the documentation for importtext.

If the imported text contains more columns than the database, the importtext statement normally adds the extra fields needed to the database automatically. This can be disabled, see the addfields option below. The new fields are named alphabetically: A, B, C, …, AA, AB, etc. This makes it very easy to create a new database with imported data. This example will create a new database and fill it with seismic data – automatically creating whatever fields are necessary.

newdatabase
importtext fileload(“SeismicData.csv”)

If the text is in a variable, you would substitute the variable for the fileload( function in the example.

admin · September 6, 2017, 9:19pm

As I was reading John’s question, Dave’s answer magically appeared in the browser! It’s a good answer, but we can go even further. The key, as Dave pointed out, is that the importtext statement uses a formula to access the data to import, so this text can come from anywhere.

Let’s say you have a variable named fileData that contains tab separated data, and that the first line in this text contains the field names. This code will create a new database with this text and set up the field names.

local fieldNames
fieldNames = firstline(fileData)
fileData = arrayrange(fileData,2,-1,cr())
newdatabase
importtext fileData,{ExistingData},{Replace}
setfieldnames fieldNames
autoallfieldwidths

This code was extracted from the newdatabasewithtextfieldnames statement. You can use Open View to look at the original code of this statement.

JohnB · September 6, 2017, 10:21pm

Thanks! I’ve used import text but my brain was stuck at its Help’s first line “…into an existing database.” I’ve used some of the options, but had forgotten about combining it with newdatabase, thus providing an “existing one.”

The more I play with PanX the more will stick with me. I’d also forgotten the Open View trick although in hindsight I must have seen it mentioned in forum. I’d figured newdatabasewithtextfieldnames probably was built with PanX code and was curious how to use its tricks in other contexts. I’d found setfieldnames earlier today, but was still searching for the arrayrange way to remove the first line.

dave · September 7, 2017, 1:30am

One subtlety that might be missed by those who don’t look at the original code, is the way fileData obtains its value.

fileData = lftocr(binarytotext(fileload(textPath)))

The lftocr( function is necessary, because the firstline( function assumes the lines are separated by carriage returns. The lftocr( function will ensure that that is the case.