Import file from Aquaminds NoteTaker to Panorama Database

organicman · January 21, 2022, 11:50am

I have a very large file of over 5,000 poems in the outliner software, Aquaminds NoteTaker 4 for Mac. Each poem is on a separate page in NoteTaker. I can export the poems to a number of formats including plain text, OPML, NoteTaker’s own XML format called NTML, Microsoft Word doc, comma-delimited, tab-delimited, rich text, etc. I can export the file as one large output or as individual documents, and I can include or not include the page title in the export. The challenge is how to format the data for import into my Panorama database. Each poem exists in a single cell of its NoteTaker page. The page title is in most cases the title of the poem. So there are really just two “fields” which need to be generated. I am not a programmer at all, so I could use some help.

Here is a sample of one poem’s NTML output:

<page title="The Winds of Time" entryCt="1" ident="2282" exportedToServices="0" hasTab="0" prevPageIdent="2283" nextPageIdent="2281">

<heading><SPAN class="&pageStyle;">The Winds of Time</SPAN></heading>
<pageView pageNum="4271" backgroundColor="#fffcf6" showControls="1" showPageBreaks="1" showTagIcons="0" entryNumberingStyle="NO_ENTRY_NUMBERING" pageStyle="NO_STYLE" entryIdent="0" creationTime="545967653" backgroundImageStyle ="BGIMAGE_DEFAULT"/>

<entry ident="2282.1" timestamp="545967653" modificationTime="545967665" priority="0" state="0" isExpanded="0" isEditable="1" isHighlighted="0" isShowingAllLines="1" isPageBreak="0">
<cell><SPAN class="style3">THE WINDS OF TIME

The winds of time stream gently now,
But roared upon a when,
From fairy tale to nightmare&apos;s row,
They circle back again.

Kevin Johnson ©2018</SPAN>
</cell>
</entry>

</page>

---
As you can see, the Page title is contained here:
<page title="The Winds of Time" entryCt="1" ident="2282" exportedToServices="0" hasTab="0" prevPageIdent="2283" nextPageIdent="2281">

---

The poem body is contained here:

<cell><SPAN class="style3">THE WINDS OF TIME

The winds of time stream gently now,
But roared upon a when,
From fairy tale to nightmare's row,
They circle back again.

There is also a creation date for each poem in what looks like Unix timestamp format, except that I noticed the recorded dates are way off, by something like a decade, so they’re useless.
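One possible explanation for the odd dates, offered as an assumption rather than a confirmed fact about NoteTaker: many Mac applications count seconds from 1 January 2001 (Apple's reference date) instead of from 1 January 1970 (the Unix epoch). The Python sketch below interprets the sample's `creationTime` value both ways; the function names are made up for illustration. Read as an Apple-style timestamp, the sample value falls in 2018, which agrees with the ©2018 in the poem.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the export stores seconds since Apple's reference date
# (2001-01-01 00:00:00 UTC), not seconds since the Unix epoch (1970-01-01).
APPLE_REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)


def as_unix_timestamp(seconds: int) -> datetime:
    """Interpret the value as seconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def as_apple_timestamp(seconds: int) -> datetime:
    """Interpret the value as seconds since 2001-01-01 UTC."""
    return APPLE_REFERENCE_DATE + timedelta(seconds=seconds)


creation_time = 545967653  # creationTime from the sample <pageView> tag

unix_reading = as_unix_timestamp(creation_time)
apple_reading = as_apple_timestamp(creation_time)

print("Read as Unix time: ", unix_reading.isoformat())
print("Read as Apple time:", apple_reading.isoformat())
```

If the second reading matches when the poems were actually written, the dates could be imported as a third field after all.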

Could someone show me how to convert this XML data into two importable fields, or help with the approach I thought of, below?

An alternative I thought of would be to export the poems as individual text files, in which the first line of the text would be the page title and the rest of the lines would be the poem body. Perhaps a script could parse line 1 into a Title field and the rest of the lines into the Poem body field. Line breaks would, of course, need to be preserved. Then all the resulting data would need to be compiled as a single importable file so I could pull it into Panorama X.

It means a lot to me to get my writings into a more indexable, categorized, searchable form which is suited to curating my work for future publication. Looking forward to some assistance here.

Thanks!

JamesCook · January 21, 2022, 4:14pm

Doing a lot of reading between the lines, my inclination would be to go with either Tab-delimited or Comma-delimited text. Panorama X handles the commas very nicely.

You’re importing into a database with fields, so if you’re able to set that export up to separate, for instance, your title, poem, author, date and any other segments that matter, you’re in good shape to import the text far more easily than any of the other formats you’ve noted.

The only potential problem is what kind of line breaks are being embedded in the poems themselves. They may do just fine, so giving it a try would be worth it.
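To make the line-break concern concrete, here is a small Python sketch (not Panorama code) showing how a comma-delimited file normally carries multi-line text: each field is wrapped in double quotes, and line breaks inside the quotes belong to the field. The helper names are hypothetical. Whether Panorama X's import treats quoted multi-line fields the same way is not confirmed in this thread, so it is worth testing with a handful of poems first.

```python
import csv
import io

# Two-field rows: (title, poem body with embedded line breaks).
rows = [
    (
        "The Winds of Time",
        "The winds of time stream gently now,\nBut roared upon a when,",
    ),
]


def to_csv_text(records):
    """Write records as CSV text, quoting every field so line breaks survive."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(("Title", "Poem"))
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Read the CSV text back, skipping the header row."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # header
    return [tuple(record) for record in reader]


csv_text = to_csv_text(rows)
print(csv_text)

# The poem comes back with its line breaks intact.
round_trip = from_csv_text(csv_text)
```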

dave · January 21, 2022, 4:27pm

You can indent all of the code 4 spaces. A quick way to do that is to highlight the text you want to indent, and then click the preformatted text tool.

I took the liberty of doing that for you.

organicman · January 21, 2022, 5:45pm

Separating the items isn’t so simple, at least to me. The XML code does indicate page title as its own thing but how to translate that into a separate field in export I don’t know. I need to be able to strip out the XML code in the process, of course. Tab or comma delimited? Not sure how that will work either. Will have to experiment further.

dave · January 21, 2022, 7:57pm

You might want to check out the tagparameter(, tagdata(, and tagarray( functions.

In your example, you could extract the title with something like

Title = tagparameter(theText, "title=", 1)

or you could extract the heading with something like

Heading = tagdata(tagdata(theText,"<heading>", "</heading>",1), ">", "<", 1)

The inner tagdata( function extracts everything between “<heading>” and “</heading>” and then the outer tagdata( function extracts everything between “>” and “<” from the output of the first function.
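For comparison, the same two-step idea (pull the title attribute, then pull the text between a pair of tags) can be sketched outside Panorama in Python with regular expressions. This is an illustration under assumptions, not the method used in this thread: it assumes every page has a `title="..."` attribute and that the poem sits in one or more `<cell>` elements, as in the sample. The patterns and the `extract_poems` name are hypothetical. Pattern matching is used instead of an XML parser because the sample contains a custom entity (`&pageStyle;`) that a strict parser is likely to reject in an isolated fragment.

```python
import html
import re

# Shortened copy of the sample page from the first post.
SAMPLE = '''<page title="The Winds of Time" entryCt="1" ident="2282">
<heading><SPAN class="&pageStyle;">The Winds of Time</SPAN></heading>
<pageView pageNum="4271" creationTime="545967653"/>
<entry ident="2282.1">
<cell><SPAN class="style3">THE WINDS OF TIME

The winds of time stream gently now,
But roared upon a when,
From fairy tale to nightmare&apos;s row,
They circle back again.

Kevin Johnson ©2018</SPAN>
</cell>
</entry>
</page>'''

# <page ... title="..." ...> up to the matching </page>; \b keeps <pageView> out.
PAGE_RE = re.compile(r'<page\b[^>]*?\btitle="([^"]*)"[^>]*>(.*?)</page>', re.DOTALL)
CELL_RE = re.compile(r"<cell>(.*?)</cell>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def extract_poems(ntml):
    """Return a list of (title, body) pairs, one per <page> element."""
    poems = []
    for title, page_body in PAGE_RE.findall(ntml):
        cells = CELL_RE.findall(page_body)
        # Strip the remaining tags, then decode entities such as &apos;.
        texts = [html.unescape(TAG_RE.sub("", cell)).strip() for cell in cells]
        poems.append((html.unescape(title), "\n\n".join(texts)))
    return poems


poems = extract_poems(SAMPLE)
print(poems[0][0])
print(poems[0][1])
```

Pages that lack a title attribute or a cell would be skipped or come back empty, so a real run should compare the number of results against the number of pages.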

There seems to be a bug in the display of that first link, but it does work to take you to the documentation for the tagparameter( function.

admin · January 21, 2022, 11:32pm

If it is possible for Aquaminds to export the data in this format, that would be the simplest format for importing. Panorama can easily extract the first line with the before( function, and the remaining lines with the after( function.

That’s not necessary and actually would be more difficult than importing each record directly. You can use the listfiles( function to get a list of all of the files in a folder. Then you can use the looparray statement to loop through each file. For each text file you would load the file into a variable, add a record to the database, then extract the components from the variable into the different fields in the database. In fact, this would be the approach I would recommend no matter what export format you are dealing with.
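The loop described above (list the files, load each one, split off the first line, add a record) can be sketched as follows. This is a Python stand-in rather than Panorama code, so it writes a single comma-delimited file for import instead of adding records directly. It assumes each poem was exported as a UTF-8 `.txt` file whose first line is the title; the function names and the output layout are hypothetical.

```python
import csv
from pathlib import Path


def split_title_and_body(text):
    """First line becomes the title; everything after it is the poem body."""
    title, _, body = text.partition("\n")
    return title.strip(), body.strip("\n")


def build_import_file(folder, output_file):
    """Combine every .txt file in `folder` into one CSV; return the record count."""
    count = 0
    with open(output_file, "w", newline="", encoding="utf-8") as handle:
        # Quote every field so line breaks inside a poem stay in one record.
        writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
        writer.writerow(("Title", "Poem"))
        for path in sorted(Path(folder).glob("*.txt")):
            # read_text() normalizes Mac and Windows line endings to "\n".
            text = path.read_text(encoding="utf-8")
            writer.writerow(split_title_and_body(text))
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the export was saved.
    total = build_import_file("exported_poems", "poems_for_panorama.csv")
    print(f"Wrote {total} poems")
```

Splitting on the first line mirrors what before( and after( would do inside Panorama; the only extra step here is collecting the results into one file.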

The necessary program is probably only ten or twenty lines. But you’ll need to learn about quite a few concepts to write those lines of code. If you already have basic programming experience and are familiar with concepts like variables and loops, you can probably learn how these concepts work in Panorama and write the code in 2 or 3 days. If you’ve never done any programming before, this is going to be a big task.

organicman · January 22, 2022, 1:25pm

Thanks and this sounds cool. But I have no programming experience. Since this would be short work for an experienced programmer, may I hire someone here to do it?

organicman · February 12, 2022, 3:33am

I would like to thank the forum and especially TGCooper for help on this import project. After several iterations and tweaks we were able to import all but a few poems using the developed procedures. What a huge relief! BTW, I LOVE Panorama X. “Where have you been all my (Mac) life?”

admin · February 13, 2022, 11:40pm

I’m very happy to hear this – I was wondering if you had ever been able to make any progress on this. If you are willing, I would be curious to see the code that got the job done, either privately or here on the forum.

CooperT · February 14, 2022, 4:51am

The code was a real kludge. Various statements were used to extract two strings from the XML files, but there was a lot of variability in the original files. If Kevin wants to share it with you, I certainly don’t mind, but I don’t think you will find anything useful. The tagparameter( function that Dave suggested had limited usefulness, because the tags were not in consistent locations or consistently present.
PS I don’t know if regular expressions would have worked, but I have never learned how to use them.

admin · February 15, 2022, 5:00am

You can learn a lot about regular expressions in a short time. The basics are really pretty straightforward. I think a lot of videos and articles make it seem more complicated than it really is. I did a video about them a couple of years ago that I think will get you up and running in an hour. It’s in the Panorama Video training window (in the Help menu) and also at this URL:

Introduction to Regular Expressions.mp4 on Vimeo