XML Tag Parsing


#1

I’ve got Panorama parsing a sizable XML form very reliably in nearly all instances. Typically the tags are clear and easy to pull apart using TagData: TagData(XML,"<head>","</head>",1)

The exception is when extra notes or parameters have been added inside the opening tag. For instance: <head typecode="123" ref="321">.

I can alter my filter to TagData(XML,"<head","</head>",1) and catch them, then find the > and take it from there.

It just feels clumsy though, especially since it involves hundreds of different tags. Is there some trick I’m missing that can see the opening tag as a whole?


#2

Since this is “Classic” and regular expressions are not an option, I doubt there is much you can do to improve on what you are already doing. Tagdata( has no provision for wildcards, so nesting two calls is probably the best you can do.

Something like

tagdata(tagdata(XML,"<head","</head>",1)+"</head>",">","</head>",1)

should be enough to deal with any exceptional cases that come up.
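For anyone who wants to check the logic outside Panorama, here is a rough Python sketch of the same two-step idea. It is not Panorama code: tag_data is a hypothetical stand-in that assumes tagdata( returns the text between the n-th occurrence of the header and the next trailer, and returning an empty string when nothing matches is also an assumption. tag_body and sample are made-up names for illustration.

```python
def tag_data(text, header, trailer, n=1):
    """Hypothetical stand-in for tagdata(: the text between the n-th
    occurrence of header and the next trailer after it (assumed behavior)."""
    pos = 0
    for _ in range(n):
        start = text.find(header, pos)
        if start == -1:
            return ""  # assumption: no match yields empty text
        pos = start + len(header)
    end = text.find(trailer, pos)
    if end == -1:
        return ""  # assumption: missing trailer yields empty text
    return text[pos:end]


def tag_body(xml, name, n=1):
    """Step 1: match on the tag name without its closing '>'.
    Step 2: drop everything up to and including the first '>'."""
    raw = tag_data(xml, "<" + name, "</" + name + ">", n)
    _, sep, body = raw.partition(">")
    return body if sep else ""


sample = '<doc><head typecode="123" ref="321">Title</head><head>Plain</head></doc>'
print(tag_body(sample, "head"))     # tag with attributes
print(tag_body(sample, "head", 2))  # plain tag
```

One thing the sketch makes visible: matching on "<head" without the closing > would also match a longer tag name that starts the same way, such as <header>, so that is worth watching for across hundreds of tags.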


#3

I think Dave meant to write:

tagdata(tagdata(XML,"<head","</head>",1)+"</head>",">","</head>",1)


#4

A clever solution. I had been planning on using a text funnel. Since I have hundreds of lines of these formulas I’ll run a couple through a loop to see if one or another method is faster.


#5

And another variation to test:

array(tagdata(XML,"<head","</head>",1),2,">")

#6

Yes. That’s what I meant to write. I’ve edited the original.


#7

This should work if you don’t have another set of tags nested within.
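If this caveat is about the array( variation, a small Python sketch shows why nesting matters. It is only an analogy: raw is a made-up example of what the inner tagdata( call might return, and split(">")[1] stands in for taking element 2 of a ">"-separated array.

```python
# Made-up example of the text between "<head" and "</head>" when the
# head tag has attributes and contains nested tags.
raw = ' typecode="123"><title>Intro</title><note>n</note>'

# Analogue of array(raw, 2, ">"): only the second ">"-separated element.
second_element = raw.split(">")[1]

# Analogue of keeping everything after the first ">".
after_first = raw.partition(">")[2]

print(second_element)  # stops at the first nested tag's ">"
print(after_first)     # keeps all of the nested content
```

Taking a single element stops at the next > it meets, so nested tags get cut off, while keeping everything after the first > preserves them.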


#8

Nested tags are present by the hundreds in some cases.

FWIW, I’ve found tagdata(XML,"<head","</head>",1)[">",-1][2,-1] processes the whole file the fastest - and is simpler for me to read.

I had been hoping to learn of a wildcard match that I was unaware of, such as: tagdata(XML,"<head**>","</head>",1)

Oh well, thanks for the input Dave and Gary.