Pulling Data from a Web Page - How?

There are still Panorama X and Beta 10.2 posting categories. At this time should we still be considering PanX to be Beta and use both categories? Or should all PanX questions start migrating to just one Category?

Towards the end of the 50-minute Web Browser video, Jim mentioned a Pan6 routine for pulling stock quotes and said he’d cover that “pulling data from the web” later. But I didn’t see any later video training titles that seemed to cover it - unless it’s in a generic title like “Short Programming Topics”.

I was able to get the web page to display by putting a web object on the form, changing the type to Formula, and putting the URL of my destination in quotes in the formula window.

Amazingly, putting that final page URL path in bypassed the username/password prompt that I’d hit when I manually went to the first page and worked my way, via menu choices, to the target page. So that’s cool. I can go direct to my target page.

Now I’m seeing the text I want in the form web object with just
https://www.MyTarget.org/members/MyTargetFile.txt
in the formula window.

Eventually, I need to “calculate” the file name “MyTargetFile.txt” by including changing dates - like “MA2022Something.txt”, “MJ2022Something.txt”, etc.

If the text I want is displayed in the web object, I’d think I just need to assign that to a variable for further processing?

I’m sure there is some “Load” type function that will get it, but I’m unsure where to put it. Like if I have ThisData = SomeCommand, it seems I’ve moved from function to procedure.

I see
displaydata loadurl("http://www.apple.com")
in the documentation.
But it seems to me, if it’s going to put content in displaydata, that 1) displaydata needs to be defined, and 2) there should be an = sign. Something like
Local displaydata
displaydata = loadurl("http://www.apple.com")

If I can just pull the content, I don’t have to see the webpage at all (via a web object). It’s just kinda neat to see it there - for now. If I can get that content in a variable, I don’t need to see the actual page anymore.

Maybe I can have a GetTheGold procedure that pulls it into a variable and I don’t even need the form that has the web object. It’s 1 AM and I don’t want to have all the fun in one day. I can struggle with it, but if there are any examples around, please point me to them.

That’s a good question. I don’t have a particularly good answer. The 10.2 release is still considered a public beta. But I think you could post in either category and people will find the question.

In this case, your questions really don’t have anything to do with 10.2, so probably this would have been better in just the regular Panorama X category. But I’m only mentioning it because you specifically asked, otherwise I wouldn’t have even noticed.

I can’t remember if there is a video about this or not. But pulling data from the web is well documented. You don’t need to use a web browser at all. The workhorse for this task is the url( function.

There’s also the urltask( function, which downloads from the web in the background. There are older loadurl( and posturl( functions, but don’t use them. Their functionality is completely superseded by the newer url( function.
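
As a minimal, untested sketch of that approach: a procedure could fetch the text straight into a variable with url(. The variable name pageText is only a placeholder, and the address is the sample one from the question.

```
// Fetch the page contents into a variable; no web browser object is involved.
// pageText is a placeholder name, not anything Panorama requires.
local pageText
pageText = url("https://www.MyTarget.org/members/MyTargetFile.txt")
// pageText now holds whatever the server returned, ready for further processing.
```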

Whether you get prompted to log in is going to depend on what version of macOS you are using. On older versions of macOS, Panorama shares state with the main Safari browser on your system. So if you had logged on using Safari, you wouldn’t have to log in again when using the browser in Panorama.

On newer versions of macOS, this isn’t true anymore. Each application maintains its own web state (cookies, etc.). So on those newer systems, you would have to log in again when using Panorama. However, once you logged in, you would stay logged in if the web site is designed for that. It’s just like if you used Safari vs. Chrome vs. Firefox.

When I say “newer” versions of macOS, it’s actually been quite a few years since that change was made, but I don’t remember the exact year.

If you want to access the text, you probably want to use the url( function instead of a web browser object. It’s not impossible to access the text in the web browser object, but it’s not simple either – you have to write some JavaScript to do it.

Exactly. The url( function does exactly what you want.
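
For the GetTheGold idea with a changing file name, a rough sketch might look like the following. Everything here except url( is hypothetical: the variable names are placeholders, the prefix and year are hard-coded for illustration, and how they should be derived from the current date depends on the site’s naming scheme, which isn’t spelled out in this thread.

```
// Hypothetical GetTheGold procedure: build the file name, then fetch it.
// filePrefix and fileYear are hard-coded placeholders; replace them with
// whatever date logic matches the site's real naming scheme.
local filePrefix, fileYear, fileName, goldText
filePrefix = "MA"
fileYear = "2022"
fileName = filePrefix + fileYear + "Something.txt"
goldText = url("https://www.MyTarget.org/members/" + fileName)
// goldText now holds the downloaded text; no form or web object is needed.
```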

Displaydata is not a variable but a statement (something like a Message statement on steroids).

I take it that you didn’t try it. DisplayData loads a value into a large dialog window, just as it’s described.

You’re not trying to make DisplayData become equal to anything.

Consider other similar Statements and their lack of needing an = sign: Alert, Message, BigMessage, GrowlMessage. Or other Statements: ArrayBuild, FormulaFill, SaveAs…

Thank you for the responses. I’ll go over them.

I see now that displaydata is a command.

Because URL was a function, I figured it belonged on the right side of an equal sign. But now I see it’s a “parameter” for the displaydata command. In my very old-fashioned style, I’m not used to just using the function that way rather than assigning it to an intermediate variable. When you are hunting down a problem, it’s easier to look at smaller steps. I remember having to debug someone’s VB code where they had a variable as a function parameter, and the function was a parameter in a subroutine call. The error could have been in the variable, the function, or the subroutine. It looks cool to do a lot with a little - but not so much fun when tracing down an error.
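
Both styles should be possible here. As an untested sketch (pageText is a placeholder name), the one-step form passes the function result straight to the statement, while the two-step form keeps an intermediate variable that can be inspected while debugging:

```
// One step: the function result is passed directly to the statement.
displaydata url("https://www.apple.com")

// Two steps: same result, but the value is held in a variable first.
local pageText
pageText = url("https://www.apple.com")
displaydata pageText
```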

The password requirement does need more investigation. I did log in, via Safari, during my attempts with the web object, and when I quit PanX, there was a 433 message about adding something to my keychain.

So pulling that data.txt off the webpage is still a work in progress.

Cool beans! It works great, and the user option parameter handled the user/password request. Without it, I was asked to provide log-in credentials. It’s only fair.

On to the next step. 😀

I have scraped data from web pages. However, some web pages are difficult or impossible to decipher, and they are always subject to change. That is what has happened to the ZipInfoPlus statement, which no longer works. I have replaced it with my own version that requires a subscription to an API, which is free for the small amount of usage I have for it, and would probably be worth the subscription cost for someone who needs it for intermediate usage. More than that, and it would be worth making sense of the US Postal Service API, which is free, but the instructions are beyond my pay grade.

I would post my subscription method, but I have not been sufficiently motivated to write the last piece of it, which would be to check if the API key is present, and if it is not, direct the user to the proper website to obtain one, and then ask for the key when you return to Panorama. (Or it might be possible within Panorama’s web rendering system, but again, that is beyond my pay grade, which is $0.00/hr!)