Importing hospital price transparency JSON files into Pan X

does anyone have experience with, or been successful at, importing MRF JSON files into Panorama X?

hospitals are required to provide price transparency files that display the costs of services (procedures, drugs, etc.). the files come in several formats; many MRFs (machine-readable files) are in JSON. I can open the JSON file using TextEdit, but cannot import the files into Pan X

here is a snippet of code:
{
  "description": "NEEDLE ASPIRATION -WANG",
  "code_information": [
    {
      "code": "L103001",
      "type": "CDM"
    },
    {
      "code": "272",
      "type": "RC"
    }
  ],
  "standard_charges": [
    {
      "minimum": 32.6,
      "maximum": 146.7,
      "setting": "outpatient",
      "payers_information": [
        {
          "payer_name": "CIGNA",
          "plan_name": "Cigna All Commercial Plans",
          "standard_charge_dollar": 87.53,
          "estimated_amount": 87.53,
          "methodology": "percent of total billed charges",
          "standard_charge_percentage": 53.7
        }
      ],
      "gross_charge": 163.0,
      "discounted_cash": 81.5
    }
  ]
},
{
  "description": "NEEDLE ASPIRATION -WANG",
  "code_information": [

There are multiple problems with the sample code.

Most importantly, the file snippet is not properly formatted JSON. More accurately, it is only a snippet or portion of a complete file.

i.e. imagine a paragraph of text with the first 8 characters and the last 10 characters missing. That is essentially what you have.

The problem with JSON files is: They are often surrounded by additional text. See this forum thread with information from Jim Rea.

Basically, the part that Panorama X can work with is the array between square brackets “[” and “]”. You need to cut away the surrounding text, e.g.:

{
     "items":

and the curly bracket following the JSON data.
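For files small enough to fit in memory, that trimming can also be done programmatically rather than in a text editor. A minimal Python sketch, assuming the wrapper key is `"items"` as in the example above (a real MRF may use a different top-level key):

```python
import json

def extract_inner_array(path_in, path_out, key="items"):
    """Strip a wrapper object and keep only the inner array that
    Panorama X can work with.

    The key name "items" is an assumption taken from the example
    above. Note: json.load reads the whole file into memory, so
    this is only practical for files that fit in RAM.
    """
    with open(path_in, encoding="utf-8") as f:
        wrapper = json.load(f)                 # parse the wrapped file
    with open(path_out, "w", encoding="utf-8") as f:
        json.dump(wrapper[key], f, indent=2)   # write the pure JSON array
```

The output file then starts with `[` and ends with `]`, which is the shape Panorama expects.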

The screenshots show the QuickLook preview of an unmodified file and the contained pure JSON data.


hello

will review the info from J Rea.

regrettably these files are (intentionally) complex, to dissuade use.
I only put a snippet because the file is 774MB

[Screenshot attachment: 2025-03-26 at 8.49.39 AM]

these files should exist for every hospital, and can be JSON or sometimes other delimited file structures

https://www.cityofhope.org/patients/our-value-and-pricing/pricing-information

so the question is: how much editing of the downloaded JSON file is required to get to something Pan X can deal with?

You are 100% on target in that there is a disincentive for the creator of these files to have them usable, accurate, or technically correct. I’ve worked on these exact same issues for over 20 years and the problem will never change. The creator can say that they have complied, and when it just does not work, who is going to hold their feet to the fire to fix it? There is zero accountability.

Panorama is the best tool in your toolbox to unscramble the eggs when and if they can be deciphered.

The only way that I have survived with this type of problem is to take it on as a challenge. Kinda like those who like Chinese block puzzles. It is not that it cannot be solved, but rather how much interest or time you are willing to put into it.

The complete JSON file is too large (775MB) to upload to an online JSON syntax checker, and I was suspicious because the outermost structure is not an array ([...]). However, using jsonimport( it imports into a Panorama variable without an error, although on my M1 Max Mac that took 9½ minutes! However, judging by my brief attempts the resulting structure seems to be too large/complex for Panorama to manipulate reliably without crashing.
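Since a full parse takes minutes, one cheap sanity check is to peek at the first non-whitespace byte to see what the outermost structure is, without parsing anything. A hypothetical stdlib-only Python sketch:

```python
def top_level_kind(path):
    """Peek at the first non-whitespace byte of a (possibly huge)
    JSON file to see whether the outermost structure is an object
    or an array, without parsing the whole thing."""
    with open(path, "rb") as f:
        ch = f.read(1)
        while ch.isspace():      # skip leading whitespace bytes
            ch = f.read(1)
    return {b"{": "object", b"[": "array"}.get(ch, "other")
```

If this reports `"object"`, the file is wrapped and the inner array has to be dug out before Panorama can import it as records.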

The bigger problem is that without knowing the structure of the file you are unlikely to be able to do much with it. A structure of successively nested further structures — arrays of indeterminate size, dictionaries with varying keys, text items and numbers — won’t necessarily map onto one or more row-and-column Panorama databases. Panorama is a powerful tool for storing and analysing hierarchically structured data, but I have found it necessary to analyse the detail of that structure very carefully, with all the possible variants, before deciding how to store it.

A CSV or TSV file might not be as efficient but can always be imported directly into a single Panorama database. And it might be that, fundamentally, this is just such flat-file data that has been encoded as JSON to make it harder to play with except by those who have tools designed for the specific structure of this data set.
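To illustrate, here is one way such nested MRF data could be flattened into flat rows for a TSV import — a sketch only, with field names taken from the snippet at the top of this thread (real files vary, and `mrf_rows` and `write_tsv` are hypothetical helpers, not part of any tool):

```python
import csv

def mrf_rows(items):
    """Flatten MRF-style charge records (shaped like the snippet at
    the top of this thread) into one flat row per payer."""
    for item in items:
        # collapse the code list into a single pipe-separated cell
        codes = "|".join(c.get("code", "") for c in item.get("code_information", []))
        for sc in item.get("standard_charges", []):
            for payer in sc.get("payers_information", []):
                yield {
                    "description": item.get("description", ""),
                    "codes": codes,
                    "setting": sc.get("setting", ""),
                    "gross_charge": sc.get("gross_charge", ""),
                    "payer_name": payer.get("payer_name", ""),
                    "standard_charge_dollar": payer.get("standard_charge_dollar", ""),
                }

def write_tsv(items, path):
    """Write the flattened rows as TSV, which Panorama can import
    directly into a single database."""
    rows = list(mrf_rows(items))
    if not rows:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]), delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
```

The nesting is traded for repetition: the description and codes are repeated on every payer row, which is exactly the flat-file shape a single database wants.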


I am not fully appreciating what is being said here. IMHO, the file should not be expected to load into a variable.

The design of a JSON file is not limited to being imported into a fixed or predetermined db. Panorama is capable of allowing the JSON file to add new fields as needed. There is no necessity for the data to be in any particular order for Panorama to make use of it. There is no need to know the ‘structure’ of the file. This is part of what makes JSON files cross-platform and so useful.

With a .csv file, the comments are appropriate. With a JSON file, they are not. JSON files are designed to make data easier to work with. The issue of ‘friendliness’ is that the designer of the file has no incentive to make it error free. They really do not want to share what they are legally obligated to share, but creating something, regardless of its technical accuracy, relieves them of their ‘legal’ obligation to have shared the info.

In these particular files and their use, think of 10 people, each separately given 20 pieces of a puzzle, with each piece representing the shape and size of a state of the USA. Each person puts their pieces together as best they can, arranging those pieces that can interconnect. Then each person reports what they figured out. In JSON, each person’s submission is combined with every other person’s contributions. Every person will have parts that agree with others and parts that are unique. But together they all play a part in theoretically accomplishing a complete, accurate puzzle. Even though each submission does not consist of the same puzzle pieces, they are useful when combined with the other submissions. You do not have to know that there are 50 pieces to be able to submit your information. Each person works with what they have, and the collaborator pulls all the records together to hopefully have a clue as to how they all fit together, and perhaps be able to conclude that Alaska & Hawaii do not interconnect with any other parts of the puzzle. It is all a game of interconnecting parts.

In these particular records from a hospital, the PA may, when examining, find shortness of breath and lightheadedness. The cardiology department may find a leaking aortic valve. The records combine with identical Pt ID, name, birthdate, etc., and each department’s contributions together tell the story without any particular department having to know what other departments are submitting. And the collaborator has no clue which departments will be contributing, so all of those fields that are not submitted with data are not part of the db. Dermatology & Proctology may not have submitted any info, so those fields are never created.

I could be wrong. I’ve been wrong before. I had thought I had made a mistake. I expect to make a mistake in the future. :wink:

If a JSON file contains an array of records, where each record consists of individual data items, Panorama can easily import it into a database. As Robert mentioned, Panorama can add additional fields if necessary.

If, however, the records contain further nested data structures, either arrays or records, the process is not so simple. If Panorama encounters that, it will put these structures into binary data fields, which means that you cannot see or work with this data directly, you must use functions to access the underlying data. The getstructurevalue( function is good for this, but to use this you must know the structure of the imported data.

In this situation, loading into a variable is perhaps the best option. Then you can use the getstructurevalue( function to pick apart the data.
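For readers outside Panorama, the idea behind getstructurevalue( — walking nested data by a path of keys and indices — can be sketched in Python. This is a hypothetical analogue for illustration, not Panorama code:

```python
def get_structure_value(data, *path):
    """Hypothetical Python analogue of Panorama's getstructurevalue(:
    walk a nested structure of lists and dicts by a path of
    integer indices and string keys."""
    for step in path:
        if isinstance(step, int):
            data = data[step - 1]   # Panorama indexes arrays from 1
        else:
            data = data[step]       # dictionary lookup by key
    return data
```

For example, `get_structure_value(data, 1, "contents", 2, "name")` would take the first element of an array, look up its `"contents"` key, take the second element of that array, and return its `"name"`.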

So basically, Panorama is not magic. It sounds like this 774MB of data should probably be split into multiple related databases. Panorama cannot do that on its own.

Given the size and complexity of this data, Panorama is probably not a good tool for this application. You will probably need a custom tool written in a high performance language like Swift, Objective-C, C++, etc. Maybe Python could work, not sure about that.

Beyond that, I think @pcnewble’s comments are quite on point.

Thanks.

I had tried to import the 774MB file into a database and had to quit Panorama because my machine ran out of memory, which is why I tried to import it into a variable to see whether it was a valid structure. As Jim says, just because it’s a structure that can be held in a variable or binary data field doesn’t necessarily mean it can be imported automatically into a database. If the JSON has been exported from another flat-file form of data storage — a spreadsheet or a single row-and-column database — it might well be suitable, otherwise quite likely not.

These three records and four fields (named A, B, C, D) in a database

1A	1B	1C	1D
2A	2B	2C
3A	3B	3C	3D

might be encoded in CSV as

"A","B","C","D"
"1A","1B","1C","1D"
"2A","2B","2C",""
"3A","3B","3C","3D"

However, there’s a lot to go wrong with CSV/TSV — whether or not the first line contains field names, whether the separator is tab, comma or semi-colon, whether text fields are correctly quoted when necessary, whether real numbers use decimal point or comma, how tabs and/or newlines within fields are handled, etc.

In JSON it might be encoded as an array of three arrays:

[
    [ "1A", "1B", "1C", "1D" ],
    [ "2A", "2B", "2C" ],
    [ "3A", "3B", "3C", "3D" ]
]

or, better, as an array of three dictionaries, in which the field names are the keys:

[
    {
      "A" : "1A",
      "B" : "1B",
      "C" : "1C",
      "D" : "1D"
    },
    {
      "A" : "2A",
      "B" : "2B",
      "C" : "2C"
    },
    {
      "A" : "3A",
      "B" : "3B",
      "C" : "3C",
      "D" : "3D"
    }
]

That’s the least compact but least error-prone too. It’s the format that Panorama uses to export a database and it seems to be what it expects for import into a database too.

But even simple real-world data can’t always be forced into that row-and-column form… Take this file structure, five levels deep:

1 (folder)
    1.1 (file)
    1.2 (folder)
        1.2.1 (file)
        1.2.2 (folder)
            1.2.2.1 (folder)
                1.2.2.1.1 (file)
                1.2.2.1.2 (file)
            1.2.2.2 (file)
            1.2.2.3 (file)
        1.2.3 (file)
    1.3 (file)
    1.4 (folder)
        1.4.1 (file)
        1.4.2 (folder)
            1.4.2.1 (file)
            1.4.2.2 (file)
            1.4.2.3 (file)
    1.5 (folder)
        1.5.1 (file)

i.e. a single folder containing five items, the second of which (1.2) is a folder containing three items, the second of which (1.2.2) is a folder, etc.

One way to represent that in JSON would be:

[
  {
    "type" : "folder", "name" : "1", "contents" :
      [
        { "type" : "file", "name" : "1.1" },
        {
          "type" : "folder", "name" : "1.2", "contents" :
            [
              { "type" : "file", "name" : "1.2.1" },
              {
                "type" : "folder", "name" : "1.2.2", "contents" :
                  [
                    {
                      "type" : "folder", "name" : "1.2.2.1", "contents" :
                        [
                          { "type" : "file", "name" : "1.2.2.1.1" },
                          { "type" : "file", "name" : "1.2.2.1.2" }
                        ]
                    },
                    { "type" : "file", "name" : "1.2.2.2" },
                    { "type" : "file", "name" : "1.2.2.3" }
                  ]              
              },
              { "type" : "file", "name" : "1.2.3" }
            ]
        },
        { "type" : "file", "name" : "1.3" },
        {
          "type" : "folder", "name" : "1.4", "contents" :
            [
              { "type" : "file", "name" : "1.4.1" },
              {
                "type" : "folder", "name" : "1.4.2", "contents" :
                  [
                    { "type" : "file", "name" : "1.4.2.1" },
                    { "type" : "file", "name" : "1.4.2.2" },
                    { "type" : "file", "name" : "1.4.2.3" }
                  ]
              }
            ]
        },
        {
          "type" : "folder", "name" : "1.5", "contents" :
            [
              { "type" : "file", "name" : "1.5.1" }
            ]
        }
      ]  
  }
]

If I use jsonimport( to import that JSON as a structure in a variable or binary data field, in this case called JSON, I can access its hierarchy directly:

getstructurevalue(JSON,1,"contents",2,"contents",2,"contents",1,"contents",2,"name")
==> 1.2.2.1.2

But how could Panorama know how to import that into a single database, a grid of records and fields, except as a single structure in a binary data field which is meaningless to me when I’m reading the data sheet? I can think of several ways to do it, but would need not only to know the structure of the JSON file but also to consider how I might want to display, edit and manipulate the data afterwards.
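One of those ways, sketched in Python rather than Panorama code: recurse through the structure and emit one flat (full path, type) row per item, which would map onto a single row-and-column database. The function name and the path-as-key choice are mine, for illustration only:

```python
def tree_rows(nodes, parent=""):
    """Flatten a nested folder/file structure (shaped like the JSON
    above) into flat (full_path, type) rows. Defensively accepts a
    bare dict where an array of contents is expected."""
    if isinstance(nodes, dict):
        nodes = [nodes]
    for node in nodes:
        # build a slash-separated path so each row is self-describing
        path = parent + "/" + node["name"] if parent else node["name"]
        yield (path, node["type"])
        for row in tree_rows(node.get("contents", []), path):
            yield row
```

The hierarchy is preserved in the path column, at the cost of repeating the ancestry on every row; whether that is acceptable depends, as said above, on how you want to display, edit and manipulate the data afterwards.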