Hello Everyone,

In one of my many attempts to increase my facility with the family of technologies known as "CMS systems" I have run across a variety of what may be called persistent store strategies.

These strategies are designed to help in-memory systems store important information in such a way that it can be reliably reproduced despite the computer system hosting them experiencing a power outage or even planned interruption of service (migration).

Historically, the memory employed for this purpose was secondary, and the media magnetic. More recently (mostly for performance reasons), secondary storage has evolved in the direction of various types of flash memory (SSDs).

In the "Backdrop CMS" system (a fork and subsequent evolution of the "Drupal CMS" system), among the many new concepts introduced was one targeting the persistent storage of critical configuration information:  .json formatted files.

For example, the configuration file for the Backdrop CMS core language module at:

/modules/language/config/language.settings.json

Contains:

{
 "_config_name": "language.settings",
 "_config_static": true,
 "language_negotiation": [],
 "languages": []
}
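
For anyone who wants to poke at a file like this outside of Backdrop, here is a minimal sketch. It is in Python purely for brevity (Backdrop itself does this in PHP), and the `load_config` helper is a made-up name, not part of any Backdrop API. It just parses the JSON text and hands back the nested structure.

```python
import json

# The same text as language.settings.json above, inlined so the
# sketch is self-contained.
LANGUAGE_SETTINGS = """
{
 "_config_name": "language.settings",
 "_config_static": true,
 "language_negotiation": [],
 "languages": []
}
"""


def load_config(text):
    """Hypothetical helper: parse JSON config text into a dict."""
    return json.loads(text)


config = load_config(LANGUAGE_SETTINGS)
print(config["_config_name"])    # the config object's name
print(config["_config_static"])  # JSON true becomes Python True
print(config["languages"])       # an empty list until languages are added
```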

JSON stands for "JavaScript Object Notation".  Literally speaking, it is a foreign concept to the back-end technology that was used to develop Drupal and its offshoots, including the "Backdrop CMS" system.  That base technology is PHP.

Since the release of PHP 4 (22 MAY 2000), a native solution has been available to any PHP developer seeking a persistent storage approach for retaining server-side configuration information:  .ini files.

https://www.php.net/manual/en/function.parse-ini-file.php

So why was .json chosen?  I am not asking this to cast doubt or aspersions on .json or even the people who chose to go with .json - I am just curious about why .json was ultimately selected as the "best" choice in the face of the other structured data notations that were available and already present in D7 (XML also comes to mind).

How come .json won?

I guess, ultimately, what I am looking for is a bit of a history lesson and a narrative.

g.
----

Comments

I don't know the answer to this question and was not around when these decisions were made. 

I do know that Drupal 8 was looking at using YAML files for config management and Backdrop decided to go with JSON. 

I never heard of any discussion about other alternatives, such as .ini file. 

This discussion from the GitHub archive MIGHT shed some light on what was being considered and why, but I've not read through it all.

See issue #2:

https://github.com/backdrop/backdrop-issues/issues/2

I'll ask around and see if anyone has anything to add to this question. 
(See also: https://forum.backdropcms.org/forum/how-does-backdrop-store-role-permiss...)

Hello @stpaultim,

Thanks for the link.   Scanning it through, this seems like a hotly debated topic.

  • .yml files were strongly considered
    • Especially with the introduction of Symfony
  • JSON was observed to be MUCH faster than .yml
    • Like 20 to 50 times faster
  • JSON was described as being better at handling multi-byte encoded strings
    • I am not sure that .ini files can even handle UNICODE all that well.
  • From my examination of .ini file format, JSON was MAYBE seen as being better at handling more complex (i.e. n-dimensional) data representations - which Drupal is absolutely replete with - than the .ini could natively handle.  A lot of entries in the PHP .ini area have to do with making it better at handling complex data types.
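
A small sketch of the multi-byte point, in Python rather than PHP purely for brevity. The setting names and values are made up for illustration. It shows a JSON document carrying non-ASCII text through a UTF-8 encode/decode round trip without loss.

```python
import json

# Hypothetical config values containing multi-byte characters.
settings = {"site_name": "Café 東京", "slogan": "Grüße"}

# ensure_ascii=False writes the characters as-is instead of \uXXXX escapes.
text = json.dumps(settings, ensure_ascii=False)

# These are the bytes that would be written to disk.
raw = text.encode("utf-8")

# Reading them back yields exactly the original structure.
restored = json.loads(raw.decode("utf-8"))
assert restored == settings
```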

Guess:  Maybe it was just a "path of least resistance" thing?  

It might just be that:

  1. .json ticked all the technical boxes (fast, UNICODE, n-dimensional)
  2. .json had already turned out to be a highly useful data transport mechanism
  3. JS was already being used extensively on the client-side
  4. REST architectures are/were highly .json friendly
  5. REST was "cool" back then, and JS/TS was on the ascendance
  6. Everyone was already messing around with JS on the front end, so familiar
  7. .json was the lowest-friction option in terms of persisting configuration data

Does anyone know different?

 

g.
----

 

quicksketch's picture

Hi Graham, thanks for posting the question. Thanks @stpaultim for linking to the previous research. I haven't redone the benchmarking, but my general experience is that since then both JSON and YAML parsing have gotten faster. YAML has a lot more ways to go, but still it's being parsed by PHP, rather than C code like JSON, so still expect JSON to be at least 10x, probably 20x faster still.

As for .ini and why that isn't used, it's pretty simple: it only supports 2 levels of nesting (if you include grouping). The .info files that Backdrop uses are actually almost .ini files; the history of why they're not real .ini files goes back way before Backdrop forked from Drupal. We actually added grouping support, making .ini and .info files even more similar, in https://github.com/backdrop/backdrop-issues/issues/81. But I recall that the general concern was that .ini files could not handle array syntax, such as those used when specifying CSS files in themes:

stylesheets[all][] = one.css
stylesheets[all][] = two.css

This format wasn't supported by .ini files, so .info files (and manual parsing) were introduced instead. This decision predates the creation of Backdrop, and I'm not sure I'd make the same decision today, but we're committed to backwards-compatibility so we'll be supporting .info files for quite some time to come.

In any case, the .ini format not supporting more than 2 levels of nesting, and .info files being parsed by PHP and being dramatically slower than JSON files sums up why neither was appropriate for storing configuration.
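
To make the nesting point concrete, here is a rough sketch. It uses Python's `configparser` as a stand-in for a generic .ini reader, so it is an assumption that it behaves like PHP's parser in the details; the `print.css` entry is also made up. The idea is only that a section plus a key gives two levels of strings, while JSON nests as deep as needed.

```python
import configparser
import json

# INI: a section plus a key gives exactly two levels,
# and every value comes back as a plain string.
ini_text = """
[stylesheets]
all = one.css
"""
parser = configparser.ConfigParser()
parser.read_string(ini_text)
ini_value = parser["stylesheets"]["all"]

# JSON: the same idea, nested further, with real lists as values.
json_text = (
    '{"stylesheets": {"all": ["one.css", "two.css"], '
    '"print": ["print.css"]}}'
)
data = json.loads(json_text)

# Three levels down: object -> object -> list item.
second_sheet = data["stylesheets"]["all"][1]
```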

Your summary of why JSON was chosen is a pretty good one. I think that summarizes all the points pretty well. 

XML was used briefly as well (during development), which was mentioned in that issue linked above. I didn't know this until just reading it now.

And that issue linked to this discussion about why XML was a strong contender as well: https://groups.drupal.org/node/167584

So on to our good friend XML. Honestly, the main downside to XML seems to be that whenever people hear about it, they say 'Ewwwwwwww, not XML!' and frankly I was in that camp too. However there are a lot of compelling reasons to consider XML. It is easily the most interoperable format available, pretty much any external tool will be able to deal with it. There are a ton of tools on all major platforms for editing XML. The major IDEs will deal with it nicely. While it is not as performant as say JSON, SimpleXML is native to PHP and we just have to write a routine to bring that back and forth from PHP arrays.

Hello @herb,

I seem to remember XML being implicated in m2m (machine-to-machine) transmissions in the Drupal world, and it has been mentioned very recently in this Forum (XML sitemap?). I also dimly remember that XML is implicated somehow in Drupal RSS feeds, but I have not (yet) confirmed that, so I might not be remembering correctly.

XML was all the rage when I entered the workforce in 1996.  When I was with IBM right around Y2K, they became hugely invested in XML because they saw it as a newer, faster, cheaper and ultimately better way to solve the kinds of problems that the EDI network had until that time addressed almost exclusively.  Think SWIFT.

XML was seen as a faster, better (and more open) way forward, and IBM leaned into it very, very hard.  When I left North America for Asia in the early 2000's, XML figured highly in the work I was doing at Microsoft.  Many of the courses we were delivering and solutions we were developing for corporate customers across Asia used XML to represent and transport highly complex data structures.

There's no doubt in my mind that XML is a very, very powerful bit of tech.  It can also quickly become super complex, so using it as a means of storing configuration information seems like major overkill, especially when you compare it to available and simpler technologies with much flatter learning curves.

In the case of persisting BCMS configuration information, I guess it just turned out that .ini files were too simple and .xml files were too complicated, so .json was pitched as a "goldilocks" solution - and the decision stuck.  It also didn't hurt that .json acted as a gateway to other fascinating aspects of the JavaScript world, which in and of itself is a massive and hugely fun topic.

Personally speaking, I have nothing against almost any file type, I just find some harder to come to grips with than others, and usually they are all a huge drag to create by hand - and I seem to find myself constantly creating them by hand :)

Why did I even bring this up?  Right now, I am looking at .ini files as part of doing stuff in the D7 world.  I am basically using that project as an excuse to try to knock the rust off my "programming mind", such as it is (or ever was).  

Over the past few days I have been writing a VERY basic .ini management tool mostly just to force myself to become familiar again with PHP programming in preparation for hopefully more useful and ambitious work to come.

g.
-----

 

XML was the way forward back in the late 90's

When I was an IT manager for a manufacturing company, XML was the data interchange format for all external invoices to the UK Government and big industry... But back then, it was a pain :-)

XML is used for RTI payroll submissions to UK tax authorities (HMRC). I wouldn't like to have to prepare manually.

When you look at the complexity of Views config files, I do think JSON is best for the variety of config needs

Hello @yorkshirepudding,

I think the Views module is really powerful, if a bit unusual.

I don't think it is Query by Example (QBE), nor Query by Form (QBF).

In fact, I don't even know what category it fits in...QBD? 
(QBD = "Query by Drupal")?

hahahahaha.

I love it, but I find it...arcane...mostly due to my own unfamiliarity with how it works.  Views is no outlier in this regard.  I have great difficulty finding documentation for a lot of the components I need to use (including Views), and when I do look I often find either voluminous text that I need to read multiple times because I "just don't get it", or nothing at all.  I almost never find documentation that I can act on immediately.  It can be frustrating.

Yes, it seems to me that .json was a good choice for this application for a myriad of reasons we've already noted.  It is especially good if BCMS features a .json file visual builder to help avoid having to hand-code n-dimensional config files, which can also be frustrating.

I guess you get mentally used to "the Drupal way" after a while, but it can be discouraging for people who have a problem they want to solve quickly and easily without having to develop a ton of background and orientation to simply find things.

It's also worth noting that Drupal itself is elbows deep in n-dimensional data models with respect to the stdClass Object(s) it is constantly creating, toying with and passing around internally.  These data structures are typically bristling with layer upon layer of attached arrays (dimensions), so any persistent store oriented notation would need to be up to that task.

g.
----

 

Call for comments...did I get anything wrong?