@yorkshire-pudding revived #2231 and has provided a PR for it, which I started reviewing. It didn't take long to find out that our current methods of parsing and replacing variables in settings.php files aren't robust enough to support things such as variable declarations that span multiple lines. The regexes that we are using make too many assumptions that may not always hold, the most obvious being:
- variables do start with a `$`, however that might not be at the start of the line (one or more tabs/spaces might be added before it as indentation, and that would still be valid PHP despite the regex not being able to match it).
- variable values do end before the last `;` on the line, however that might not be the last character on the line (there might be tabs/spaces following it, for instance, and it would still be valid PHP).
- although our default settings.php file has all its variables and their respective values on the same line, that does not need to be the case for a PHP file to be valid.

For example this:
```php
$var = 'some value';
```

...is also absolutely valid PHP if formatted like this (although against our coding standards):

```php
$var
=
'some value'
;
```
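To make those assumptions concrete, here is a minimal sketch. The pattern below is made up for illustration (it is not the regex currently used in core); it simply encodes the "everything on one line, nothing before the `$`, nothing after the `;`" assumption and is run against the variations described above.

```php
<?php
// Hypothetical pattern, for illustration only: the "$" must be the first
// character of a line and the ";" must be the last one.
$pattern = '/^\$(\w+)\s*=\s*(.*);$/m';

$samples = array(
  'plain' => "\$var = 'some value';",
  'indented' => "  \$var = 'some value';",
  'trailing space' => "\$var = 'some value'; ",
  'multi-line' => "\$var\n=\n'some value'\n;",
);

$results = array();
foreach ($samples as $label => $code) {
  // preg_match() returns 1 on a match and 0 when nothing matches.
  $results[$label] = (bool) preg_match($pattern, $code);
  print $label . ': ' . ($results[$label] ? 'match' : 'no match') . "\n";
}
```

Only the `plain` sample matches; the other three are equally valid PHP, yet the pattern misses all of them.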
And more specifically related to #2231, while before we had this (variable defined on a single line):

```php
$database = 'mysql://user:pass@localhost/database_name';
```
...we now want to have this (the value of the variable is an array, formatted to span multiple lines):

```php
$database = array(
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pass',
  'host' => 'localhost',
);
```
And although we could have the regex parse multiple lines until it finds the first occurrence of `);`, we cannot be absolutely sure that a randomly generated password does not include these characters. Consider this:

```php
$database = array(
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pa);ss',
  'host' => 'localhost',
);
```
...and even if we adjusted the regex to be "greedy", or told it that we expect the `);` to be at the end of a line (before a line break), people might still do something like this:

```php
$database = array(
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pa);ss',
  'host' => 'localhost',
); // Here's a comment where you didn't expect it.
```
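As a rough sketch of both failure modes, assuming two hypothetical patterns written only for this illustration (a lazy one that stops at the first `);`, and one that requires `);` to end a line):

```php
<?php
$code = <<<'EOT'
$database = array(
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pa);ss',
  'host' => 'localhost',
); // Here's a comment where you didn't expect it.
EOT;

// Lazy match: stops at the first ");", which sits inside the password string,
// so the captured value is cut off in the middle of the array.
preg_match('/\$database\s*=\s*(.*?\);)/s', $code, $lazy);

// Anchored match: requires ");" to be the last thing on a line. The ");" that
// really closes the array is followed by a comment, so nothing matches at all.
$anchored_found = preg_match('/\$database\s*=\s*(.*\);)$/ms', $code, $anchored);

print "Lazy capture:\n" . $lazy[1] . "\n";
print 'Anchored pattern matched: ' . ($anchored_found ? 'yes' : 'no') . "\n";
```

The lazy capture ends at `'pa);` and never reaches the `host` key, while the anchored pattern reports no match.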
Yes, perhaps we could "flatten" that multi-line variable value into a single line when parsing it, like so:

```php
$database = array('database' => 'database_name', 'username' => 'user', 'password' => 'pa);ss', 'host' => 'localhost');
```
...if we did that, then we could instruct the regex to parse everything up to the last `);` on that line, however we cannot be sure that people will not be adding comments in their code. Consider this for example:

```php
$database = array(
  // Some important note about this database connection.
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pa);ss',
  'host' => 'localhost',
);
```
...if we attempted to "flatten" the above code into a single line, then we'd end up with `$database = array( // Some important note ...`, which would break things, because everything after the `//` would become part of the comment. So we would need to strip comments out, and if we did that using regex, we'd have to make sure that we only match a `//` or `/*` that occurs outside of a string, since inside a string these might be perfectly legitimate (think `'password' => 'pa/*ss',`). Imagination is the only limit on what can happen and on how many possibilities we'd need to account for.
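A tiny sketch of that last problem, using a made-up input and a naive comment-stripping pattern that I wrote just for this illustration:

```php
<?php
// Made-up input: the first "/*" is inside a string, the second one starts a
// real comment.
$code = "\$password = 'pa/*ss'; /* A real comment. */";

// Naive removal of /* ... */ blocks, unaware of string boundaries.
$stripped = preg_replace('#/\*.*?\*/#s', '', $code);

print $stripped . "\n";
// Prints: $password = 'pa
```

The pattern treats the `/*` inside the password as the start of a comment and swallows the rest of the statement, leaving an unterminated string behind.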
Anyway, the point is that a) we would end up with some very complex (and thus easy to break) regexes, and b) even then there are so many things that can go wrong with the assumptions we are making in `backdrop_rewrite_settings()`, since people might be manually editing their settings files in unpredictable ways. So I did some research trying to find a PHP parser that we could use, and luckily I came across `token_get_all()`, which is a native PHP function that uses the Zend engine's built-in lexical scanner, so:
- no need for any additional libraries
- available since PHP v4.2.0 and all through recent versions of 5/7/8
- does most of the things we are after for free
- does it in the same manner that PHP would natively parse the code
I started experimenting with it, and it looks very promising!
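To give an idea of the approach, here is a minimal sketch, not a proposed implementation. The helper name `example_find_statement()` is hypothetical, and it only looks for the first occurrence of the variable and collects tokens up to the `;` that ends the statement. Note that `token_get_all()` only tokenizes as PHP what follows an opening `<?php` tag, so the sample input includes one.

```php
<?php
/**
 * Hypothetical helper (not part of Backdrop): returns the source text of the
 * statement that starts at the first occurrence of the given variable.
 */
function example_find_statement($code, $variable) {
  $statement = '';
  $depth = 0;
  $capturing = FALSE;

  foreach (token_get_all($code) as $token) {
    // Tokens are either array(id, text, line) or a one-character string.
    $id = is_array($token) ? $token[0] : NULL;
    $text = is_array($token) ? $token[1] : $token;

    if (!$capturing) {
      if ($id !== T_VARIABLE || $text !== '$' . $variable) {
        continue;
      }
      $capturing = TRUE;
    }
    $statement .= $text;

    // Only bare punctuation tokens are inspected here. A ");" inside a quoted
    // string is part of one T_CONSTANT_ENCAPSED_STRING token, and comments
    // are T_COMMENT tokens, so neither can end the statement early.
    if ($id === NULL) {
      if ($text === '(' || $text === '[') {
        $depth++;
      }
      elseif ($text === ')' || $text === ']') {
        $depth--;
      }
      elseif ($text === ';' && $depth === 0) {
        return $statement;
      }
    }
  }
  return NULL;
}

$settings = <<<'EOT'
<?php
$database = array(
  // Some important note about this database connection.
  'database' => 'database_name',
  'username' => 'user',
  'password' => 'pa);ss',
  'host' => 'localhost',
); // Here's a comment where you didn't expect it.
$other = 1;
EOT;

$statement = example_find_statement($settings, 'database');
print $statement . "\n";
```

With this input, the returned statement runs from `$database` up to the `);` that really closes the array: the `);` inside the password and the comment inside the array are kept as part of the statement, and the trailing comment is left out.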