This is not a security issue per se, but an exploration of several ideas to further secure Backdrop core. I will base these ideas on the Fingerprint Web Application Framework guide from The OWASP® Foundation.

The OWASP® Foundation works to improve the security of software through its community-led open source software projects, hundreds of chapters worldwide, tens of thousands of members, and by hosting local and global conferences.

So here is an introductory quote from the Summary section of that guide:

Knowing the web application components that are being tested significantly helps in the testing process and will also drastically reduce the effort required during the test. These well known web applications have known HTML headers, cookies, and directory structures that can be enumerated to identify the application. Most of the web frameworks have several markers in those locations which help an attacker or tester to recognize them. This is basically what all automatic tools do, they look for a marker from a predefined location and then compare it to the database of known signatures.

So the idea is to check how Backdrop is doing in each of the locations that malicious parties, including automated bots, usually inspect to identify an application, and, where possible, to remedy the findings by obscuring any revealing data, either via a custom module or, if the Backdrop community finds it feasible, by introducing the necessary changes into core:

- HTTP headers
- Cookies
- HTML source code
- Specific files and folders
- File extensions
- Error messages

HTTP headers

There are two pieces of information here that give away the type of web application: the Generator META tag (which core also sends as an X-Generator response header) and the Expires header.

  1. There is a contributed module, Remove Generator META tag, that takes care of the first one. But how important is it really to advertise Backdrop CMS in the Generator META tag? Is there any chance this could be changed in core? (A rough sketch of this idea, together with the Expires idea below, follows after this list.)

  2. As for the second one, for years I have known that one of the easiest ways to check whether a website runs Drupal is to look at the Expires line of the HTTP headers:

Expires: Sun, 19 Nov 1978 05:00:00 GMT

Unsurprisingly (as a fork), Backdrop has inherited the date and hour when Dries Buytaert was born. I do respect the guy and know that we Backdrop users owe him as well; however, I'd rather not pass the same trace on to every Backdrop website out there. Instead, I would suggest automatically changing the Expires line in the headers to some random value during the initial installation, so that every Backdrop site sends different Expires information. If for some reason a specific date is required, then let's be inventive and change it to the date Backdrop was born, or, if Nate isn't against it, let's put his birthday there ;)
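
For illustration only, here is a minimal sketch of what both ideas could look like in a tiny custom module. The module name mask_fingerprint is invented, and the sketch assumes Backdrop keeps Drupal 7's hook_html_head_alter() and hook_init() hooks, the 'system_meta_generator' head element key, and backdrop_add_http_header(), config_get() and config_set() as renamed Drupal 7 equivalents; a change in core itself would more likely just edit the hard-coded values instead:

<?php
/**
 * @file
 * Sketch of a hypothetical mask_fingerprint module (not a working patch).
 */

/**
 * Implements hook_html_head_alter().
 *
 * Drops the <meta name="Generator"> element that core adds to every page.
 * Assumes the element still uses Drupal 7's 'system_meta_generator' key.
 */
function mask_fingerprint_html_head_alter(&$head_elements) {
  unset($head_elements['system_meta_generator']);
}

/**
 * Implements hook_init().
 *
 * Replaces the inherited "Sun, 19 Nov 1978 05:00:00 GMT" Expires value with
 * a per-site timestamp that is generated once, stored in config, and reused,
 * so every installation reports a different (but stable) date.
 */
function mask_fingerprint_init() {
  $timestamp = config_get('mask_fingerprint.settings', 'expires_timestamp');
  if (empty($timestamp)) {
    // Pick an arbitrary moment in the past, once per site.
    $timestamp = mt_rand(0, time());
    config_set('mask_fingerprint.settings', 'expires_timestamp', $timestamp);
  }
  backdrop_add_http_header('Expires', gmdate('D, d M Y H:i:s', $timestamp) . ' GMT');
}

Whether hook_init() runs early enough to override the header core sets on cached responses is exactly the kind of detail a real patch would have to sort out, which is why changing the value in core itself seems cleaner.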

Cookies

I've just checked the cookies in my browser and, unfortunately, found traces of Backdrop, e.g.:

Backdrop.tableDrag.showWeight (set for domain.tld)

How does the community find the idea of not leaving any information in cookies that hints at Backdrop?

HTML source code

If you look at the HTML code of any Backdrop website, you will also find the word Backdrop in several other places, such as:

@import url("https://www.domain.tld/core/themes/basis/css/component/backdrop-form.css?qlm1ff"); <script>window.Backdrop = {settings: {"basePath":"\/","pathPrefix":"","drupalCompatibility":true,"ajaxPageState":

<script src="https://www.domain.tld/core/misc/backdrop.js?v=1.17.4"></script>

How about renaming files such as backdrop-form.css and backdrop.js, and removing all other occurrences of Backdrop and Drupal that are output in the HTML code?
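
To make the fingerprinting angle concrete, here is a small sketch (plain PHP, no Backdrop APIs) of the kind of check the OWASP quote describes: fetch one page and look through the response for a handful of the markers mentioned above. The marker list is illustrative only, and www.domain.tld is the same placeholder used in the snippets above:

<?php
// Minimal fingerprint probe in the spirit of the OWASP quote: fetch a page
// and look for a few of the Backdrop/Drupal markers discussed above.
$url = 'https://www.domain.tld/';
$markers = array(
  'X-Generator',                    // generator response header, if present
  'Sun, 19 Nov 1978 05:00:00 GMT',  // inherited Expires date
  'core/misc/backdrop.js',          // core script path
  'backdrop-form.css',              // core stylesheet name
  'window.Backdrop',                // JavaScript settings object
);

$body = @file_get_contents($url);
if ($body === FALSE) {
  exit("Could not fetch $url\n");
}
$headers = implode("\n", $http_response_header);

foreach ($markers as $marker) {
  $found = stripos($headers . "\n" . $body, $marker) !== FALSE;
  printf("%-35s %s\n", $marker, $found ? 'FOUND' : 'not found');
}

Anything a script this simple can flag is something an automated scanner can flag too, which is the whole point of the exercise.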

Specific files and folders

Currently, core reveals all of its sensitive directories and paths by listing them in the robots.txt file:

# Directories
Disallow: /core/
Disallow: /profiles/
# Files
Disallow: /README.md
Disallow: /web.config
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
Disallow: /user/logout/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=filter/tips/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
Disallow: /?q=user/logout/

but I have always wondered how effective those directives are, since a number of bad web crawlers are known to ignore them completely. We already limit access at the webserver level (Apache, nginx, etc.) by chrooting and by assigning files and directories to users and groups with limited access. Additionally, Backdrop ships .htaccess files to restrict access to directories on Apache servers. And at the Backdrop level we never let unauthorized users reach paths like /admin, for instance.

So I wonder: how terrible is the idea of ditching the robots.txt file altogether? If you think it would hurt SEO rankings, the effect could in fact be the opposite, because without a robots.txt file search engines have free rein to crawl and index anything they find on the website.
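
As a quick sanity check of that argument, one could probe what an anonymous visitor actually receives for a few of the paths robots.txt advertises. The sketch below is plain PHP, reuses the www.domain.tld placeholder, and copies its paths from the file quoted above:

<?php
// Check what an anonymous request actually gets back for a few of the
// paths that robots.txt advertises (plain PHP, illustrative only).
$base = 'https://www.domain.tld';
$paths = array('/admin/', '/node/add/', '/user/login/', '/core/', '/README.md');

foreach ($paths as $path) {
  $context = stream_context_create(array(
    'http' => array('method' => 'HEAD', 'ignore_errors' => TRUE),
  ));
  @file_get_contents($base . $path, FALSE, $context);
  // First line of the response, e.g. "HTTP/1.1 403 Forbidden".
  $status = isset($http_response_header[0]) ? $http_response_header[0] : 'no response';
  print str_pad($path, 15) . $status . "\n";
}

If the restricted paths already come back as 403s or login redirects for anonymous visitors, listing them in robots.txt acts more as a signpost than as a protection.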

To conclude, let me quote the referenced guide again, this time from the Remediation section:

While efforts can be made to use different cookie names (through changing configs), hiding or changing file/directory paths (through rewriting or source code changes), removing known headers, etc. such efforts boil down to “security through obscurity”. System owners/admins should recognize that those efforts only slow down the most basic of adversaries. The time/effort may be better used on stakeholder awareness and solution maintenance activities.

GitHub Issue #: 4817