Hello again,

I'm looking for a way to change the URL transliteration that is applied (automatically?) when using path aliases.

For example "ü" is changed to "u", but I would prefer to have "ue".
Another example: I use a token for a field whose value contains a dot, and the dot is stripped: "b1.01" becomes "b101". I would prefer to keep the dot.

I've searched for transliteration and found that it is included in core, but it seems to me that it deals only with file names, not with path aliases.

Can somebody please give me a hint?

Thank you to all and best regards

Accepted answer

Regarding the dot (or period), I recommend having a look at the "URL alias pattern settings" page (admin/config/urls/path/patterns/settings). At the bottom of the page you will find a collapsed "Punctuation" section, where you can define whether punctuation marks get removed or replaced, and how.
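
As a rough sketch of what such per-character punctuation settings amount to, here is a standalone PHP example. It is not Backdrop's code: the function example_clean_punctuation() and the action names 'remove', 'replace' and 'keep' are assumptions made up for this illustration.

```php
<?php
// Hypothetical helper for illustration only; not part of Backdrop's API.
// $actions maps a punctuation character to 'remove', 'replace' or 'keep'
// (the action names are assumptions made for this sketch).
function example_clean_punctuation($string, array $actions, $separator = '-') {
  $map = array();
  foreach ($actions as $char => $action) {
    if ($action === 'remove') {
      // Drop the character entirely.
      $map[$char] = '';
    }
    elseif ($action === 'replace') {
      // Swap the character for the alias separator.
      $map[$char] = $separator;
    }
    // 'keep' adds nothing to the map, so the character passes through.
  }
  // Nothing to translate means the string is returned unchanged.
  return $map ? strtr($string, $map) : $string;
}

echo example_clean_punctuation('b1.01', array('.' => 'remove')), "\n";
echo example_clean_punctuation('b1.01', array('.' => 'keep')), "\n";
```

With the dot set to 'remove', "b1.01" comes out as "b101", which matches the behaviour described in the question; with 'keep' the dot survives.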

Comments

Hi. I've never done this, and haven't spent a ton of time looking at the code, but it seems like the alias is "cleaned up" in line 212 of path_clean_string(), which calls transliteration_get(). 

If you keep digging, you'll see that this is handed to the transliteration module, which does the replacements in _transliteration_replace(). 

If you have German as your default language, ü should be automatically transliterated to ue, since that's how it's set up in the file core/includes/transliteration/de.php

I have not tried this! But my guess is that you can similarly specify a transliteration "map" file like de.php called en.php (for English) and put it in core/includes/transliteration

There may be a better way to do this. I don't like the idea of changing anything inside the core folder, as that may get deleted when you update core. 

I just tried adding a file called en.php in that location with this code for ü and ö and it works.

<?php
// Language-specific overrides, keyed by Unicode code point.
$overrides['en'] = array(
  // ö (U+00F6)
  0xF6 => 'oe',
  // ü (U+00FC)
  0xFC => 'ue',
);
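
To illustrate the effect of a code-point map like this one, here is a standalone PHP sketch. It is not how the transliteration code in core is implemented; example_apply_overrides() is a hypothetical helper, and it only knows the two code points listed in the file above.

```php
<?php
// Hypothetical helper for illustration only; not Backdrop's implementation.
function example_apply_overrides($string, array $overrides) {
  $map = array();
  foreach ($overrides as $codepoint => $replacement) {
    // Turn the numeric code point into its UTF-8 character.
    $char = html_entity_decode('&#' . $codepoint . ';', ENT_QUOTES, 'UTF-8');
    $map[$char] = $replacement;
  }
  // Replace every mapped character; everything else is left alone.
  return $map ? strtr($string, $map) : $string;
}

// Same two entries as in the en.php file above.
$overrides['en'] = array(
  0xF6 => 'oe',
  0xFC => 'ue',
);

// "\u{FC}" is ü and "\u{F6}" is ö (this escape syntax needs PHP 7 or later).
echo example_apply_overrides("M\u{FC}ller-B\u{F6}hm", $overrides['en']), "\n";
```

Run on its own, this prints "Mueller-Boehm". Characters that are not in the map (uppercase Ü, for example) pass through unchanged in this sketch, so a real override file would need an entry for each character you care about.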

I think I can help and will share my experience.

File names are not the only thing that gets transliterated; URLs are definitely transliterated too. However, the transliteration language must be enabled as a language in the settings, and content translation must be enabled both as a module and at the content type level.

A few years ago, I needed to correct the way Bulgarian is transliterated into the English alphabet by the universal transliteration that is built into the system by default.

I found I could do this by editing the bg.php file located in core/includes/transliteration

bg.php is the file for Bulgarian. For your language you need to find out which file it is, and create it if it doesn't exist. The file name is the language code, which follows an international standard that I can't name offhand, but I'm sure other community members can.

In fact, I did this back in Drupal 7 as well. Backdrop had a few specifics of its own, but through trial and error I learned how to transliterate Bulgarian correctly, and many Internet searches helped me find the codes for the individual characters of the Bulgarian alphabet.

After building a properly working bg.php file, for some time I simply replaced the core file with it after every core update.

Some time later, together with the community, I managed to get the correct transliteration built into core. Traces of that remain here:

https://github.com/backdrop/backdrop-issues/issues/1604#issuecomment-703...

https://github.com/backdrop/backdrop-issues/issues/1604#issuecomment-711...

Hello argiepiano and amilenkov,

thank you very much for your hints!

For transliteration I copied de.php and renamed it to en.php (because I don't have German enabled).

For the dot, I uncommented line 216 in core/modules/path/path.inc

Obviously, modifying core is not best practice, but for the moment I'm happy!

Thank you and best regards

OMG... don't know why I missed that... thank you so much, Olafski!