As a project to get to know Backdrop, I want to convert the existing WordPress site of the local environmental organisation that I'm involved with into a Backdrop site.

As a side note: I used Lando to spin up a Backdrop site as well as a WordPress site on my local desktop computer. I learned about Lando during the recent Backdrop Live event and it seems a very useful piece of software! I plan to write some documentation on using Lando for developing new Backdrop sites and I will probably start a separate Forum topic to get some feedback on that.

The WordPress site is relatively simple: it contains 118 posts, 20 pages and 460 media files. It was pretty straightforward to export the full content of the site, and I saved the resulting WXR file on my local hard drive. On the blank Backdrop site I installed the Wordpress Import module and tried to import the WXR file via Content -> Wordpress Import. The file is uploaded to the Backdrop site (into the directory /files/wordpress), but the import then fails with the following error:

This file does not appear to be a valid WXR file. The file is either corrupted or invalid XML. In some versions of WordPress, the export function can produce malformed XML. Please see README.txt (included in the module archive) for further guidance.

I then tried to import the same WXR file into the blank WordPress site that I created via Lando, and there it was imported flawlessly. So there's obviously something wrong with the module. I've reported the issue on GitHub.

Comments

indigoxela

Hi zilvervos,

I wonder what a "second opinion" on this XML file would report, for instance from a command line tool like xmllint.

There have been some very old issues (D6) regarding either the atom field or some UTF-8 characters in the file.

Possibly WordPress itself is very permissive regarding XML when importing, but the Backdrop module is pickier?
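
If xmllint is not at hand, a similar "second opinion" can be sketched in Python with the standard library's expat parser. This is only a rough check, not what the Backdrop module or WordPress does internally; the function name first_xml_error is made up for this illustration, and unlike xmllint it stops at the first error it finds.

```python
import xml.parsers.expat


def first_xml_error(data: bytes):
    """Return (line, column, message) for the first well-formedness
    error in `data`, or None if the document parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # True marks this as the final (and only) chunk of the document.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


# Small in-memory samples; for a real export, read the WXR file
# in binary mode and pass its contents instead.
print(first_xml_error(b"<rss><item>ok</item></rss>"))
print(first_xml_error(b"<rss><item>bad \x03 char</item></rss>"))
```

The first call should report no error, while the second trips over the control character, which is the same kind of problem as an "invalid Char value 3" message.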

Hi Indigoxela,

Thanks for your thoughts on this.

xmllint reports ten errors:

  1. CData section not finished
  2. PCDATA invalid Char value 3
  3. Opening and ending tag mismatch
  4. Sequence ']]>' not allowed in content
  5. Opening and ending tag mismatch
  6. PCDATA invalid Char value 3
  7. Sequence ']]>' not allowed in content
  8. Opening and ending tag mismatch
  9. Opening and ending tag mismatch
  10. Extra content at the end of the document

Should I try to repair the issues manually on the basis of the xmllint output and then retry importing?

indigoxela

Oops, ten errors seem a lot to me...

Should I try to repair the issues manually on the basis of the xmllint output...

I'd say, it's worth a try. ;-)

It would also be interesting to know whether the Backdrop module could fix or work around these XML errors, though possibly not.

I will start by trying the first option (cleaning up the WXR file). Judging from the activity on the GitHub page for the Wordpress Import module, it looks like it's not actively maintained... The code looks pretty readable, so if cleaning up the file doesn't do the job I'll dive into that a bit more (even though I'm not a PHP programmer).
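
As a starting point for the clean-up, here is a minimal Python sketch that removes characters XML 1.0 does not allow, such as the control character behind the two "invalid Char value 3" errors. The function names and file paths are hypothetical, and it only addresses those two errors; whether the CDATA and tag mismatch errors disappear along with them is something to re-check with xmllint afterwards.

```python
import re

# Everything outside the XML 1.0 "Char" production: tab, newline,
# carriage return, and the ranges U+0020-U+D7FF, U+E000-U+FFFD and
# U+10000-U+10FFFF are allowed; all other code points are not.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return `text` without characters that XML 1.0 forbids."""
    return _ILLEGAL_XML_CHARS.sub("", text)


def clean_wxr(src_path: str, dst_path: str) -> int:
    """Write a cleaned copy of the file and return how many
    characters were removed. Assumes the export is UTF-8."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", errors="replace", newline="") as src:
        text = src.read()
    cleaned = strip_illegal_xml_chars(text)
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(cleaned)
    return len(text) - len(cleaned)


# Example on an in-memory string containing the control character 0x03.
print(repr(strip_illegal_xml_chars("before\x03after")))
```

Writing to a separate output file keeps the original export intact, so the cleaned copy can be compared against it before retrying the import.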