import an xml into phpmyadmin

Given that WordPress doesn’t let you import a local file (which is quite unfair, in my opinion), you can turn a wp site into an xml file and then import it into your local database via phpmyadmin.

But you have to format the xml carefully: look at how phpmyadmin exports an xml file and format your xml according to that model.
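To see the model phpmyadmin uses, it can help to print the skeleton of one of its exported files. The following is a minimal sketch: the `outline` helper and the `sample` string are made up for illustration, and the sample is not a real phpmyadmin export, so run the function on your own exported file to see the actual element names.

```python
import xml.etree.ElementTree as ET


def outline(xml_text):
    """Return one indented line per distinct element path, in document order."""
    root = ET.fromstring(xml_text)
    lines, seen = [], set()

    def walk(elem, depth, path):
        path = path + "/" + elem.tag
        key = (path, tuple(sorted(elem.attrib)))
        if key not in seen:  # report each kind of element only once
            seen.add(key)
            attrs = "".join(f" {name}=..." for name in sorted(elem.attrib))
            lines.append("  " * depth + f"<{elem.tag}{attrs}>")
        for child in elem:
            walk(child, depth + 1, path)

    walk(root, 0, "")
    return lines


# Hypothetical stand-in for an exported file; replace it with the text
# of a file you exported from phpmyadmin yourself.
sample = (
    '<export><database name="demo">'
    '<table name="posts"><column name="id">1</column>'
    '<column name="title">Hello</column></table>'
    '<table name="posts"><column name="id">2</column>'
    '<column name="title">World</column></table>'
    '</database></export>'
)

for line in outline(sample):
    print(line)
```

Running the same function on both the phpmyadmin export and your own xml makes the structural differences easy to spot.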

regex replace html tags

When exporting an odt file to epub, LibreOffice can make many mistakes, such as marking a second-level title not with <h2> but with <p class="para0">. To fix this error, you can use a regex find and replace, in this way:

find: <p class="para0">(.*?)</p>
replace: <h2>\1</h2>

and so on for similar cases.
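The same find and replace can be scripted. This is a minimal sketch in Python, assuming the exported xhtml is available as a string; the function name `promote_headings` and the sample text are made up for illustration.

```python
import re

# Same pattern as the find/replace pair above; (.*?) captures the title text.
PATTERN = re.compile(r'<p class="para0">(.*?)</p>')


def promote_headings(html):
    """Turn every <p class="para0">...</p> into <h2>...</h2>."""
    return PATTERN.sub(r"<h2>\1</h2>", html)


sample = '<p class="para0">Chapter one</p>\n<p>Body text.</p>'
fixed = promote_headings(sample)
print(fixed)
```

Ordinary paragraphs are left alone, because the pattern only matches the para0 class. Note that `.` does not match a line break by default, so a title split over two lines would need the `re.DOTALL` flag.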

regex “whatever”

If you want to match “whatever” (a word or a character), regardless of its length, you can simply use

(.*?)

For example, if you want to delete everything between <span> and </span> (tags included), as in the following line

many words <span>many other words here</span> other words

you can use

find: <span>(.*?)</span>
replace: (leave empty)

The result will be:

many words other words
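Here is the same deletion as a short Python sketch, using the example line from above (the variable names and the second sample string are made up). It also shows two details: removing the span leaves the spaces on both sides of it, and the `?` matters as soon as a line contains more than one span.

```python
import re

line = "many words <span>many other words here</span> other words"

# Delete the span and its content; the spaces on both sides remain.
without_span = re.sub(r"<span>(.*?)</span>", "", line)

# Collapse the leftover run of spaces into a single one.
tidy = re.sub(r" {2,}", " ", without_span)
print(tidy)

# With two spans, the lazy (.*?) stops at the first </span>,
# while the greedy (.*) runs on to the last one.
two = "a <span>x</span> b <span>y</span> c"
lazy = re.sub(r"<span>(.*?)</span>", "", two)
greedy = re.sub(r"<span>(.*)</span>", "", two)
print(repr(lazy), repr(greedy))
```

In the second case the greedy version also swallows the " b " between the two spans, which is usually not what you want.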

automatic crop pdf margins

I found this excellent tool: PdfCropMargins, a very light app (for both Linux and Windows) that automatically crops the white margins of a pdf (top, bottom, left, right), even when they are very irregular (different from page to page). It does so with great accuracy and without increasing the pdf size.

You can use the gui by starting it from the command line, for example: pdf-crop-margins -gui "original-pdf.pdf" -o "target-pdf.pdf".

save only some pdf pages from a pdf

You can use Okular, of course: print to file -> the pages you want. But this way you get an image pdf. If you want to keep a text pdf, you can use a program like PdfArranger, which lets you save only the pages you want from a whole pdf, keeping them searchable (text).

quick type special characters

On Linux you can use the Compose key, setting it (in System settings) to, for example, RightCtrl (the right Ctrl key). On an Italian keyboard RightCtrl is a better choice than AltGr, because AltGr is needed for characters like '[', ']', '@' or '#', which are otherwise inaccessible.

This way, when you type 1) RightCtrl, 2) then ^, 3) then o, you get ô. You don’t need to press the keys simultaneously.

To sum up, the main symbols:

  • RightCtrl+^+o = ô
  • RightCtrl+"+o = ö
  • RightCtrl+'+o = ó

re-ocr a pdf with Adobe

You have to 1) save the old searchable pdf as tiff images (one per page), 2) ocr the tiff images into searchable pdfs, 3) combine the new pdfs into a single pdf.

problems with phpmyadmin

Sometimes phpmyadmin (/mysql) doesn’t let you do what it should, such as changing the encoding of a column (or of a table or of a database), or changing the engine of tables.

After many failed attempts via sql queries, I found that the easiest solution is to

  • export the database
  • make the changes you want in a text editor, such as Kate, e.g. replacing the old encoding with the new one
  • import the (modified) database (after deleting/renaming the old one)
  • done!
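The editing step can also be scripted instead of done by hand. The following is only a sketch under some assumptions: the dump is plain sql text, it writes keywords in upper case (CHARSET=, CHARACTER SET, COLLATE, ENGINE=), and the function names, the sample dump and the target values (utf8mb4, utf8mb4_unicode_ci, InnoDB) are chosen for illustration. Check the result against your own export before importing it.

```python
import re


def switch_encoding(dump_sql, old_charset, new_charset, new_collation):
    """Replace the old charset and its collations in an sql dump."""
    old = re.escape(old_charset)
    # "CHARSET=latin1" or "CHARACTER SET latin1"
    dump_sql = re.sub(
        rf"\b(CHARSET\s*=\s*|CHARACTER SET\s+){old}\b",
        lambda m: m.group(1) + new_charset,
        dump_sql,
    )
    # "COLLATE=latin1_swedish_ci" or "COLLATE latin1_swedish_ci"
    dump_sql = re.sub(
        rf"\b(COLLATE\s*=?\s*){old}_\w+",
        lambda m: m.group(1) + new_collation,
        dump_sql,
    )
    return dump_sql


def switch_engine(dump_sql, old_engine, new_engine):
    """Replace the storage engine named in the table definitions."""
    return re.sub(
        rf"\bENGINE={re.escape(old_engine)}\b",
        lambda m: "ENGINE=" + new_engine,
        dump_sql,
    )


# Made-up fragment standing in for the exported file.
dump = (
    "CREATE TABLE `notes` (\n"
    "  `body` text CHARACTER SET latin1 COLLATE latin1_swedish_ci\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci;\n"
)

converted = switch_encoding(dump, "latin1", "utf8mb4", "utf8mb4_unicode_ci")
converted = switch_engine(converted, "MyISAM", "InnoDB")
print(converted)
```

Replacing the collation together with the charset avoids a mismatch between the two on import. For a real dump, read the exported file into a string, run the functions, and write the result to a new file, keeping the original export as a backup.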