Prayer timetable parser for Helsinki (2011)

Submitted by msameer on Sat, 08/01/2011 - 10:12pm

One of the local mosques here used to publish the prayer timetable for the whole year in HTML format. That was fine for me because I could add a bookmark for the current month to my N900 desktop.

In 2011, some brilliant person decided to publish them only in PDF. The idea itself is not that bad, I just lost the ability to add the bookmark to the desktop. Combine that with the not-so-great PDF reader on the N900 and my dislike for PDF files in general, and you will understand why I was really annoyed.

I decided to try to convert the PDF to HTML. pdftohtml is a great tool, but the HTML output was horrible and almost impossible to clean up properly. Each cell was represented by an HTML div that had an absolute position!

In the end, I decided to write a very crude parser that can parse the XML generated by pdftohtml and dump the data to a set of HTML files that I can use and bookmark.
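To give an idea of the approach, here is a minimal sketch in Python. It is not the C++/Qt parser mentioned in this post. It assumes the XML was produced with pdftohtml's `-xml` option, where each page is a `<page>` element containing `<text>` elements with `top` and `left` attributes. The function names, the 2-unit row tolerance, and the sample data are invented for illustration, and the real timetable layout may well need more care.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample in the shape of pdftohtml -xml output; not real prayer times.
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1262" width="892">
<text top="100" left="200" width="40" height="12" font="0">06:00</text>
<text top="100" left="50" width="20" height="12" font="0"><b>1</b></text>
<text top="121" left="50" width="20" height="12" font="0">2</text>
<text top="120" left="200" width="40" height="12" font="0">06:02</text>
</page>
</pdf2xml>"""


def rows_from_pdf2xml(xml_text, tolerance=2):
    """Return one list of rows per page; each row is a list of cell strings.

    Cells whose `top` lies within `tolerance` units of the first cell of the
    current row are treated as the same row (an assumption, not a rule of
    the format).
    """
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter("page"):
        cells = []
        for text in page.iter("text"):
            # itertext() also picks up text nested in <b>, <i> or <a>.
            content = "".join(text.itertext()).strip()
            if content:
                top = int(float(text.get("top")))
                left = int(float(text.get("left")))
                cells.append((top, left, content))
        cells.sort()  # top to bottom, then left to right
        rows = []
        row_top = None
        for top, left, content in cells:
            if row_top is None or top - row_top > tolerance:
                rows.append([])
                row_top = top
            rows[-1].append((left, content))
        # Re-sort each row by its left coordinate and keep only the text.
        pages.append([[c for _, c in sorted(row)] for row in rows])
    return pages


def page_to_html(rows, title):
    """Render the rows of one page as a plain HTML table."""
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><table>\n{body}\n</table></body></html>"
    )


if __name__ == "__main__":
    for number, rows in enumerate(rows_from_pdf2xml(SAMPLE), start=1):
        print(page_to_html(rows, f"Page {number}"))
```

For a real file, something like `pdftohtml -xml timetable.pdf timetable` should produce the XML (the file names here are placeholders), and its contents can be passed to `rows_from_pdf2xml`. Grouping by `top` is what undoes the absolute positioning: cells on the same line of the table share roughly the same vertical coordinate.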

The parser is not a nice piece of code but it works. The HTML files are ugly too but they are fine for me.

The code can be obtained from my SVN repository (C++/Qt). The generated HTML files are in this tarball.

The original PDF can be downloaded from the mosque website.

Warning: I take no responsibility if the generated times are incorrect. You have been warned!
