MediaWiki and XSL-FO

At work, we’re using MediaWiki as both an internal and public wiki (we have two different ones, separated, to provide watertight bulkheads between them).

Recently, the administrator of the public wiki has been looking for ways to automatically generate PDF files. One extension in particular, Extension:Pdf Export, seemed to be useful; but on closer inspection we found out that the component it relied on, htmldoc, could only handle HTML 3.2.

Thus we were stranded. We tried PHP dompdf for a while, but it threw a fatal exception on the code output by MediaWiki, so that was not an option either.

But it seems like MediaWiki always generates XHTML-compliant output, which means that it’s possible to use an XSLT/XSL-FO processor. And Apache FOP seems to be a good choice right now; it’s Java-based, meaning that we can run it on the Unix box with no problems (we hope!).
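A quick way to check the XHTML assumption before involving FOP at all is to see whether a saved page parses as XML. The following is a minimal Python sketch, not part of the extension: is_well_formed is a hypothetical helper name, and it only tests well-formedness, which is the least an XSLT processor needs. One caveat: a plain XML parser knows nothing about named HTML entities such as &nbsp; unless the DTD is loaded, so a page using them would be rejected here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Properly nested tags parse; the second sample closes <p> before <b>.
    good = "<html><body><p>Hello <b>wiki</b></p></body></html>"
    bad = "<html><body><p>Hello <b>wiki</p></body></html>"
    print(is_well_formed(good), is_well_formed(bad))
```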

So, essentially, the way this could work would be to take the output from the PHP code described in Extension:Pdf Export above, but instead of running it through htmldoc, we run it through fop, kind of like this:

fop -xml generated.xhtml -xsl mediawiki-to-fo.xsl -pdf output.pdf

generated.xhtml is the file saved from the MediaWiki plugin, mediawiki-to-fo.xsl is a stylesheet that converts the XHTML into suitable XSL-FO definitions, and output.pdf is the generated result. FOP turns out to be quick and expedient.
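If the call ends up being scripted rather than typed by hand, it could be wrapped roughly like this. This is only a sketch in Python under a few assumptions: the function names build_fop_command and render_pdf are made up for illustration, fop is assumed to be on the PATH, and the only flags used are the three from the command line above.

```python
import shutil
import subprocess
from pathlib import Path


def build_fop_command(xhtml_path, xsl_path, pdf_path, fop_binary="fop"):
    """Return the argument list for the fop call shown above."""
    return [
        fop_binary,
        "-xml", str(xhtml_path),
        "-xsl", str(xsl_path),
        "-pdf", str(pdf_path),
    ]


def render_pdf(xhtml_path, xsl_path, pdf_path, fop_binary="fop"):
    """Run fop and return the path of the PDF it was asked to write."""
    # Assumption: fop is installed and reachable through PATH.
    if shutil.which(fop_binary) is None:
        raise FileNotFoundError(f"{fop_binary!r} was not found on PATH")
    command = build_fop_command(xhtml_path, xsl_path, pdf_path, fop_binary)
    # check=True raises CalledProcessError if fop exits with an error code.
    subprocess.run(command, check=True)
    return Path(pdf_path)


if __name__ == "__main__":
    print(build_fop_command("generated.xhtml", "mediawiki-to-fo.xsl", "output.pdf"))
```

Passing the arguments as a list, instead of one shell string, means file names with spaces do not need any quoting.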

Of course, this leaves us with writing the .xsl file, which is going to take some time. An excellent start is available at IBM developerWorks.

All that remains now is putting the pieces together and we should have a simple, efficient plugin that generates beautiful PDF documents. If everything works as expected, that is … ;)

Essential PHP Security

Good book to read for anyone working with PHP.

It started off easy enough, you know, don’t trust input, always escape output, stuff like that. “Yeah, yeah”, I thought, “I learned that in kindergarten”.

But with each chapter, my attitude kind of changed from a “yada, yada” to “hmmmm” to “oops”.

The author doesn’t quite go into great depths of PHP programming, and there are some answers that are somewhat simplified; but with each merciless chapter, he brings up exploit after exploit, asking “did you think about this? and this? you thought about that, didn’t you?”

And you’re left wondering about that particular piece of code you wrote a few months back, because in the back of your head, you know that you didn’t think about that.

I have now resolved never to trust a programmer who hasn’t read at least a few good books about computer security (and can tell me which books he read).

Neither do I trust myself.

Windows Scripting

Over the past few days I’ve been trying to build a number of Windows server administration scripts – you know, for adding users to the Active Directory, creating new IIS websites, doing various stuff – and I get the distinct feeling that this is not something you normally do.

Whereas Linux has tools that do the job pretty much for you – everything is script-based anyhow, there is no point-and-click GUI – Windows requires you to dig into scripting, PowerShell, WSH objects, or any similar methodology to try to automate the task.

It’s almost like no one has ever done this before. And the few who have managed to penetrate the APIs enough to do so have gone on to become employed at the Microsoft Scripting department as technology gurus.

There is a short clip from Monty Python that summarizes my feelings about this. It’s available here, in MP3 format.

File Access Rights

This seems to be the state-of-the-art way to change access rights on a file in PowerShell:

# Access right to match (Read), with no inheritance or propagation
$colRights = [System.Security.AccessControl.FileSystemRights]"Read"
$InheritanceFlag = [System.Security.AccessControl.InheritanceFlags]::None
$PropagationFlag = [System.Security.AccessControl.PropagationFlags]::None
$objType = [System.Security.AccessControl.AccessControlType]::Allow

# The account whose rule is affected
$objUser = New-Object System.Security.Principal.NTAccount("domain\user")

# Build the access rule (ACE) that describes it
$objACE = New-Object System.Security.AccessControl.FileSystemAccessRule `
    ($objUser, $colRights, $InheritanceFlag, $PropagationFlag, $objType)

# Read the ACL, remove every Allow rule for that account, and write it back
$objACL = Get-ACL "c:\testfile"
$objACL.RemoveAccessRuleAll($objACE)
Set-ACL "c:\testfile" $objACL

Funny… The unix version would be something like

chmod 600 /testfile

I… uh… I don’t know. I’m tired. I want to go home now.
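For the record, the chmod one-liner is just as short when called from a script. Here is a small Python sketch of it; restrict_to_owner is a name I made up, and it assumes a Unix-like system. The two snippets are not exact equivalents either: the PowerShell one removes the Allow rules for one account, while chmod 600 sets the whole mode to owner read/write.

```python
import os
import stat


def restrict_to_owner(path):
    """Set the file mode to 600 (owner read/write only) and return the new mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    # S_IMODE strips the file-type bits and leaves only the permission bits.
    return stat.S_IMODE(os.stat(path).st_mode)
```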

I Wonder If I Actually Do Anything Useful?

I’ve been thinking. That’s never a good sign. :)

I build a lot of software systems. If I’m not designing a web platform for support cases, I’m building systems for PowerPoint presentations, or custom PHP frameworks, or… well, you name it.

But wherever I look, I’m almost invariably being replaced. The things I implement are being gradually replaced with standardized systems – which, I admit, is not a bad way to go. There is substantial power in a well-established platform with support behind it.

And yet, I cannot keep from dreaming. I see things… better ways of doing things, better designs, better user interfaces. I see ways of improving things. It’s like there is resident within me a power to dream; a power that is relentless, that causes me to skip five steps ahead when others just see two. The question that burns within me is the constant “why not?” that forces me to challenge everything, including myself, and strive for an elegance in software design that I otherwise seem to see so little of. Not that I’m bragging, I just… dream.

But so little of what I do can be maintained. It’s like I’m destined to be an oddball that pioneers ahead, but is always replaced by a standard product after a few years. And it leads me to think.

Do I actually do anything useful? Does it matter what I do? So much of my heart and passion goes into things that no one will ever see. Am I, in fact, a roadblock to other people? Do I paint myself and other people into corners which they will then have to get out of?

Maybe I should stop and just use normal off-the-shelf tools. Use Drupal or WordPress instead of building my own system. And yet, it’s difficult to bring myself to do so because it’s so ugly and normal and conventional and limiting. It feels like I’m being relegated to writing instruction manuals for blenders, instead of some new novel I’m dreaming of…

Is there a place for dreamers in our society? Where do I really fit in?

Will anything I do ever last?

“I don’t really want an answer. I just want to send this cosmic question out into the void. So good night, dear void.” (You’ve Got Mail)

Cooling Down

We started getting spurious temperature fluctuations in our server room a while ago. Each night, the temperature would rise above 30 °C for about half an hour, and then go down again.

Over the past couple of days, it grew increasingly erratic and threatened the entire server room functionality — outages grew more and more frequent and finally we had to put another A/C unit in there, with open doors (because of the vent pipe) to try to contain it.

With something like 15-20 servers, it’s incredible how much heat a server room can generate.

The problem is that once temperature starts rising, some servers start malfunctioning. We had one server stop responding to TCP/IP, so it had to be restarted. And hard disks start crashing soon if temperature doesn’t go down again. One colleague of mine had backup tapes literally melt inside the backup unit once, he told me. So it’s been with a mild state of panic that we’ve watched the temperature go up and down this past week.

Of course, we tried to make repairs and isolate the error. Is it a malfunctioning control board inside the unit? The compressor? Is current reaching the contactor in the external unit? … Digging into a 380V external appliance to measure voltages was a first for me :)

In the end, it turned out to be a current protector that was faulty. It’s designed to shut off if the current grows too high, to protect the compressor, and it was misbehaving. It was replaced, and voilà, everything ran perfectly again.

It’s amazing how such a small little cheap thing can affect so much.

A Problem with the Windows Model

My services.exe process was consuming a lot of CPU resources this morning, for no apparent reason.

When I examined the process, I saw that it had created three different threads that were equally busy working on some unspecified task. No I/O performed, no network activity, just a lot of CPU resources on three different threads.

In Linux, this would have been less of a problem, because in Linux, each process is designed to perform one single job, and do it well. When task A needs to be executed, /usr/bin/taska is started. When task B needs to be carried out a little later, /usr/bin/taskb is run. But in Windows, one process can host a vast number of jobs, and frequently a number of different roles are included in one single application. Of course, since process creation is an expensive operation in Windows, it makes sense to use threads instead. But you lose out on clarity.

Well, I don’t know why my computer misbehaves like this. I guess I’ll go over it with my antivirus utility and then reboot, to see if the problem goes away. That’s about all I can do.

Old Pictures

Going through the archives, I found a couple of oddities that I think I might save…

The above is a design I made for a remoting layer in our switchboard application. It provided the means to transparently switch search facilities from a local, threaded approach to a remoting approach, each using the same database backend. (Having no threads means that the executable is less difficult to code.) Blue is the front-end GUI, green the application logic, and pink server-side stuff. I love that I painted a little shining sun in it.

(It’s not my real handwriting, it’s my hand-on-mouse-in-PaintShopPro-writing.)

In stark contrast, I guess, the picture I won the “aggressive” competition in our little photo contest with. It turned out so beautifully. The black and white effect is really nice – I probably used conventional C-41 black and white film. The “blood” is actually ketjap manis, a soy-like liquid used in cooking. At first I placed drops of it on an old linen sheet – which didn’t quite produce the effect I hoped for. But when I suddenly smeared it out, I saw the picture unfold in front of me.

And with this one, I won the “odd locations” competition. Nuff said. :)

I {heart} PHP

Reasons why I love PHP:

  • It’s quick.
  • You can build your own framework. (Or use someone else’s.)
  • The whole PHP distribution can be compiled into one single, distributable binary.
  • It’s got a rather sweet mix of raw power, raw utilities, caches, add-ons, and dangerous things you can do with it. If you know what you’re doing. Kind of like C, but without the hassle.
  • If you’re careful enough, you can actually develop new features directly in the production system.
  • A nice mix of classes, reflection, and syntactic sugar.
  • The same source files can be deployed anywhere – Windows, Linux, Web hotel A, Web hotel B, localhost… no additional configuration necessary. Painless distribution through xcopy.
  • Small footprint.
  • No System.Web.Utilities.Page.PageHelper.Random.Help.ImLost.RandomAccessAttribute.VeryImportantSetting.
  • It doesn’t use XML and helper classes to configure a website. Just make an include file. Or use parse_ini_file.
  • …and you can edit the darn thing in Notepad. (No big Visual Studio to install. No dependencies. No DLLs. Nuff said.)

Sure, there’s no strict type checking, it’s not binary, and you need one or two tricks to make it really fast. But the ease with which you set projects up, the power behind it, and the fact that you can pretty much beat it into doing anything you want – while at the same time being able to distribute it anywhere and it just works right out of the box… Yep, PHP won my heart.