Posts

Pandoc for Document writing

I recently needed to write a brief report for university, and I was about to start it in Word, when I remembered that I had recently heard about pandoc by John MacFarlane.

Pandoc is written in Haskell (a language I’m trying to learn), so I thought it would be fun and appropriate to give it a try. Happily, the Haskell environment is available on both Windows and Linux (and I’d already set it up on both), so I installed pandoc and got started.

As we all know, my editor of choice is Vim, so I fired it up and started a new document (let’s call it “test.markdown”). Initially I started with:

Brief Comments
==

Herein I describe my this document and the ...

Then I compiled this with the following pandoc command:

pandoc test.markdown -o test.pdf

Indeed this does produce a pdf! I was happy. The output looked as if it had gone through a LaTeX stage, and indeed that is exactly what happens. Beautiful; “what more could I want!” I thought to myself.

Well, it turns out I shortly wanted an actual title for the pdf; the above produces only headings. So I soon found myself reading (probably with a little bit too much enthusiasm) the documentation on pandoc’s extended markdown format.

I noted with much happiness that the LaTeX-style math I included in the document was rendered appropriately, and you can do equation referencing in the typical way (just write the appropriate LaTeX command).
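As a rough sketch of what that looks like, a numbered equation and a reference to it can be written straight into the markdown source. The file name math.markdown and the label eq:energy are made up for illustration, and this assumes PDF output, where the raw LaTeX is handed through to the LaTeX engine:

```shell
# Write a small markdown file containing a numbered LaTeX equation.
# The file name and the label are illustrative, not from the original report.
cat > math.markdown <<'EOF'
Mass and energy are related by

\begin{equation}
E = mc^2 \label{eq:energy}
\end{equation}

and equation (\ref{eq:energy}) can be cited anywhere in the text.
EOF

# Build the PDF only if pandoc is installed (a LaTeX engine is needed too).
if command -v pandoc >/dev/null 2>&1; then
    pandoc math.markdown -o math.pdf || echo "pandoc could not build the PDF" >&2
fi
```

The label and the reference are ordinary LaTeX, so anything that works in a .tex file should behave the same way here.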

I was also very happy to learn that the correct way to start a document with a title is as follows:

% Title
% Author
% 

Brief Comments
==

Herein I describe my this document and the ...

(If you leave the line below Author blank, the date is generated for you; otherwise you can simply write the appropriate date there.) I was also really overjoyed to find the section on BibTeX support and bibliography style handling via CSL. In particular, the pandoc documentation will direct you to a GitHub repository that contains an amazing number of bibliography styles.
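For the bibliography side, something along these lines should work. It is only a sketch: the file names, the citation key and the BibTeX entry itself are placeholders invented for the example, not real references.

```shell
# A one-entry BibTeX database; the entry is a made-up placeholder.
cat > refs.bib <<'EOF'
@article{placeholder2012,
  author  = {Author, A. N.},
  title   = {A Placeholder Title},
  journal = {Journal of Placeholders},
  year    = {2012}
}
EOF

# A document that cites the entry using pandoc's [@key] syntax.
cat > cited.markdown <<'EOF'
This claim needs a source [@placeholder2012].

# References
EOF

if command -v pandoc >/dev/null 2>&1; then
    # --citeproc is needed on pandoc 2.11 and later; drop it on older releases.
    # Add --csl=some-style.csl (hypothetical file name) to pick a CSL style.
    pandoc cited.markdown --citeproc --bibliography=refs.bib -o cited.html \
        || echo "pandoc could not process the citations" >&2
fi
```

Without a --csl option pandoc falls back to its default citation style, so the style file is only needed when a journal asks for something specific.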

From here I did as any good Vim user would: I tried to find a Vim plugin. I found one, called vim-pandoc. However, after a little bit of use on Windows, I noted that it was really slow in some respects (mainly because it uses inline Python in the plugin, instead of Vim script). I noticed some other problems as well, so with the project being on GitHub I decided to fork it: silky/vim-pandoc. My version will probably be in a little bit of flux over the next few days, but will hopefully stabilise shortly thereafter.

All in all, using pandoc for the small report I had to write was successful, and I do hope to try it with future documents containing maths. In particular, the markdown format also matches nicely with my other Vim-based notes (so I could convert them if I decided it was appropriate), and it’s just plain easy to read and nice to use.

Another reason I was attracted to pandoc was its ability to output to slide formats (including beamer), so I’m really excited to give that a go.
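I haven’t tried this properly yet, but going by the pandoc documentation a beamer build should look roughly like the following. The file name and slide contents are invented for the example:

```shell
# A two-slide deck; each top-level heading starts a new slide.
# File name and contents are illustrative only.
cat > slides.markdown <<'EOF'
% Pandoc for Slides
% Author

# Why pandoc

- Plain markdown source
- LaTeX math works: $e^{i\pi} + 1 = 0$

# Next steps

- Try the other slide formats
EOF

if command -v pandoc >/dev/null 2>&1; then
    # -t beamer selects the beamer writer; the PDF is built through LaTeX.
    pandoc -t beamer slides.markdown -o slides.pdf \
        || echo "pandoc could not build the slides" >&2
fi
```

The source is the same kind of markdown as the report, so notes can be turned into slides without rewriting them.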

And of course, this blog post itself was written with pandoc (output to .html)! So maybe you will consider it the next time you need to write something!

Merging PostScript (.ps) files

I recently had to merge the output of a bunch of academic papers I had written in LaTeX. Each file uses a document class from one of a range of academic journals and has its own title, abstract, bibliography, etc.

I could have made a single LaTeX file and shoehorned each individual file into it, but I knew there had to be a better (less labour-intensive) way.

Using LaTeX you can create pdf (using dvipdfm) or ps (using dvips) files. There are various ways to merge pdf files (for example with pdfsam) and ps files (for example with Ghostscript).

However, the quality of the merged file I produced was always very poor. Eventually (after much googling) I found a solution:

gswin32c.exe -dNOCACHE -dNOPAUSE -sDEVICE=pswrite -dBATCH -sOutputFile=Output.ps Input1.ps Input2.ps Input3.ps Input4.ps

The “-dNOCACHE” option preserves the quality of the output file. The output file, however, is very large.

To ensure the page numbers in the merged document are continuous, you can use the LaTeX command “\setcounter{page}{X}” in each paper, where X is the page number that paper should start on.
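As a minimal sketch of the page-counter trick, here is a stand-in for the second paper being given a starting page and converted to PostScript. The file name paper2.tex is hypothetical, and the value 13 assumes the first paper ran to 12 pages:

```shell
# paper2.tex is a stand-in for the second paper in the merged document.
# The starting page 13 assumes the first paper ended on page 12.
cat > paper2.tex <<'EOF'
\documentclass{article}
\begin{document}
\setcounter{page}{13}
Second paper, whose first page is numbered 13.
\end{document}
EOF

# Compile to DVI and convert to PostScript if the TeX tools are installed.
if command -v latex >/dev/null 2>&1 && command -v dvips >/dev/null 2>&1; then
    latex -interaction=nonstopmode paper2.tex \
        && dvips paper2.dvi -o paper2.ps \
        || echo "LaTeX build failed" >&2
fi
```

The resulting paper2.ps can then be passed to the Ghostscript merge alongside the other papers, each of which gets its own starting page in the same way.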