Bash loops are fun for batch conversion.

I’m in the middle of converting a bunch of documents to my meta-format of choice, MultiMarkdown. I was already thinking along the right lines when I created these documents: I wanted an open format that would stay accessible but still preserve formatting. About 90% of the files are rich text, so a few simple bash loops should take care of nearly all of the conversion. (Note: these are pretty crude scripts and could probably be written much better, but they work for my small data set.) First we use Apple’s textutil to convert from .rtf to .html, then pandoc to convert the .html to Markdown.

textutil loop (Mac OS X only, sorry!):

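#!/bin/sh
# Convert every .rtf file in the current folder to .html.
for FILE in *.rtf
do
  # If nothing matches, the pattern stays literal; skip it.
  [ -e "$FILE" ] || continue
  # Quote the name so files with spaces are handled one at a time.
  textutil -convert html "$FILE"
done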
              

pandoc loop:

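#!/bin/sh
# Convert every .html file under the current folder to MultiMarkdown.
find . -name "*.html" | while IFS= read -r FILE
do
  # Write the output next to the input, swapping the extension for .mmd.
  html2markdown "$FILE" -o "${FILE%.*}.mmd"
done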
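
The `${FILE%.*}` bit in the pandoc loop is ordinary POSIX parameter expansion: it removes the shortest trailing match of `.*`, which here is the file extension. Below is a small sketch of that one step on its own. The function name `mmd_name` is something I made up for illustration; it is not part of the scripts above.

```shell
#!/bin/sh
# mmd_name is a hypothetical helper, not part of the original scripts.
# It prints the .mmd output path for a given input path.
mmd_name() {
  # ${1%.*} drops the shortest suffix matching ".*", i.e. only the last extension.
  printf '%s\n' "${1%.*}.mmd"
}

mmd_name "./notes/My Resume.html"   # prints ./notes/My Resume.mmd
mmd_name "./archive.tar.html"       # prints ./archive.tar.mmd
```

Because only the last extension is removed, a name with several dots keeps everything before the final one.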
              

To make these scripts work, save each one in its own plain text file with a .sh extension and put that file in the same folder as the files you want converted. Then open a terminal (or command prompt or command line… whatever. Open it.), cd to that folder, and run the script by name, for example:

sh HTMLtoMarkdown.sh

Magic.
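
If you would rather not keep two scripts around, the two steps can probably be folded into one pass. This is only a sketch under a few assumptions of mine: it calls pandoc directly with its `markdown_mmd` writer instead of the html2markdown wrapper, it relies on textutil writing the .html file next to the original, and the function name `rtf_to_mmd` is made up for this example.

```shell
#!/bin/sh
# rtf_to_mmd is a hypothetical name; it converts every .rtf file in one folder.
# Assumes textutil (Mac OS X) and pandoc are both on the PATH.
rtf_to_mmd() {
  dir="${1:-.}"
  for rtf in "$dir"/*.rtf
  do
    # If nothing matches, the pattern stays literal; skip it.
    [ -e "$rtf" ] || continue
    base="${rtf%.*}"
    # Step 1: .rtf -> .html (written next to the original file).
    textutil -convert html "$rtf" || continue
    # Step 2: .html -> MultiMarkdown, using pandoc's markdown_mmd writer.
    pandoc -f html -t markdown_mmd -o "$base.mmd" "$base.html"
  done
}

# Convert the folder given as the first argument, or the current folder.
rtf_to_mmd "${1:-.}"
```

Unlike the find loop, this version only looks at one folder and does not descend into subfolders. The intermediate .html files are left in place, the same as with the two separate scripts.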
