Building a personal website with org-mode

2019-09-06

It has been a while since I blogged for the last time. My blog was old, and I have been avoiding the redesign of my personal site for a very long time now.

I don't plan to port old blog posts, so I'll keep them as they are, a raw and unverified port from a Wordpress installation to Jekyll.

I wanted something simple, where I could write my thoughts as raw as possible. Luckily, I have been doing so for a long time using emacs's org-mode for everything personal and work related. It has made a difference in the way I organize, and thought it would be a good idea to apply the very same principles to my website and blog. So here we are.

I had some fun along the way with emacs lisp and I'll explain a little what I think are the most important points of my setup.

Structure

Notes

Notes are pages meant to be alive and changing over time. Annotations for my future self mainly; posted in the public just in case they help someone else. Undated.

Notes live under content/notes.

Blog

Exactly. Blog posts are what you think they are. Once published, not usually updated over time. Dated.

Blog entries live under content/blog/<year>/<month>/post-name.org.

Front page

The front page is really simple: it contains a list of all the notes and blog entries.

It is dynamically generated from the structure of the project.

RSS

RSS is an XML feed that only applies to blog posts.

It is dynamically generated as well.

Arbitrary static pages

The contact page is a good example of an arbitrary static page.

These pages are not included in the list of blog posts or notes.

Generation

The generation phase is triggered by calling to emacs in batch mode, like so:

emacs -L $(PWD) --batch --script init.el

init.el is a simple initialization script that will load all elisp files inside config. It will then call to (org-publish-all t).

org-publish project alist

Defined in config/default.el along with some customization functions.

website-content

Published using org-html-publish-to-html, will generate the HTML files out of org files.

Sitemap generation: `content/index.org`

Taking advantage of the sitemap functionality, content/index.org and the content/blog/feed.org files will be automatically generated. The sitemap-function of the website-content project is set to ereslibre/sitemap. It looks like:

(defun ereslibre/sitemap (title list)
  (progn (ereslibre/generate-org-rss-feed list)
         (format "#+options: title:nil\n
                  #+begin_export html\n
                  <div class=\"content container front-container\">
                    <div class=\"side-by-side\">
                      <h1 class=\"post-title\">Notes</h1><hr/>
                      %s
                    </div>
                    <div class=\"side-by-side\">
                      <h1 class=\"post-title\">Blog</h1><hr/>
                      %s
                    </div>
                  </div>\n
                  #+end_export"
                 (ereslibre/all-entries 'notes list)
                 (ereslibre/all-entries 'blog list))))

Writing raw HTML here helps in getting the desired HTML layout easier. This will be automatically generated anyway, and I don't have to fiddle with this anymore – I hope!

Take the first progn call to ereslibre/generate-org-rss-feed into account, we'll check what it does later.

Each sitemap entry is formatted using ereslibre/sitemap-format-entry, creating the link to the given entry:

(defun ereslibre/sitemap-format-entry (entry style project)
  (let ((date (ereslibre/org-publish-find-explicit-date entry project)))
    `(:content ,(format "<div class=\"post-preview\">
                            <h2 class=\"post-title\">%s</h2>
                            <span class=\"post-date\">%s</span>
                         </div>"
                        (org-export-string-as (format "[[file:%s][%s]]" entry (org-publish-find-title entry project)) 'html t)
                        (if date
                            (format-time-string "%Y-%m-%d" date)
                          "&nbsp;"))
      :entry ,entry)))

Feed RSS generation: `content/blog/feed.org`

ox-rss expects a single file with all blog posts, but this is not how my set up works, so the feed.org file will be automatically generated – this has some caveats, though.

The contents ox-rss expects are of the form:

#+title: ereslibre.es

* [[file:year/month/some-post.org][Some post]]
 :PROPERTIES:
 :RSS_PERMALINK: blog/year/month/some-post.html
 :PUBDATE:  2019-09-06
 :ID:       0b382fe7-f943-4997-8568-28179abe8f23
 :END:
Blog post contents, or description.

* [[file:year/month/some-other-post.org][Some other post]]
 :PROPERTIES:
 :RSS_PERMALINK: blog/year/month/some-other-post.html
 :PUBDATE:  2019-09-06
 :ID:       d0d46dcf-ae23-42e5-b279-b17956b3d82a
 :END:
Blog post contents, or description.

The call on ereslibre/sitemap to ereslibre/generate-org-rss-feed is what generates the RSS feed contents. content/blog/feed.org file will be created, listing the contents of all blog posts.

(defun ereslibre/generate-org-rss-feed (list)
  (let ((blog-entries (seq-filter (apply-partially #'ereslibre/is-entry-of-type 'blog) (cdr list))))
    (let* ((rss-contents (mapconcat #'ereslibre/rss-entry blog-entries "\n\n"))
           (full-rss-contents (concat "#+title: ereslibre.es\n\n" rss-contents)))
      (write-region full-rss-contents nil "./content/blog/feed.org"))))

The ereslibre/rss-entry function looks like:

(defun ereslibre/rss-entry (entry)
  (let* ((entry (plist-get (car entry) :entry))
         (title (org-publish-find-title entry (ereslibre/website-project)))
         (date (org-publish-find-date entry (ereslibre/website-project)))
         (link (concat (file-name-sans-extension entry) ".html"))
         (source-file (concat (file-name-as-directory "content") entry))
         (source-file-dir (file-name-directory source-file))
         (home-url-prefix (plist-get (cdr (ereslibre/rss-project)) :html-link-home))
         (contents (with-temp-buffer
                     (org-mode)
                     (insert-file-contents source-file)
                     (beginning-of-buffer)
                     ;; demote all headlines
                     (save-excursion
                       (while (re-search-forward "^\\*" nil t)
                         (replace-match "**")))
                     ;; remove certain attributes from inserted org file
                     (save-excursion
                       (while (re-search-forward "^#\\+\\(title\\|date\\).*" nil t)
                         (replace-match "")))
                     ;; transcode embedded links to files -- e.g. expand relative paths
                     (save-excursion
                       (while (re-search-forward "\\[file:\\([^]]+\\)" nil t)
                         (let* ((match (match-string 1))
                                (element (save-match-data (org-element-at-point))))
                           (when (not (or (eq (org-element-type element) 'example-block)
                                          (eq (org-element-type element) 'src-block)))
                             (replace-match
                              (concat "[" home-url-prefix
                                      (file-name-sans-extension
                                       (file-relative-name
                                        (expand-file-name match source-file-dir)
                                        "content"))
                                      ".html"))))))
                     (buffer-string))))
    (with-temp-buffer
      (insert (format "* [[file:%s][%s]]\n" (ereslibre/path-relative-from-to-relative-to entry "content" "content/blog") title))
      (org-set-property "RSS_PERMALINK" link)
      (org-set-property "PUBDATE" (format-time-string "%Y-%m-%d" date))
      (insert contents)
      (buffer-string))))

It is worth dissecting what tasks it performs:

Insert the contents of the target org file inside a temporary buffer
Demote all headlines from the inserted content
Transcode embedded links to other relative files. Since we are copying and pasting the contents of a file that is in other subdirectory (content/blog/year/month), all its relative references to other files will be broken when writing the content/blog/feed.org file
- ereslibre/path-relative-from-to-relative-to rewrites a relative path from the original directory, to a relative path from the target directory. This is used for the toplevel entries in feed.org and for correctness, since it's not really used when publishing – as far as I can tell
- RSS readers won't know how to handle relative links like ../../../contact.html, so all [file:some-file] occurrences will be transcoded into a [https://html-link-home/some-path/some-file], only if they are not in src or example blocks
Write the entry itself
- Insert the link to the blog post org file
- Add RSS_PERMALINK and PUBDATE org properties
- Insert the modified contents of the blog post

website-assets

Published using org-publish-attachment. This will copy all assets from assets inside public_html/assets. These are strictly template related assets.

website-content-assets

Published using org-publish-attachment. This will copy all assets from content to public_html. These are assets related to blog posts or pages themselves.

website-rss

RSS generation using the auto generated content/blog/feed.org file, that was created during the website-content publishing. It will only generate a target public_html/blog/feed.xml with a list of all the available blog posts.

Publishing

I wanted something really minimal. I migrated my whole website to Netlify and connected it to my GitHub's website repository. When I run a make publish, all contents get generated, and the Makefile tells the rest:

.ONESHELL:
publish: clean gen
        pushd public_html
        git init
        git add .
        git commit --no-gpg-sign -a -m "Publish static site"
        git remote add origin git@github.com:ereslibre/ereslibre.es
        git push -f origin master:publish
        popd

clean:
        rm -rf public_html

Contents will be pushed to a branch in that repo called publish, so Netlify will publish the website right after.

Caveats found

Some, but I will mention the most relevant ones only.

RSS with broken `<pre>` in `CDATA` sections

When creating the RSS feed.org file, ox-rss has a function that runs when the buffer has all the XML contents already written:

(defun org-rss-final-function (contents backend info)
  "Prettify the RSS output."
  (with-temp-buffer
    (xml-mode)
    (insert contents)
    (indent-region (point-min) (point-max))
    (buffer-substring-no-properties (point-min) (point-max))))

Turns out, (indent-region (point-min) (point-max)) will indent something like:

<description><![CDATA[
<pre class="example">
require (
  k8s.io/kubernetes v1.16.0-beta.1
)
</pre>
]]></description>

to something like:

<description><![CDATA[
<pre class="example">
require (
k8s.io/kubernetes v1.16.0-beta.1
)
</pre>
]]></description>

So, code examples wouldn't look that nice on RSS readers. I fixed that by defining my own final function that does not call (indent-region), after all, I don't expect anyone to read the XML directly.

ox-publish insists in adding certain elements

The global template case

Even when setting certain configurations like :html-head-include-scripts or :html-head-include-default-style to nil, I was still getting some template related elements that I could not remove with configuration settings, so I wrote my really simple org-html-template.

The `<p>` case

When creating the index.org contents, I started with the approach of using @@html:some-html@@[[file:some-org-file.org][A link]]@@html:other-html@@, so I could deliberately use org's feature of linking other files, while having control of the HTML directly to create the expected structure.

This didn't go well, as an extra <p> entity was printed at the beginning of the page, and moved the content a little. I could have fixed that with some CSS sorcery, but I didn't want extra output in my website either.

Then, I took the path that is currently used, use #+begin_export html and generate the org links manually while still relying on org's linking:

(org-export-string-as (format "[[file:%s][%s]]" entry (org-publish-find-title entry project)) 'html t)

`<title>`'s inside `<head>` with non-optimal contents

I didn't fix this issue, what I did instead was to change the title of this post. It was previously named:

#+title: Building a personal website with ~org-mode~

and I had to rename it to:

#+title: Building a personal website with org-mode

The first title output in the generated HTML was:

<head>
  <title>Building a personal website with <code>org-mode</code></title>
</head>

I'm fairly sure this is a bug, but the question is then, what it should be:

<head>
  <title>Building a personal website with ~org-mode~</title>
</head>

<head>
  <title>Building a personal website with org-mode</title>
</head>

Since the solution was really easy, and I was not completely sure I want headlines with different formatting, I ignored this problem and removed the special formatting from the title of the article.

Conclusion

I'm happy to have the same engine that drives my personal and work schedule driving my personal website as well.

There are some static website generators supporting org-mode format, but they are yet another component, whereas org-mode has already support for this features.

I have been thinking about doing this for quite some time, I only needed a small push; in my case it was Duncan's.

Reading and writing elisp has been fun. In fact, playing with emacs and org-mode APIs has been a quite enjoyable experience.

Looks like the IKEA effect at its best, you might think!

GitHub repository: https://github.com/ereslibre/ereslibre.es

Old blog posts: https://oldwords.ereslibre.es