Monthly Archives: December 2017
XEmacs: Sorting Key-Value Lines by Value
XEmacs 21.4.24 [direct ftp download], the latest stable release (2015), is the version I’m personally using. The directions here may well apply to GNU Emacs too; I don’t know.
Most Emacs users are familiar with the command M-x sort-lines, which alphabetically sorts the lines highlighted in the current buffer.
However, I wanted to sort key: value lines such as the following by their value.
foo: 12
baz: 7
bar: 2
As you can see, in this instance, the value is numeric and lexicographical sorting of numbers results in
foo: 12
bar: 2
baz: 7
that is, alphabetical and not numeric.
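The difference is easy to reproduce in the *scratch* buffer. The following is a minimal sketch, not part of the code developed in this post, that sorts the three values once as strings and once as numbers. Since sort is destructive, each call is given a fresh list.

```elisp
;; Values compared as strings: "12" sorts before "2" because ?1 < ?2.
(sort (list "12" "7" "2") 'string<)   ; => ("12" "2" "7")

;; The same values compared as numbers.
(sort (list 12 7 2) '<)               ; => (2 7 12)
```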
In order to “fix this” we have to delve into the sorting internals of XEmacs.
First, let’s look at the function sort-lines, which is fairly straightforward.[1]
(defun sort-lines (reverse beg end)
  ;; [documentation string elided]
  (interactive "P\nr")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr reverse 'forward-line 'end-of-line))))
The most important thing to note here is the use of the function sort-subr. The rest of the code is simply boilerplate to limit the sort to the highlighted region. The (interactive "P\nr") form reads the prefix argument into reverse, and the start and end of the region into the arguments beg and end. For our purposes, we can simply treat this as “magic”; the effect of the newline embedded in the string is to separate the reverse argument (read with "P") from the region (read with "r"), which means people can sort in reverse alphabetical order with C-u M-x sort-lines. The code we’ll develop here does not have this feature; adding it can be considered an exercise for the reader.
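To take some of the magic out of the interactive string, here is a small sketch of a throwaway command that uses the same "P\nr" specification and merely reports what it received. The name my-show-region-args is made up for illustration.

```elisp
(defun my-show-region-args (reverse beg end)
  "Hypothetical demo: report the arguments gathered by `interactive'."
  ;; "P" reads the raw prefix argument into REVERSE; after the
  ;; newline, "r" reads the region bounds into BEG and END.
  (interactive "P\nr")
  (message "reverse=%S beg=%d end=%d" reverse beg end))
```

With a region selected, C-u M-x my-show-region-args should report a non-nil reverse; without C-u it reports nil.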
Alright, now we know how sort-lines works, and we know that we can use sort-subr to sort by values. The question is: how do we do it?
First, and obviously, we read the documentation string with C-h f sort-subr. This tells us that there are two arguments we can make use of:[1]
STARTKEYFUN moves from the start of the record to the start of the key. It may return either a non-nil value to be used as the key, or else the key is the substring between the values of point after STARTKEYFUN and ENDKEYFUN are called. If STARTKEYFUN is nil, the key starts at the beginning of the record.
COMPAREFUN compares the two keys. It is called with two strings and should return true if the first is “less” than the second, just as for `sort’. If nil or omitted, the default function accepts keys that are numbers (compared numerically) or strings (compared lexicographically).
The first thing we note here is startkeyfun. It allows us to limit the sort comparison to the value part of each line. The trick is to just move point past the : (colon). We can do that with search-forward. Since in my case, and in the example here, every line has a colon, we won’t consider the case where it might be missing; hence we don’t impose any limit on the search (nil), nor do we suppress errors (nil again, so a failed search signals an error). We do, however, limit the count to exactly one.
That leaves us with a call that looks like
(search-forward ":" nil nil 1)
and to apply that to sort-subr we define an entirely new function, my-sort-key-value-lines. We wrap search-forward in a lambda for simplicity. Notice that we return nil explicitly from the lambda, because otherwise sort-subr would use its return value (the location of point in the buffer after the colon) as the key and sort on that.[2]
(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1) ;; returns point, so we explicitly return
                   nil)))))
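The start-of-key step can also be tried on its own, outside sort-subr. The sketch below uses a hypothetical helper, my-value-after-colon, and assumes that with-temp-buffer is available in your Emacs. It runs the same search-forward call on a single line and returns the text from point to the end of the line, which is the substring sort-subr ends up using as the key.

```elisp
(defun my-value-after-colon (line)
  "Hypothetical helper: return the text following the first colon in LINE."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    ;; Same call as in the start-of-key lambda; signals an error if
    ;; LINE contains no colon.
    (search-forward ":" nil nil 1)
    (buffer-substring (point) (point-at-eol))))

(my-value-after-colon "foo: 12")   ; => " 12"
```

Note that the key keeps the space after the colon. That does not matter for a string comparison as long as every line has it.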
As it stands, my-sort-key-value-lines sorts by the values lexicographically, but we want numeric sorting. There are at least two ways to fix that.
First, we can use the comparison function to compare both arguments as numbers. We just have to convert the arguments to numbers with string-to-number, and then compare them.[2]
(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1)
                   nil)
                 nil
                 (lambda (a b)
                   (< (string-to-number a) (string-to-number b)))))))
In the code above we do that in the second lambda. The nil between the two lambdas is the end-of-key function, which we don’t need to define because the end of the key is the same as the end of the record (represented by 'end-of-line in the above code).
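The comparison can be checked in isolation as well. In this sketch the second lambda is given a name, my-value-less-p, which is hypothetical and only there so it can be called directly. The keys include the leading space left over after the colon; string-to-number skips leading whitespace.

```elisp
(defun my-value-less-p (a b)
  "Hypothetical helper: non-nil if string A is numerically less than string B."
  (< (string-to-number a) (string-to-number b)))

(my-value-less-p " 2" " 12")    ; => t
(string< " 2" " 12")            ; => nil

;; Sorting the three keys from the example with the numeric predicate.
(sort (list " 12" " 7" " 2") 'my-value-less-p)   ; => (" 2" " 7" " 12")
```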
The simpler method is to do the conversion in the first lambda and use the default comparison function, which results in the third revision.[2]
(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1)
                   (string-to-number
                    (buffer-substring (point) (point-at-eol))))))))
You can now just drop this into your ~/.xemacs/init.el and use the command M-x my-sort-key-value-lines to sort key: value lines whenever you have the need. This leaves us with the desired numerical sort order:
bar: 2
baz: 7
foo: 12
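As for the reverse-order exercise mentioned earlier, one possible sketch is to borrow the "P\nr" specification from sort-lines and pass the prefix argument straight through to sort-subr. The name my-sort-key-value-lines-reversible is made up here, and I have only reasoned this through against the sort-subr arguments described above, so treat it as a starting point rather than a finished solution.

```elisp
(defun my-sort-key-value-lines-reversible (reverse beg end)
  "Sort key: value lines in the region numerically by value.
With a prefix argument REVERSE, sort in descending order.
Hypothetical variant; not part of the original code."
  (interactive "P\nr")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; Pass REVERSE through to sort-subr instead of a literal nil.
      (sort-subr reverse 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1)
                   (string-to-number
                    (buffer-substring (point) (point-at-eol))))))))

;; Non-interactive check in a temporary buffer.
(with-temp-buffer
  (insert "bar: 2\nfoo: 12\nbaz: 7\n")
  (my-sort-key-value-lines-reversible t (point-min) (point-max))
  (buffer-string))
```

Interactively, C-u M-x my-sort-key-value-lines-reversible should sort the region in descending order (foo, baz, bar for the check above), and the plain command in ascending order.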
[1] This code is GPL.
[2] This code can be considered WTFPL 2.0, at least the parts inside the lambdas; the rest is just boilerplate.