XEmacs: Sorting Key-Value Lines by Value
I’m personally using XEmacs 21.4.24, the latest stable release (2015). The directions here may well apply to GNU Emacs as well; I don’t know.
Most Emacs users are familiar with the command M-x sort-lines, which alphabetically sorts the lines highlighted in the current buffer. However, I wanted to sort key: value lines such as the following by their value.

    foo: 12
    baz: 7
    bar: 2

As you can see, in this instance the value is numeric, and lexicographical sorting of the values results in

    foo: 12
    bar: 2
    baz: 7

that is, alphabetical and not numeric.
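The difference between the two orderings can be reproduced outside of any buffer. The following is a small sketch that uses only the built-in sort, string< and < functions on the three example values:

```elisp
;; Compared as strings, "12" comes before "2", because the comparison
;; goes character by character and ?1 is less than ?2.
(string< "12" "2")                    ; => t
(< 12 2)                              ; => nil

;; Sorting the example values both ways:
(sort (list "12" "7" "2") 'string<)   ; => ("12" "2" "7")
(sort (list 12 7 2) '<)               ; => (2 7 12)
```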
In order to “fix this” we have to delve into the sorting internals of XEmacs.
First, let’s look at the function sort-lines, which is fairly straightforward.1
    (defun sort-lines (reverse beg end)
      ;; [documentation string elided]
      (interactive "P\nr")
      (save-excursion
        (save-restriction
          (narrow-to-region beg end)
          (goto-char (point-min))
          (sort-subr reverse 'forward-line 'end-of-line))))
The most important thing to note here is the use of the function sort-subr. The rest of the code is simply boilerplate to limit the sort to the highlighted region. The (interactive "P\nr") form reads the prefix argument into reverse and the start and end of the region into the argument values beg and end. For our purposes, we can simply treat this as “magic”; the newline embedded in the string separates the code for reverse (read with "P") from the code for the region (read with "r"), which means people can sort in reverse alphabetical order with C-u M-x sort-lines. The code we’ll develop here does not have this feature; adding it can be considered an exercise for the reader.
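To see what the "P\nr" string hands to a command, a throwaway command can simply echo its arguments. This is only an illustrative sketch; the name my-show-prefix-and-region is hypothetical and not part of XEmacs:

```elisp
(defun my-show-prefix-and-region (reverse beg end)
  "Echo the raw prefix argument and the region bounds (hypothetical demo)."
  ;; "P" reads the raw prefix argument, "r" reads the region as two
  ;; arguments (start and end, smallest first); the newline separates them.
  (interactive "P\nr")
  (message "reverse=%S beg=%d end=%d" reverse beg end))
```

Called as M-x my-show-prefix-and-region the first argument is nil; with C-u in front it is non-nil, which is what sort-lines passes on to sort-subr as its reverse flag.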
Alright, now we know how sort-lines works, and we know that we can use sort-subr to sort by values; the question is, how do we do it? First, and obviously, we read the documentation string with C-h f sort-subr. This tells us that there are two arguments we can make use of,1
STARTKEYFUN moves from the start of the record to the start of the key. It may return either a non-nil value to be used as the key, or else the key is the substring between the values of point after STARTKEYFUN and ENDKEYFUN are called. If STARTKEYFUN is nil, the key starts at the beginning of the record.
and
COMPAREFUN compares the two keys. It is called with two strings and should return true if the first is “less” than the second, just as for `sort'. If nil or omitted, the default function accepts keys that are numbers (compared numerically) or strings (compared lexicographically).
The first thing we note here is STARTKEYFUN. It allows us to limit the sort comparison to the value part of each line. The trick is to move point past the : (colon), which we can do with search-forward. Since in my case, and in the example here, every line has a colon, we won’t consider the case where it might be missing; hence we don’t impose any bound on the search (nil), nor do we suppress errors (nil again, so a failed search signals an error); we do, however, limit the count to exactly one.
That leaves us with a call that looks like
(search-forward ":" nil nil 1)
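The effect of this call on a single line can be checked in a temporary buffer. The helper below is a sketch written for this illustration only; its name, my-value-text-after-colon, is hypothetical:

```elisp
(defun my-value-text-after-colon (line)
  "Return the text that follows the first colon in LINE (hypothetical helper)."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    ;; BOUND nil:   search to the end of the buffer.
    ;; NOERROR nil: signal an error if no colon is found.
    ;; COUNT 1:     stop at the first match.
    (search-forward ":" nil nil 1)
    ;; Point is now just after the colon; return the rest of the line.
    (buffer-substring (point) (save-excursion (end-of-line) (point)))))

(my-value-text-after-colon "foo: 12")   ; => " 12"
```

Note the leading space in the result: the key that sort-subr sees starts immediately after the colon.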
and to apply that to sort-subr we define an entirely new function, my-sort-key-value-lines. We wrap search-forward in a lambda for simplicity. Notice that we return nil explicitly from the lambda, because otherwise sort-subr would use its return value (the position of point just after the colon) as the sort key.2
    (defun my-sort-key-value-lines (beg end)
      (interactive "r")
      (save-excursion
        (save-restriction
          (narrow-to-region beg end)
          (goto-char (point-min))
          (sort-subr nil 'forward-line 'end-of-line
                     (lambda ()
                       (search-forward ":" nil nil 1)
                       ;; returns point, so we explicitly return
                       nil)))))
This code sorts the values lexicographically, but we want numeric sorting. There are at least two ways to fix that.
First, we can use the comparison function to compare both arguments as numbers. We just have to convert the arguments to numbers and then compare them.2
    (defun my-sort-key-value-lines (beg end)
      (interactive "r")
      (save-excursion
        (save-restriction
          (narrow-to-region beg end)
          (goto-char (point-min))
          (sort-subr nil 'forward-line 'end-of-line
                     (lambda ()
                       (search-forward ":" nil nil 1)
                       nil)
                     nil
                     (lambda (a b)
                       (< (string-to-number a)
                          (string-to-number b)))))))
In the code above we do that in the second lambda. The nil between the two lambdas is the end-of-key function (ENDKEYFUN), which we don’t need to define because the key ends where the record ends (represented by 'end-of-line in the above code).
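The comparison relies on how string-to-number treats the key strings, which begin with the space that follows the colon. A few calls show the behavior it depends on, plus one caveat that the example data does not exercise:

```elisp
;; Leading spaces are skipped, so the space after the colon is harmless.
(string-to-number " 12")     ; => 12
(string-to-number " 7")      ; => 7

;; This is the test the comparison lambda performs for two keys.
(< (string-to-number " 7") (string-to-number " 12"))   ; => t

;; Caveat: text that does not start with a number converts to 0,
;; so non-numeric values would all sort as if they were zero.
(string-to-number " n/a")    ; => 0
```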
The simpler method is to do the conversion in the first lambda and use the default comparison function, which results in the third revision.2
    (defun my-sort-key-value-lines (beg end)
      (interactive "r")
      (save-excursion
        (save-restriction
          (narrow-to-region beg end)
          (goto-char (point-min))
          (sort-subr nil 'forward-line 'end-of-line
                     (lambda ()
                       (search-forward ":" nil nil 1)
                       (string-to-number
                        (buffer-substring (point) (point-at-eol))))))))
You can now just drop this into your ~/.xemacs/init.el and use the command M-x my-sort-key-value-lines to sort key: value lines whenever you have the need. This leaves us with the desired numerical sort order.
    bar: 2
    baz: 7
    foo: 12
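For readers who want the reverse-order feature that sort-lines offers, one possible sketch follows the same pattern as sort-lines itself: accept the prefix argument and pass it to sort-subr. The function name my-sort-key-value-lines-reversible is made up for this example, and the end of the line is located with end-of-line inside save-excursion, an assumption chosen so the sketch does not depend on point-at-eol being available:

```elisp
(defun my-sort-key-value-lines-reversible (reverse beg end)
  "Sort key: value lines in the region numerically by value.
With a prefix argument (REVERSE non-nil), sort in descending order.
Hypothetical variant; assumes every line contains a colon."
  (interactive "P\nr")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr reverse 'forward-line 'end-of-line
                 (lambda ()
                   ;; Move past the first colon, then return the rest
                   ;; of the line as a number to be used as the key.
                   (search-forward ":" nil nil 1)
                   (string-to-number
                    (buffer-substring
                     (point)
                     (save-excursion (end-of-line) (point)))))))))
```

With this definition, M-x my-sort-key-value-lines-reversible sorts ascending and C-u M-x my-sort-key-value-lines-reversible sorts descending.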
1 This code is GPL.
2 This code can be considered WTFPL 2.0, at least the parts inside the lambdas; the rest is just boilerplate.
When I first wrote this code, I was not aware of M-x sort-numeric-fields, described on Xah Lee’s page and mentioned in the on-line manual; however, I had spaces in my real-world keys, and as of 21.4, M-x sort-numeric-fields is hard-coded to use white space as the field separator. I therefore had to code my own solution anyway.