Tag Archives: lisp

XEmacs: Sorting Key-Value Lines by Value

XEmacs 21.4.24 [direct ftp download] and the latest stable release (2015) is the version I’m personally using. The directions here may well apply to GNU Emacs as well; I don’t know.

Most Emacs users are familiar with the command M-x sort-lines which alphabetically sorts the lines highlighted in the current buffer.

However, I had the wish to sort key: values as follows, by the value.

foo: 12
baz: 7
bar: 2

As you can see, in this instance, the value is numeric and lexicographical sorting of numbers results in

foo: 12
bar: 2
baz: 7

that is, alphabetical and not numeric.

In order to “fix this” we have to delve into the sorting internals of XEmacs.

First, let’s look at the function sort-lines, which is fairly straight forward.1

(defun sort-lines (reverse beg end)
  ;; [documentation string elided]
  (interactive "P\nr")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr reverse 'forward-line 'end-of-line))))

The most important thing to note here is the use of the function sort-subr. The rest of the code is simply boiler plate to limit the sort to the highlighted region. The (interactive "P\nr") is to read the start and end of the region into the argument values beg and end. For our purposes, we can simply treat this as “magic”; the effect of the newline embedded in the string is to separate the reverse (read with "P") from the region (read with "r") and that means people can sort in reverse alphabetical order with C-u M-x sort-lines. The code we’ll develop in here does not have this feature; adding it can be considered an exercise for the reader.

Alright, now we know how sort-lines works, and we know that we can use sort-subr to sort by values, the question is how do we do it?

First, and obviously, we read the documentation string with C-h f sort-subr. This tells us that there are two variables we can make use of,1

STARTKEYFUN moves from the start of the record to the start of the key. It may return either a non-nil value to be used as the key, or else the key is the substring between the values of point after STARTKEYFUN and ENDKEYFUN are called. If STARTKEYFUN is nil, the key starts at the beginning of the record.

and

COMPAREFUN compares the two keys. It is called with two strings and should return true if the first is “less” than the second, just as for `sort’. If nil or omitted, the default function accepts keys that are numbers (compared numerically) or strings (compared lexicographically).

The first thing we note here is the startkeyfun. It’ll allow us to limit the sort comparison to the value part of the lines. The trick here is to just move the point past the : (colon). We can do that with search-forward. Since in my case, and the example here, all the lines do have a colon, we’ll not consider the case where it might be missing in the line, hence we don’t impose any limit on the search (nil) nor do we care about errors (we allow them with nil); however, we limit the count to exactly one.

That leaves us with a call that looks like

  (search-forward ":" nil nil 1)

and to apply that to sort-subr we define an entirely new function, my-sort-key-value-lines. We wrap search-forward in a lambda for simplicity. Notice that we return nil explicitly from the lambda, because otherwise sort-subr will use its return value (the location of point in the buffer after the colon) and sort from that.2

(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1)
                   ;; returns point, so we explicitly return
                   nil)))))

And this is the code that results in lexicographical sorting of the values, but we want numeric sorting. There are at least two ways to fix that.

First, we can use the comparison function, to compare both arguments as numbers. We just have to convert the arguments to integers, and then compare them.2

(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
		 (lambda ()
		   (search-forward ":" nil nil 1)
		   nil)
		 nil
		 (lambda (a b)
		   (< (string-to-number a)
		      (string-to-number b)))))))

In the code above we do that in the second lambda. The nil between them is the end-of-key function, which we don’t need to define because it’s the same as the end-of-record (represented by 'end-of-line in the above code).

The simpler method, is to do the conversion in the previous lambda, and use the default comparison function. Which results in the third revision.2

(defun my-sort-key-value-lines (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (sort-subr nil 'forward-line 'end-of-line
                 (lambda ()
                   (search-forward ":" nil nil 1)
                   (string-to-number (buffer-substring (point) (point-at-eol))))))))

You can now just drop this into your ~/.xemacs/init.el and use the command M-x my-sort-key-value-lines to sort key: value lines, whenever you have the need. And this leaves us with the desired numerical sort order.

bar: 2
baz: 7
foo: 12

1 This code is GPL.

2 This code can be considered WTFPL 2.0; at least the parts inside the lambdas and the rest is just boilerplate.

SBCL: with-timeout Is a Nice Undocumented Feature

SBCL has a nice undocumented feature, (sb-ext:with-timeout expires &body body). As far as I can tell, this is exactly analogous to the same feature in Bordeaux Threads, but that seems undocumented too.

It does not appear in the SBCL manual as of 1.3.14, but it does have a documentation string.

"Execute the body, asynchronously interrupting it and signalling a TIMEOUT
condition after at least EXPIRES seconds have passed.

Note that it is never safe to unwind from an asynchronous condition. Consider:

  (defun call-with-foo (function)
    (let (foo)
      (unwind-protect
         (progn
           (setf foo (get-foo))
           (funcall function foo))
       (when foo
         (release-foo foo)))))

If TIMEOUT occurs after GET-FOO has executed, but before the assignment, then
RELEASE-FOO will be missed. While individual sites like this can be made proof
against asynchronous unwinds, this doesn't solve the fundamental issue, as all
the frames potentially unwound through need to be proofed, which includes both
system and application code -- and in essence proofing everything will make
the system uninterruptible."

Here is a little demonstration on how to use it.

(handler-case
    (sb-ext:with-timeout 3
      (format t "Hello, world.~%")
      (sleep 5)
      (format t "Goodbye, world.~%"))
  (sb-ext:timeout (e)
    (format t "~a~%" e)))

This will print out

Hello, world.
Timeout occurred.

Enjoy.

SBCL: Testsuites Cannot Prevent All Possible Bugs

On OS X, SBCL as of 1.3.14 can’t sleep after fork. The following simple program exits with an error.

(require 'sb-posix)

(let ((pid (sb-posix:fork)))
  (if (= 0 pid)
      (progn
        (format t "Child: Sleeping for 10 seconds.~%")
        (sleep 10)
        (format t "Child: I woke up.~%"))
    (format t "Parent, exiting.~%")))

When the above script is run on OS X, it fails with the weird error we see below. Note that it works perfectly on Linux.

% sbcl --script sleep.cl
Parent, exiting.
Child: Sleeping for 10 seconds.
fatal error encountered in SBCL pid 16145:
(ipc/send) invalid destination port

The point of this, is that there is no reasonable way a testsuite will catch this kind of a bug. Testsuites, no matter how comprehensive, will never prevent bugs 100%. At most, they prevent the same bug from reappearing.

Hopefully the SBCL team will fix this bug soon.

Building SBCL on OS X Yosemite

A month or two ago, it did not work to build Steel Banks Common Lisp on OS X Yosemite. Or at least it never worked for me.

This has been fixed now, at least as of recent git checkout, and quite possibly SBCL 1.3.10.

If you get an error like the following,

ld: library not found for -lgcc_s.10.4

then you can try this (as far as I know) totally undocumented switch

SBCL_MACOSX_VERSION_MIN=10.10 sh make.sh

and the build will succeed.

Happy lisping with SBCL!

When SBCL Is Buggy, and CFFI Is Undocumented

There are at least two good ways to create C strings (or alien strings) in Lisp. The most often used is CFFI‘s foreign-string-alloc and the other is SBCL‘s make-alien-string.

The SBCL routine make-alien-string is documented to return both the alien pointer and the length of the string. However, it doesn’t.

Today, I reported this bug so by the time you read this the following may actually work; but as of SBCL 1.3.9 it doesn’t.

  (multiple-value-bind (buffer length)
      (make-alien-string "foo")
    (format t "buffer: ~a~%length: ~a~%" buffer length))

And this will print something like

  buffer: #<SB-ALIEN-INTERNALS:ALIEN-VALUE :SAP #X00400190
                                           :TYPE (* (SB-ALIEN:SIGNED 8))>
  length: NIL

On the other hand, the CFFI routine foreign-string-alloc is not documented (as of this writing) to return an extra length value, but actually does.

  (multiple-value-bind (buffer length)
      (cffi:foreign-string-alloc "foo")
    (format t "buffer: ~a~%length: ~a~%" buffer length))

Which will print something like

  buffer: #.(SB-SYS:INT-SAP #X00600050)
  length: 4

Note that the result is by default zero terminated, and hence the four bytes.

Hopefully the CFFI documentation will be updated just as quickly as SBCL is patched.

Have fun, and enjoy the Lisp world because it’s full of weird stuff.

PostgreSQL: Load JSON with Lisp and Postmodern

Sometimes we get JSON objects that are not immediately loadable with the usual PostgreSQL tools. Notably, at the time of this writing, there doesn’t seem to be any special JSON support in pgloader.

In particular, it’s frequent enough to get an array of JSON objects from a webserver that needs to be loaded into a database; for whatever reason, that I am presenting my tool for it.

An array of JSON looks like this,

  [ { "foo": "bar" }, { "foo": "qux" } ]

without the whitespace, and is usually given on a single line. For this kind of data, using COPY which expects each row to be a single line, obivously does not work; and the array is an impediment also.

Enter jsown, one of several JSON parsers for Common Lisp. It was chosen for this topic because it’s reputed to be the best for decoding; however, we are also re-creating each JSON object, so so Jonathan might also be appropriate.

First, to get the data — we can do this directly with Drakma in some cases, and others we load it from a text file. When Drakma returns an octet sequence, we can do this

  (let ((json-array
         (jsown:parse
          (sb-ext:octets-to-string
           (drakma:http-request "http://example.com/some.json")))))

for example, and there are other ways to decode strings without relying on the SBCL implementation.

For the insertion itself, we use Postmodern.

When we want to insert a subset of the data, into table columns, we can loop over the JSON objects and collect the values we want.

    (loop :for json :in json-array ;; [*]
          :collect (list (jsown:val json "foo")
                         (jsown:val json "bar")))

This creates a list of lists — the outer list returned are the rows in our table and the inner list is the row itself — split into columns.

Then we insert that data with

    (postmodern:with-transaction (inserting-json-data) ;; the tx name
      (postmodern:execute (:insert-rows-into 'table
                           :columns 'foo 'bar
                           :values loop-list))) ;; see [*].

This way, we get rid of a tedious insertion loop; which is handy. The `with-transaction’ form automatically commits at the end, we only need explicit rollbacks if desired.

On the other hand, if we want to insert the JSON object itself, into the database we have to recreate it.

    (loop :for json :in json-array ;; [**]
          :collect (list (jsown:to-json json)))

And using the same syntax as above, we insert with

    (postmodern:with-transaction (inserting-json-data) ;; the tx name
      (postmodern:execute (:insert-rows-into 'table
                           :columns 'jsonb-column-name
                           :values loop-list))) ;; see [**].

The above method really assumes you’re going to insert into more than one column, with some values possibly taken from inside the JSON object.

The Postmodern syntax for inserting multiple rows is really handy to get rid of a pointless insertion loop but it has the overhead of requiring a list of all the objects in memory.

That’s all folks.

PostgreSQL’s ~/.pgpass Parser in Common Lisp

Here is a quick and dirty PostgreSQL ~/.pgpass parser, in Common Lisp.

It depends on Split Sequence, conveniently loaded with QuickLisp, like so.

(ql:quickload :split-sequence :silent t)

And here is the actual code. It’s not wrapped in a function, and the other connection parameters are hard coded. This can use some refactoring; particularly with relevance to error handling.

(defvar pguser "user")
(defvar pgdb "database")
(defvar pghost "hostname")
(defvar pgport 5432)

(defvar pgpass
  (handler-case
      (with-open-file 
          (.pgpass "~/.pgpass"
		   :direction :input
		   :if-does-not-exist nil)
	(flet ((wildcard-string= (a b)
		 (or (string= a "*")
		     (string= a (princ-to-string b)))))

	  (loop for line = (read-line .pgpass nil)
	    while line
	    when (destructuring-bind (host port db user passwd)
		   (split-sequence:split-sequence #\: line)
		   (if (and (wildcard-string= host pghost)
			    (wildcard-string= port pgport)
			    (wildcard-string= db pgdb)
			    (wildcard-string= user pguser))
		       passwd))
	    return it)))
    (file-error (e)
		(format *error-output* "Unable to load password file; ~A~%"
			e)
		nil)))

(format t "Password: ~A~%" pgpass)

Licenses

Split-Sequence is part of CL-Utilities and Public Domain.

QuickLisp is licensed under an MIT-style license.

The example code above is © 2015 Johann ‘Myrkraverk’ Oskarsson and licensed under the Two-clause BSD license.