Monthly Archives: August 2016

PostgreSQL: Load JSON with Lisp and Postmodern

Sometimes we get JSON objects that are not immediately loadable with the usual PostgreSQL tools. Notably, at the time of this writing, there doesn’t seem to be any special JSON support in pgloader.

In particular, it’s frequent enough to get an array of JSON objects from a webserver that needs to be loaded into a database; for whatever reason, that I am presenting my tool for it.

An array of JSON looks like this,

  [ { "foo": "bar" }, { "foo": "qux" } ]

without the whitespace, and is usually given on a single line. For this kind of data, using COPY which expects each row to be a single line, obivously does not work; and the array is an impediment also.

Enter jsown, one of several JSON parsers for Common Lisp. It was chosen for this topic because it’s reputed to be the best for decoding; however, we are also re-creating each JSON object, so so Jonathan might also be appropriate.

First, to get the data — we can do this directly with Drakma in some cases, and others we load it from a text file. When Drakma returns an octet sequence, we can do this

  (let ((json-array
         (jsown:parse
          (sb-ext:octets-to-string
           (drakma:http-request "http://example.com/some.json")))))

for example, and there are other ways to decode strings without relying on the SBCL implementation.

For the insertion itself, we use Postmodern.

When we want to insert a subset of the data, into table columns, we can loop over the JSON objects and collect the values we want.

    (loop :for json :in json-array ;; [*]
          :collect (list (jsown:val json "foo")
                         (jsown:val json "bar")))

This creates a list of lists — the outer list returned are the rows in our table and the inner list is the row itself — split into columns.

Then we insert that data with

    (postmodern:with-transaction (inserting-json-data) ;; the tx name
      (postmodern:execute (:insert-rows-into 'table
                           :columns 'foo 'bar
                           :values loop-list))) ;; see [*].

This way, we get rid of a tedious insertion loop; which is handy. The `with-transaction’ form automatically commits at the end, we only need explicit rollbacks if desired.

On the other hand, if we want to insert the JSON object itself, into the database we have to recreate it.

    (loop :for json :in json-array ;; [**]
          :collect (list (jsown:to-json json)))

And using the same syntax as above, we insert with

    (postmodern:with-transaction (inserting-json-data) ;; the tx name
      (postmodern:execute (:insert-rows-into 'table
                           :columns 'jsonb-column-name
                           :values loop-list))) ;; see [**].

The above method really assumes you’re going to insert into more than one column, with some values possibly taken from inside the JSON object.

The Postmodern syntax for inserting multiple rows is really handy to get rid of a pointless insertion loop but it has the overhead of requiring a list of all the objects in memory.

That’s all folks.