Loading a Huge Textfile into PostgreSQL 11 on Windows 10

On Windows and as of 11.5, PostgreSQL has a bug when trying to copy from a huge file. According to stack overflow, this does not manifest on earlier releases like PostgreSQL 10.

For example, given a table

create table pwned( passwd sha1 not null, count int4 not null );

and when trying to copy from the pwned password list which is about 24GB uncompressed,

copy pwned( passwd, count ) from 'c:\src\postgres\pwned\pwned-passwords-sha1-ordered-by-hash-v5.txt' with ( format text, delimiter ':' );

fails and the message follows.

ERROR:  could not stat file "c:\src\postgres\pwned\pwned-passwords-sha1-ordered-by-hash-v5.txt": Unknown error

To work around this, we can pipe the file through a command;

copy pwned( passwd, count ) from program 'cmd /c "type c:\src\postgres\pwned\pwned-passwords-sha1-ordered-by-hash-v5.txt"' with ( format text, delimiter ':' );

which results in

COPY 555278657

and we can happily create an index and query the table.

The trick here is that we run cmd in sigle command mode with the /c flag, and tell it to type out the file we want to copy from.

Bug Report

Update. There is now a bug report.

Contact

The author can be reached at johann@myrkraverk.com.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: