Tuesday, December 10, 2019

postgresql: libc collation issue, linking with ICU

hello,

i have noticed that libc collation on OpenBSD is broken (also on macos): accented letters sort after all the unaccented ones instead of next to their base letter :(

openbsd=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr_FR";
n
----------
bernard
boris
béatrice
bérénice
(4 rows)


macos=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr_FR";
n
----------
bernard
boris
béatrice
bérénice
(4 rows)


linux=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr_FR";
n
------------
béatrice
bérénice
bernard
boris
(4 rows)


postgres can be built with ICU support (since version 10), and ICU
collations give the same results on every platform, as long as the ICU
versions match. the macos and linux builds above both link against it;
i think it would be a good addition to the openbsd port as well...

macos=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr-FR-x-icu";
n
----------
béatrice
bérénice
bernard
boris
(4 rows)

macos=# SELECT * FROM pg_collation;
...
(861 rows)


linux=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr-FR-x-icu";
n
------------
béatrice
bérénice
bernard
boris
(4 rows)

linux=# SELECT * FROM pg_collation;
...
(1142 rows)

openbsd=# select n from (values ('bernard'),('bérénice'),('béatrice'),('boris')) AS l(n) order by n collate "fr-FR-x-icu";
ERROR: collation "fr-FR-x-icu" for encoding "UTF8" does not exist
LINE 1: ...nice'),('béatrice'),('boris')) AS l(n) order by n collate "f...
^
openbsd=# SELECT * FROM pg_collation;
...
(134 rows)


not sure if related but sort(1) is similarly confused (also on macos):

$ echo $LC_CTYPE
en_US.UTF-8
$ cat n
bernard
boris
béatrice
bérénice
$ sort n
bernard
boris
béatrice
bérénice
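
for comparison, the order above is exactly what a plain byte-wise comparison gives: in UTF-8 'é' is encoded as 0xC3 0xA9, which is greater than any ascii letter, so "bé..." lands after "bo...". a minimal sketch, forcing the C locale so it behaves the same on every system (the variable names are only for illustration). it also prints the locale that sort(1) would use for collation, since LC_CTYPE alone does not decide the sort order:

```shell
# locale used for collation: LC_ALL wins, then LC_COLLATE, then LANG
effective=${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}
printf 'collation locale: %s\n' "$effective"

# byte-wise ordering, independent of the locale settings above
sorted=$(printf 'bernard\nbérénice\nbéatrice\nboris\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

if a locale-aware sort prints the same order as this one, the collation rules for that locale are most likely not being applied at all.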

-f
--
history doesn't repeat itself. historians do.
