Skip to content

unicode characters from mssql to postgreSQL causes pgloader to fail #1695

@visma-petertilsted

Description

@visma-petertilsted

Hi.

hoping for some help here
the issue can be because of my lack of linux knowledge, since i am a windows guy, but here goes

we are trying to load a database from a sqlserver 2022 to a postgreSQL database.
the sqlserver database has collation 'Danish_Norwegian_CI_AS'

whenever pgloader experiences a nvarchar (unicode) column with danish letters, we get this error:

KABOOM!
TYPE-ERROR: The value
              #.(SB-SYS:INT-SAP #X7FC1BC01BD18)
            is not of type
              VECTOR
An unhandled error condition has been signalled:
   The value
     #.(SB-SYS:INT-SAP #X7FC1BC01BD18)
   is not of type
     VECTOR

i am using this load file

-- pgloader config 
LOAD DATABASE
     FROM mssql://StarTools_Deploy:[email protected]:1439/VisiteringOgStatus
     INTO postgresql://app:[email protected]:32103/app

-- Filtrer tabeller, eet schema ad gangen
INCLUDING ONLY TABLE NAMES LIKE 'AbsenceCauseType' in schema 'dbo'

 WITH include drop,
      create schemas,
      create tables,
      create indexes,
      reset sequences,
      no foreign keys,
      workers = 4

 SET
      work_mem to '16MB',
      maintenance_work_mem to '512MB',
      client_encoding to 'UTF-8'

BEFORE LOAD DO
$$
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
$$

CAST
     type datetimeoffset to timestamptz,
     type datetime       to timestamp,
     type datetime2      to timestamp,
     type money          to numeric using zero-dollars-to-null,
     type bit            to boolean,
     type smallint       to integer,
     type int to integer,
     type int with extra auto_increment to serial drop default,
     type decimal        to numeric,
     type numeric        to numeric,
     type uniqueidentifier to uuid using sql-server-uniqueidentifier-to-uuid,
     type tinyint to smallint,
     type char      to text  drop typemod,
     type nchar     to text  drop typemod,
     type varchar   to text  drop typemod,
     type nvarchar  to text  drop typemod,
     type xml       to text  drop typemod,
      type binary    to bytea using byte-vector-to-bytea,
      type varbinary to bytea using byte-vector-to-bytea
;

I am using
pgloader version "3.6.d9ca38e"
compiled with SBCL 2.1.11.debian

I can see from documentation and other issues that the conf file freetds.conf is where the setting to be able to load unicode is set by having this entry:

[global]
        tds version = 8.0
        client charset = UTF-8

and that is exactly how mine looks like in the folder ./pgloader/conf/freetds.conf

But still i get the issue, also after adding

SET
      client_encoding to 'UTF-8'

to the load file.

should the file freetds.conf be in the path (for example in /usr/local/bin/) when running pgloader, or is include in the pgloader binary when building ?

regards Peter

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions