Database schema assessment for pgEdge Platform compatibility

pgEdge Team performs a schema assessment of the Databases and schemas that would be part of Multi-master replication using pgEdge Platform.

Transitioning from a single Primary PostgreSQL server to a Multi-master pgEdge platform involves a significant shift in how data is managed and distributed across your database infrastructure. A crucial step in this transition is performing a comprehensive schema assessment. This assessment ensures that your database schema is compatible with and optimized for the distributed nature of a multi-master environment. Factors such as conflict resolution strategies, data type compatibility, indexing, and partitioning require careful consideration to leverage the full potential of a multi-master setup.

Moreover, assessing your schema helps in identifying potential challenges and opportunities for optimization, ensuring a smooth and efficient migration to the pgEdge platform. This preparatory step is essential for organizations aiming to scale their database systems, improve availability, and enhance data resilience without compromising on consistency and performance.

pgEdge Team performs a schema assessment of the Databases and schemas that would be part of Multi-master replication using pgEdge Platform.

Prerequisite: We request the Customer Database team to send us the "schema-only" dump of the database in plain SQL format.

The main assessment areas we look at are:

a) pgEdge Platform Compatibility with current schema
b) Data Types and Sizes
c) Indexing
d) Database Extensions
e) Database Sequences

Following are the SQLs we use to perform the above steps:

Find Missing Indexes

CREATE OR REPLACE FUNCTION public.missing_fk_indexes ()
returns table (
referencing_table regclass,
fk_columns varchar,
table_size varchar,
fk_constraint name,
referenced_table regclass
)
language sql as $$
select
-- referencing table having ta foreign key declaration
tc.conrelid::regclass as referencing_table,

-- ordered list of foreign key columns
string_agg(ta.attname, ', ' order by tx.n) as fk_columns,

-- referencing table size
pg_catalog.pg_size_pretty (
pg_catalog.pg_relation_size(tc.conrelid)
) as table_size,

-- name of the foreign key constraint
tc.conname as fk_constraint,

-- name of the target or destination table
tc.confrelid::regclass as referenced_table

from pg_catalog.pg_constraint tc

-- enumerated key column numbers per foreign key
cross join lateral unnest(tc.conkey) with ordinality as tx(attnum, n)

-- name for each key column
join pg_catalog.pg_attribute ta on ta.attnum = tx.attnum and ta.attrelid = tc.conrelid

where not exists (
-- is there ta matching index for the constraint?
select 1 from pg_catalog.pg_index i
where
i.indrelid = tc.conrelid and
-- the first index columns must be the same as the key columns, but order doesn't matter
(i.indkey::smallint[])[0:cardinality(tc.conkey)-1] @> tc.conkey) and
tc.contype = 'f'
group by
tc.conrelid,
tc.conname,
tc.confrelid
order by
pg_catalog.pg_relation_size(tc.conrelid) desc
$$;

select * from missing_fk_indexes();

Find data types used in the database

SELECT data_type, count(*)
FROM information_schema.columns
WHERE table_schema NOT IN ('information_schema','pg_catalog','pg_toast')
GROUP BY data_type ;

Find tables without PRIMARY KEY

select tab.table_schema,
tab.table_name
from information_schema.tables tab
left join information_schema.table_constraints tco
on tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null
order by table_schema,
table_name;

Find SEQUENCES
SELECT t.oid::regclass AS table_name,
a.attname AS column_name,
s.relname AS sequence_name
FROM pg_class AS t
JOIN pg_attribute AS a
ON a.attrelid = t.oid
JOIN pg_depend AS d
ON d.refobjid = t.oid
AND d.refobjsubid = a.attnum
JOIN pg_class AS s
ON s.oid = d.objid
WHERE d.classid = 'pg_catalog.pg_class'::regclass
AND d.refclassid = 'pg_catalog.pg_class'::regclass
AND d.deptype IN ('i', 'a')
AND t.relkind IN ('r', 'P')
AND s.relkind = 'S';

Database Sequences:
When the database has Sequences created, we recommend using snowflake sequences for the Multi-master replication. You can read more about the snowflake sequences in our Knowledge base article: https://knowledgebase.pgedge.com/getting-to-know-snowflake-extension