Database Administrators Asked by Guiik on December 28, 2021
I am looking for the most efficient way to reference multiple tables in one table, when there is only one reference possible at a time. Which means tables A and B are referenced by table C, but both A and B cannot be referenced in a single row in C, and I choose which one I look into when I write my query.
I have thought about 4 ways of doing this :
type
and a column FK
so I can make joins like type = 'A' AND a.pk = c.a_fk
(this is what we use right now)FROM c JOIN a ON a.d_fk = c.d_fk
In every solution I tried, the query planner is wrong about how many rows will be returned.
Here is an example : I created a simple database where I have 3 tables like so :
Table "public.a"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+-------------------------------
pk | bigint | | not null | nextval('a_pk_seq'::regclass)
Table "public.b"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+-------------------------------
pk | bigint | | not null | nextval('b_pk_seq'::regclass)
Table "public.c"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+-------------------------------
pk | bigint | | not null | nextval('c_pk_seq'::regclass)
a_fk | bigint | | |
b_fk | bigint | | |
Indexes:
"c_pkey" PRIMARY KEY, btree (pk)
"c_a_fk_idx" btree (a_fk)
"c_b_fk_idx" btree (b_fk)
Foreign-key constraints:
"c_a_fk_fkey" FOREIGN KEY (a_fk) REFERENCES a(pk)
"c_b_fk_fkey" FOREIGN KEY (b_fk) REFERENCES b(pk)
So when I run this query :
SELECT * FROM a JOIN c ON a.pk = c.a_fk;
I would except the planner to think the query will return 5 rows.
Instead, this is what I got :
Merge Join (cost=1.76..1.94 rows=10000 width=32) (actual time=0.035..0.051 rows=5 loops=1)
Merge Cond: (a.pk = c.a_fk)
-> Sort (cost=1.63..1.68 rows=20 width=8) (actual time=0.024..0.028 rows=15 loops=1)
Sort Key: a.pk
Sort Method: quicksort Memory: 25kB
-> Seq Scan on a (cost=0.00..1.20 rows=20 width=8) (actual time=0.009..0.013 rows=20 loops=1)
-> Index Scan using c_a_fk_idx1 on c (cost=0.13..24.21 rows=10000 width=24) (actual time=0.006..0.012 rows=5 loops=1)
Planning time: 0.297 ms
Execution time: 0.088 ms
I tried to create a partial index CREATE UNIQUE INDEX ON c (b_fk) WHERE a_fk IS NULL
but it didn’t change anything.
The only thing that improved the plan was to add a NULL check on b_fk like this :
SELECT * FROM a JOIN c ON a.pk = c.a_fk WHERE c.a_fk IS NOT NULL;
Which gives this plan :
Merge Join (cost=1.92..6.05 rows=5 width=32) (actual time=0.031..0.046 rows=5 loops=1)
Merge Cond: (c.a_fk = a.pk)
-> Index Scan using c_a_fk_idx on c (cost=0.29..20.37 rows=5 width=24) (actual time=0.005..0.012 rows=5 loops=1)
Index Cond: (a_fk IS NOT NULL)
-> Sort (cost=1.63..1.68 rows=20 width=8) (actual time=0.022..0.024 rows=15 loops=1)
Sort Key: a.pk
Sort Method: quicksort Memory: 25kB
-> Seq Scan on a (cost=0.00..1.20 rows=20 width=8) (actual time=0.006..0.007 rows=20 loops=1)
Planning time: 0.559 ms
Execution time: 0.086 ms
But it still gets it wrong for the rows in A and I think there must be another way to help the query planner, one that doesn’t involve adding a “useless” NULL check for every column that should be NULL.
I tried on both Postgres 9.6 and 11, I would prefer a solution for 9.6 because this is what we use but I would accept a solution for 11.
Here is a real life example using the “type” solution :
SELECT *
FROM BinDescription b
JOIN RobotDescription r
ON b.physicalContainerType = 'ROBOT'
AND b.physicalContainerFK = r.robotPK
JOIN RobotMissionDescription rm
ON rm.robotFK = r.robotPK
;
Which gives me this plan :
Merge Join (cost=225.12..395.31 rows=12026 width=301) (actual time=4.372..67.206 rows=47293 loops=1)
Merge Cond: (b.physicalcontainerfk = r.robotpk)
-> Nested Loop (cost=222.91..15184.24 rows=88 width=152) (actual time=4.321..44.448 rows=47293 loops=1)
-> Index Scan using bindescription_physicalcontainertype_physicalcontainerfk_key on bindescription b (cost=0.29..13.66 rows=4 width=76) (actual time=0.011..0.017 rows=4 loops=1)
Index Cond: (physicalcontainertype = 'ROBOT'::physicalcontainertype)
-> Bitmap Heap Scan on robotmissiondescription rm (cost=222.62..3672.39 rows=12026 width=76) (actual time=2.431..7.985 rows=11823 loops=4)
Recheck Cond: (robotfk = b.physicalcontainerfk)
Heap Blocks: exact=14246
-> Bitmap Index Scan on idx_robotmissiondescription_robotfk (cost=0.00..219.62 rows=12026 width=0) (actual time=1.629..1.629 rows=11823 loops=4)
Index Cond: (robotfk = b.physicalcontainerfk)
-> Sort (cost=2.20..2.29 rows=34 width=149) (actual time=0.044..4.752 rows=47321 loops=1)
Sort Key: r.robotpk
Sort Method: quicksort Memory: 34kB
-> Seq Scan on robotdescription r (cost=0.00..1.34 rows=34 width=149) (actual time=0.007..0.014 rows=34 loops=1)
Planning time: 1.559 ms
Execution time: 69.480 ms
Both A and B inherit a sequence from a "parent" table and C has a foreign key to the "parent" table, but this is not possible with PostgreSQL
This is possible with any SQL database. Why do you think it's not possible with PostgreSQL?
You're probably confusing SQL Table Inheritance with PostgreSQL's "table inheritance", which are different things
Here's how to do it in SQL. This is called Class Table Inheritance:
create table parties (
party_id serial primary key
);
create table individuals (
party_id int primary key references parties(party_id),
given_name text,
...
);
create table organizations (
party_id int primary key references parties(party_id),
organization_name text,
...
);
create table sales_orders (
order_id serial primary key,
customer_id int references parties(party_id),
...
);
If you want faster performance, use Single Table Inheritance:
create table parties (
party_id serial primary key,
party_type text check(party_type in ('Individual','Organization')),
given_name text null,
organization_name text null,
...
/* add some check constraints for party_type vs values for individuals/orgs */
);
Also, it'd be about 100x easier to help you if you gave real examples instead of abstract "A" and "B"
Answered by Neil McGuigan on December 28, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP