Saturday, August 12, 2017

graph structure, find parent using stored procedure, postgres

Leave a Comment

I am new to stored procedures. As part of our application, we have a database with one of the tables having child-parent relationships. Given an id, we should be able to figure out the final parent of the id, traversing through the child-parent links in the table.

sample data

Input1: 10943, Output1: 8656

Input2: 5005, Output2: 9151, 9160

Different possibilities

An id could have multiple final parents, An id may not have a parent

Any inputs would be appreciated. Thanks in advance.

2 Answers

Answers 1

TREE STRUCTURE ANSWER:

It all depends on how is the flow of your data, I mean, how your childs and parents behave in time (inserts, updates) and how big could be the table (hundreds,thousands).

We can summarize the cases in two: big table (thousands+ rows) and small table (hundreds- rows). In any case, you could use a "result" table to hold the first parent of all the children.

SMALL TABLE

If your table is not going to be a big one, then its ok to put all in a PL. and call the PL when you want to materialize the "result" table.

BIG TABLE

If your table is going to be big (really big), then you would have to use triggers to update the "result" table. And this "result" would have to be an Eager Materialized View. In order for it to work fluently and not having to wait minutes for every transaction.

Here's an example on how you could do it in case of a Small Table, if it's big, it would be similar but you would have to work hard on the triggers to make it work OK. This example shows the PL in a DO format only for explanatory purposes, you could change that for a PL easily:

-- YOUR EXAMPLE TABLE, without child with two parents CREATE TEMP TABLE demo (child integer,parent integer); INSERT INTO demo VALUES (10943,6766), (6766,9003), (9003,8656), (5005,6995), (6995,9151), (6996,9160);  -- DO, for inline PL DO $$ DECLARE -- Variables, note that for variable asignation its better if you use ':='     fathers integer[] := NULL;     father integer := 0;     firstfather integer := 0;     no_father boolean := 'f';     sons integer[] := NULL;     son integer := 0;     row integer := 1;     rows_length integer := 0; BEGIN -- Create "result" table for the example, the drop is in case you test many times the code     DROP TABLE IF EXISTS result;     CREATE TEMP TABLE result (sons integer,fathers integer);     SELECT array(SELECT child FROM demo)     INTO sons;     rows_length := array_length(sons,1); -- Code that gets the first father of a child     LOOP         RAISE NOTICE 'son: %',sons[row];         son = sons[row];         LOOP             father := (SELECT parent FROM demo WHERE child=son);             IF father IS NULL             THEN                 no_father='f';             ELSE                 no_father='t';                 fathers[row] := father;                 son = father;             END IF;             EXIT WHEN no_father='f';         END LOOP;             RAISE NOTICE 'father: %',fathers[row]; -- INSERT in "result" son with its first father         INSERT into result(sons,fathers) values(sons[row],fathers[row]);         row := row+1;         EXIT WHEN rows_length < row;     END LOOP; END $$; -- Display "result" SELECT * FROM result; 

This code generate the next table:

Sons    Fathers --------------- 10943   8656 6766    8656 9003    8656 5005    9151 6995    9151 6996    9160 

NOTE: Your table shows a child with two fathers (6995), that's not possible in a tree structure, this example works in a tree structure where every child has one parent.

Answers 2

Using Common Table Expressions (CTE) make it much easier to write this as a query instead of as a procedure.

Let say we have this table al_test:

+-----------+----------+ | parent_id | child_id | +-----------+----------+ | 10943     | 6766     | +-----------+----------+ | 6766      | 9003     | +-----------+----------+ | 9003      | 8656     | +-----------+----------+ | 5005      | 6995     | +-----------+----------+ | 6995      | 9151     | +-----------+----------+ | 6995      | 9160     | +-----------+----------+ 

For example (PostgreSQL 8.4+), to get all parents for node 5005:

WITH RECURSIVE al_tree AS (   SELECT parent_id, child_id, 1 as depth   FROM al_test WHERE child_id = 5005    UNION ALL    SELECT al_test.parent_id, al_test.child_id, al_tree.depth + 1   FROM al_test, al_tree     WHERE al_test.child_id = al_tree.parent_id )  SELECT parent_id FROM al_tree     WHERE depth = (SELECT MAX(depth) FROM al_tree); 

Result:

+-----------+ | parent_id | +-----------+ | 9151      | +-----------+ | 9160      | +-----------+ 
If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment