Saturday, September 9, 2017

Determining which objects are or are not linked to the main root object

Leave a Comment

I am trying to navigate through a bunch of objects with links to other objects. I want to start with the lowest id number (the root object) and navigate through each of the objects based on connected links. Some of the object links will loop back to previous objects, so I want to make sure I look at each one only once otherwise I will get stuck in an infinite loop. I also want to be able to tell which objects cannot be accessed by navigating through the links beginning at the first link.

The tables in my database look like this:

Object Table:

+----+---------+ | id | title   | +----+---------+ |  1 | Apple   | |  3 | Carrot  | |  4 | Dill    | |  5 | Egg     | |  6 | Fred    | |  7 | Goat    | |  8 | Harry   | |  9 | Igloo   | | 10 | Jason   | | 11 | Klaus   | | 12 | Banana  | | 15 | Oyster1 | | 16 | Oyster2 | +----+---------+ 

Object_Links Table:

+----+---------+--------------+ | id |  obj_id |  obj_link_id | +----+---------+--------------+ |  1 |       1 |           12 | |  2 |       1 |            5 | |  3 |       3 |            1 | |  4 |       3 |           12 | |  5 |       3 |            3 | |  6 |       4 |            1 | |  7 |       4 |            5 | |  8 |       5 |            6 | |  9 |       6 |            7 | | 10 |       7 |            7 | | 11 |       7 |            8 | | 12 |       9 |           12 | | 13 |       9 |            5 | | 14 |      10 |            1 | | 15 |      10 |            5 | | 16 |      10 |            8 | | 17 |      11 |            1 | | 18 |      11 |            5 | | 19 |      11 |           10 | | 20 |      12 |            3 | | 21 |      15 |           16 | | 22 |      16 |           15 | +----+---------+--------------+ 

So, from the table you can see that object 1 has links to both objects 12 and 5.

My SQL query looks like this:

select  object.id, title, obj_link_id     from  object     left join  object_links  ON object.id = object_links.object_id     order by  object.id  

which gives the table:

+----+---------+--------------+ | id | title   |  obj_link_id | +----+---------+--------------+ |  1 | Apple   |           12 | |  1 | Apple   |            5 | |  3 | Carrot  |            1 | |  3 | Carrot  |           12 | |  3 | Carrot  |            3 | |  4 | Dill    |            1 | |  4 | Dill    |            5 | |  5 | Egg     |            6 | |  6 | Fred    |            7 | |  7 | Goat    |            7 | |  7 | Goat    |            8 | |  8 | Harry   |         NULL | |  9 | Igloo   |           12 | |  9 | Igloo   |            5 | | 10 | Jason   |            1 | | 10 | Jason   |            5 | | 10 | Jason   |            8 | | 11 | Klaus   |            1 | | 11 | Klaus   |            5 | | 11 | Klaus   |           10 | | 12 | Banana  |            3 | | 15 | Oyster1 |           16 | | 16 | Oyster2 |           15 | +----+---------+--------------+ 

In PHP I am using:

$objects = $stmt->fetchAll(PDO::FETCH_CLASS); 

I wasn't sure whether there was a better way to fetch these for my purposes so am open to suggestions.

A print_r($objects) yields:

Array (     [0] => stdClass Object         (             [id] => 1             [title] => Apple             [obj_link_id] => 12         )      [1] => stdClass Object         (             [id] => 1             [title] => Apple             [obj_link_id] => 5         )      [2] => stdClass Object         (             [id] => 3             [title] => Carrot             [obj_link_id] => 1         )      [3] => stdClass Object         (             [id] => 3             [title] => Carrot             [obj_link_id] => 12         )      [4] => stdClass Object         (             [id] => 3             [title] => Carrot             [obj_link_id] => 3         )      [5] => stdClass Object         (             [id] => 4             [title] => Dill             [obj_link_id] => 1         )      [6] => stdClass Object         (             [id] => 4             [title] => Dill             [obj_link_id] => 5         )      [7] => stdClass Object         (             [id] => 5             [title] => Egg             [obj_link_id] => 6         )      [8] => stdClass Object         (             [id] => 6             [title] => Fred             [obj_link_id] => 7         )      [9] => stdClass Object         (             [id] => 7             [title] => Goat             [obj_link_id] => 7         )      [10] => stdClass Object         (             [id] => 7             [title] => Goat             [obj_link_id] => 8         )      [11] => stdClass Object         (             [id] => 8             [title] => Harry             [obj_link_id] =>         )      [12] => stdClass Object         (             [id] => 9             [title] => Igloo             [obj_link_id] => 12         )      [13] => stdClass Object         (             [id] => 9             [title] => Igloo             [obj_link_id] => 5         )      [14] => stdClass Object         (             [id] => 10             [title] => Jason             [obj_link_id] => 1         )      [15] => stdClass Object         (             [id] => 10             [title] => Jason             [obj_link_id] => 5         )      [16] => stdClass Object         (             [id] => 10             [title] => Jason             [obj_link_id] => 8         )      [17] => stdClass Object         (             [id] => 11             [title] => Klaus             [obj_link_id] => 1         )      [18] => stdClass Object         (             [id] => 11             [title] => Klaus             [obj_link_id] => 5         )      [19] => stdClass Object         (             [id] => 11             [title] => Klaus             [obj_link_id] => 10         )      [20] => stdClass Object         (             [id] => 12             [title] => Banana             [obj_link_id] => 3         )      [21] => stdClass Object         (             [id] => 15             [title] => Oyster1             [obj_link_id] => 16         )      [22] => stdClass Object         (             [id] => 16             [title] => Oyster2             [obj_link_id] => 15         )  ) 

Please note, that the number in the brackets is just the array index, not the object id number, so don't let the index throw you off.

I am trying to find a way to determine which are the linked and which are the unlinked objects. Based on the above scenario the objects should be separated as follows:

**Linked:**      Apple     Banana     Carrot     Egg     Fred     Goat     Harry  **Not Linked:**      Dill     Igloo     Jason     Klaus     Oyster1     Oyster2 

My main question:

How can I create a loop in PHP to loop through a structure like this especially when each object can have multiple links? Ultimately I would like to produce two collections of objects, one containing the linked objects and one containing the unlinked objects. A sample collection might look like this:

stdClass Object (   [LinkedElements] => stdClass Object     (       [1] => stdClass Object         (           [id] => 1           [name] => Apple           [link] => Array             (               [0] => 14               [1] => 5             )          )        [14] => stdClass Object         (           [id] => 14           [name] => Banana           [link] => Array             (               [0] => 3             )          )        [3] => stdClass Object         (           [id] => 3           [name] => Carrot           [link] => Array             (               [0] => 1               [1] => 14               [2] => 3             )          )        [5] => stdClass Object         (           [id] => 5           [name] => Egg           [link] => Array             (               [0] => 6             )          )        [6] => stdClass Object         (           [id] => 6           [name] => Fred           [link] => Array             (               [0] => 7             )          )        [7] => stdClass Object         (           [id] => 7           [name] => Goat           [link] => Array             (               [0] => 7               [1] => 8             )          )        [8] => stdClass Object         (           [id] => 8           [name] => Harry         )      )    [UnLinkedElements] => stdClass Object     (       [4] => stdClass Object         (           [id] => 4           [name] => Dill           [link] => Array             (               [0] => 1               [1] => 5             )          )        [9] => stdClass Object         (           [id] => 9           [name] => Igloo           [link] => Array             (               [0] => 14               [1] => 5             )          )        [10] => stdClass Object         (           [id] => 10           [name] => Jason           [link] => Array             (               [0] => 1               [1] => 5               [2] => 8             )          )        [11] => stdClass Object         (           [id] => 11           [name] => Klaus           [link] => Array             (               [0] => 1               [1] => 5               [2] => 10             )          )        [15] => stdClass Object         (           [id] => 15           [name] => Oyster1           [link] => Array             (               [0] => 16             )          )        [16] => stdClass Object         (           [id] => 16           [name] => Oyster2           [link] => Array             (               [0] => 15             )          )      )  ) 

Please note:

  • Navigation is done from object to link, not the other way around.
  • It is okay to have an object point to itself (as object 7 does).
  • The above sample structure (underneath my main question) is a suggestion only and I am open to other suggestions.
  • Disclaimer: This question is based on another question I previously asked. In my original question I manually created the test objects, but I was not able to pull them out of my database in this fashion.

4 Answers

Answers 1

It's easier (IMHO) to deal with the data as two separate arrays. A set of objects and their links. Also, as the first part I convert the object to have the ID as the key, this allows me to use it directly rather than having to search for the ID each time.

Also to make the solution a lot simpler, I've created a flag in the object array as I visit it, so that when I try and reference it again, I can check if it's already been visited.

<?php  error_reporting ( E_ALL ); ini_set ( 'display_errors', 1 );  $objects =[[1,'apple'],         [3, 'Carrot'],         [4, 'Dill'],         [5, 'Egg '],         [6, 'Fred'],         [7, 'Goat'],         [8, 'Harry'],         [9, 'Igloo'],         [10, 'Jason'],         [11, 'Klaus'],         [12, 'Banana'],         [15, 'Oyster1'],         [16, 'Oyster2' ]]; $links =[[1,12],         [1,5],         [3,1],         [3,12],         [3,3],         [4,1],         [4,5],         [5,6],         [6,7],         [7,7],         [7,8],         [8,null],         [9,12],         [9,5],         [10,1],         [10,5],         [10,8],         [11,1],         [11,5],         [11,10],         [12,3],         [15,16],         [16,15 ]];  function buildTree ( $id, &$objects, $links )   {     foreach ( $links as $testNode )    {         if ( $testNode[0] == $id &&                  $testNode[1] != $id &&                 $testNode[1] != null &&                 !isset($objects[$testNode[1]]['visited']) )    {             $objects[$testNode[1]]['visited'] = true;             buildTree ( $testNode[1], $objects, $links);         }     } }  // Convert array to have the ID as key $objects = array_combine(array_column($objects, 0), $objects); // Fetch ID of first item reset($objects); $startID = key($objects); // Mark as visited $objects[$startID]['visited'] = true; // Process buildTree ( $startID, $objects, $links);  $linked = []; $notLinked = []; foreach ( $objects as $object)  {     if ( isset($object['visited']) )    {         $linked[] = $object[1];     }     else    {         $notLinked[] = $object[1];     } } echo "Linked...".PHP_EOL; print_r($linked);  echo "Not linked...".PHP_EOL; print_r($notLinked); 

As you can see, the core is the recursive buildTree function. This uses &$objects as this means that all calls to the function will use the same array. As I want to build up all the uses of the items, this is important.

The condition in buildTree, checks to see if it's the node we want, it's not referring to the same node (waste of time looking any more), not null (not sure why you link to null, but again not worth looking any further) and that the node hasn't already been visited. If these conditions are OK, it marks the next node as visited and goes down into the next level.

The output is...

Linked... Array (     [0] => apple     [1] => Carrot     [2] => Egg      [3] => Fred     [4] => Goat     [5] => Harry     [6] => Banana ) Not linked... Array (     [0] => Dill     [1] => Igloo     [2] => Jason     [3] => Klaus     [4] => Oyster1     [5] => Oyster2 ) 

Answers 2

This is a graph traversal problem. Starting from a node (root) you want to traverse the graph keeping track of each node visited along the way. Once the traversal is over, visited ones are connected. Breadth-first search can be done in this way:

//To form a graph fetch all objects from the database (sorted by id) and  //index them in a hash map $objects = $stmt->fetchAll(PDO::FETCH_OBJ); $nodes = []; foreach ($objects as $object) {     $nodes[$object->id] = new Node($object); }  //fetch all connections from the database and link the objects $links = $stmt->fetchAll(PDO::FETCH_OBJ); foreach ($links as $link) {     $nodes[$link->obj_id]->addLink($nodes[$link->obj_link_id]); }  //let's assume root is the first node (sorted by id),  //traverse the graph starting from root $root = reset($nodes); $root->traverse();  //now if a node can be reached by the root it is marked as visited $linked = []; $notLinked = []; foreach ($nodes as $node) {     if ($node->isVisited()) {         $linked[] = $node;     } else {         $notLinked[] = $node;     } } 

And the node class:

class Node {      /**      * List of neighbor nodes.      *      * @var Node[]      */     private $links = [];      /**      * id, title info      *      * @var array      */     private $data = [];      /**      * To track visited nodes.      *      * @var bool      */     private $visited = false;      /**      * Node constructor.      * @param array $data      */     public function __construct($data)     {         $this->data = $data;     }       /**      * Add a link to this node.      *      * @param Node $node      * @return void      */     public function addLink(Node $node)     {         $this->links[] = $node;     }      /**      * Traverse the graph in a Breadth-First-Search fashion marking      * every visited node.      * @return void      */     public function traverse()     {         //initialize queue         $q = new SplQueue();          //add root to queue and mark as visited         $q->enqueue($this);         $this->visited = true;          while (!$q->isEmpty()) {             /** @var Node $cur */             $cur = $q->dequeue();              foreach ($cur->links as $link) {                  //if link not visited already add it to queue and mark visited                 if (!$link->visited) {                     $link->visited = true;                     $q->enqueue($link);                 }             }         }     }      /**      * Checks if node has been visited.      *      * @return bool      */     public function isVisited()     {         return $this->visited;     }  } 

Answers 3

Let's say the "root" is obj_id 1.

Here's a somewhat brute force algorithm, but it takes advantage of SQL's liking of 'set' operations.

Insert into table1 the root (1) Loop     Create table2 with all nodes linked to any node in table1     Exit if number of rows in table2 = num rows in table1     table1 := table2 

Closer to SQL:

# Initialize: CREATE TABLE table1 (     obj_id ...     PRIMARY KEY(obj_id) )     SELECT 1;   # assuming root is 1  start of loop: CREATE TABLE table2 (     obj_id ...     PRIMARY KEY(obj_id) )     SELECT DISTINCT obj_link_id         FROM table1         JOIN object_links USING(obj_id);  SELECT @fini := ( SELECT COUNT(*) FROM table1 ) =                  ( SELECT COUNT(*) FROM table2 )   # will give true/false DROP TABLE table1; RENAME TABLE table2 TO table1; loop if @fini=0 

The output is all the 'connected' ids. If you want the unconnected ones:

SELECT obj_id     FROM object_links     LEFT JOIN table1 USING(obj_id)     WHERE table1.obj_id IS NULL;   # that is, missing from table1 

Answers 4

Here is a pretty short way to get all linked IDs:

$pdo = new PDO('mysql:host=localhost;dbname=test_obj_link', 'testread', 'testread');  $links = $pdo     ->query('select obj_id, obj_link_id from object_links')     ->fetchAll(PDO::FETCH_GROUP|PDO::FETCH_COLUMN);  function getLinks($objId, $links,  $indirectNodes = []) {     $directNodes = isset($links[$objId]) ? $links[$objId] : [];     foreach($directNodes as $linkedNode) {         if (!in_array($linkedNode, $indirectNodes)) {             $indirectNodes[] = $linkedNode;             $indirectNodes = array_unique(array_merge(                 $indirectNodes,                 getLinks($linkedNode, $links, $indirectNodes)             ));         }     }     return $indirectNodes; }  $linkedObjectIds = getLinks(1, $links); 

fetchAll(PDO::FETCH_GROUP|PDO::FETCH_COLUMN) will return a structured array with links per object indexed by objectIDs which looks like:

$links = [     1  => ['5', '12'],     3  => ['1', '3', '12'],     4  => ['1', '5'],     5  => ['6'],     6  => ['7'],     7  => ['7', '8'],     9  => ['5', '12'],     10 => ['1', '5', '8'],     11 => ['1', '5', '10'],     12 => ['3'],     15 => ['16'],     16 => ['15'], ]; 

The getLinksfunction will "walk" the $links array recursively and merge all child arrays that are found on the way. Since PHP doesn't seem to have an array_union function - array_unique(array_merge(..)) is used instead.

The result:

$linkedObjectIds = array (   0 => '5',   1 => '6',   2 => '7',   3 => '8',   4 => '12',   10 => '3',   11 => '1', ) 

Note that the indexes have no meaning here.

To get the corresponding objects you can do the following:

$objects = $pdo     ->query('select id, title from object')     ->fetchAll(PDO::FETCH_KEY_PAIR); $linkedObjects = array_intersect_key($objects, array_flip($linkedObjectIds)); $notLinkedObjects = array_diff_key($objects, $linkedObjects); 

The variables will contain:

$objects = array (   1 => 'Apple',   3 => 'Carrot',   4 => 'Dill',   5 => 'Egg',   6 => 'Fred',   7 => 'Goat',   8 => 'Harry',   9 => 'Igloo',   10 => 'Jason',   11 => 'Klaus',   12 => 'Banana',   15 => 'Oyster1',   16 => 'Oyster2', ); $linkedObjects = array (   1 => 'Apple',   3 => 'Carrot',   5 => 'Egg',   6 => 'Fred',   7 => 'Goat',   8 => 'Harry',   12 => 'Banana', ); $notLinkedObjects = array (   4 => 'Dill',   9 => 'Igloo',   10 => 'Jason',   11 => 'Klaus',   15 => 'Oyster1',   16 => 'Oyster2', ); 

Demo: http://rextester.com/ZQQGE35352

Note that in_array() and array_unique() are probably slow, since they have to search in values, which are not sorted. This might lead to performance issues on some datasets. Assuming that PHP can search the keys faster, we could use array_key_exists() instead of in_array() and the array operator + (union by keys) instead of array_unique(array_merge()). And the code will even be a bit shorter:

function getLinks($objId, $links,  $indirectNodes = []) {     $directNodes = isset($links[$objId]) ? $links[$objId] : [];     foreach($directNodes as $linkedNode) {         if (!array_key_exists($linkedNode, $indirectNodes)) {             $indirectNodes[$linkedNode] = 0;             $indirectNodes = $indirectNodes + getLinks($linkedNode, $links, $indirectNodes);         }     }     return $indirectNodes; } 

However - since the function now returns the needed result as keys, we would need to use array_keys() to extract them:

$linkedObjectIds = array_keys(getLinks(1, $links)); 

Demo: http://rextester.com/GHO7179

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment