Friday, October 27, 2017

mysql - Keeping unique records after removing parent child relationship

Leave a Comment

In a mysql database, I have a business Unit table which maintain the hierarchy of a client's business units. Each business unit can have a parent and/or a child.

products_client_1.business_units

id  parent_id 1    2   1 3   1 4   1 8   1 14  3 17  2 31  1 35  4 36  1 37  4 38  2 39  31 40  8 41  3 42  31 43   44  43 

Currently, I have a customerId table which maintains the customerId at a business unit level

contacts_client_1.buid_customer_id

global_id   customer_id bu_id    ABC1000033  1812130     2 ABC1000033  1812130     54 ABC1000034  4049809     2 ABC1000035  5630631     2 ABC1000082  5707052     2 ABC1000082  1111116     54 ABC1000091  5813085     2 ABC1000091  5813085     54 ABC1000093  5208477     2 ABC1000115  5045891     2 ABC1000115  5045891     54 ABC1000117  6114245     2 ABC1000117  6114247     54 ABC1000117  6114247     1 ABC1000111  1234567     38 ABC1000100  9023456     43 ABC1000100  9023457     44 

Going forward, I do not want to maintain the customer id at individual business unit level. It should be unique for a given globalId. For this I want to migrate the existing customer id data based on the following condition.

If a globalId has customerId for a only single BU, migrate it as it is without bu_id.

If a globalId has customerId for 2 BUs (they can be parent-child at any level), keep the customerId of the parent most available BU.

required table contacts_client_1.customer_id

global_id   customer_id ABC1000033  1812130 ABC1000034  4049809 ABC1000035  5630631 ABC1000082  5707052 ABC1000091  5813085 ABC1000093  5208477 ABC1000100  9023456 ABC1000111  1234567 ABC1000115  5045891 ABC1000117  6114247 

PS: globalId are not overlapping among different parent most BUs.

business_unit table is under products_client_1 schema and buid_customer_id table is under contacts_client_1 schema.

The same code should be executable for different clients.

This is a one time migration.

Need help in writing the query.

1 Answers

Answers 1

I'm not sure what exactly you are going to do with your data, but the following should help:

Show only rows which have no parent for the same global_id in the buid_customer_id table:

select child.* from contacts_client_1.buid_customer_id child left join products_client_1.business_units bu   on bu.id = child.bu_id left join contacts_client_1.buid_customer_id parent   on  parent.global_id = child.global_id   and parent.bu_id     = bu.parent_id where parent.global_id is null 

Examples:

  • Row (ABC1000100 9023456 43) - The bu_id (43) has no parent in buid_customer_id, so there will be no match for the first LEFT JOIN and also no match for the second. Since all columns from the left joined tables will be NULL, parent.global_id is null is TRUE and the row will be selected.
  • Row (ABC1000100 9023457 44) - The bu_id (44) has a parent_id (43), so the first JOIN will find a match. The second JOIN will also find a match, because a row with the parent BU and the same global_id exists in the buid_customer_id table. Thus parent.global_id is not NULL and the row won't be selected.
  • Row (ABC1000033 1812130 2) - The bu_id (2) has a parent_id (1). The first JOIN will find a match. But tere is no row in the buid_customer_id table with bu_id = 1 and global_id = ABC1000033, so there is no match for the second JOIN. Thus parent.global_id will be NULL and the row will be selected.

Now you can use this statement to copy (migrate) the data to a new table with

insert into new_table     select child.*     [..] 

You can also go the other way. If you replace the LEFT JOINs with INNER JOINs and remove the WHERE clause, you will get the opposite result (all rows which are not returned by the first query). You can use it to remove all those rows from the table.

Delete all rows which have a parent row for the same global_id:

delete child from contacts_client_1.buid_customer_id child join products_client_1.business_units bu   on bu.id = child.bu_id join contacts_client_1.buid_customer_id parent   on  parent.global_id = child.global_id   and parent.bu_id     = bu.parent_id 

Now the table buid_customer_id will contain the same rows which are selected by the first query. If this data needs to be in another table - just rename it. Then you can copy global_id and customer_id with

insert into customer_id (global_id, customer_id)     select global_id, customer_id     from new_table 
If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment