I have a program to pass a full string of groups a user in OpenLDAP to Postgres query. The string is exactly like this:
( 'cn=user1,ou=org1,ou=suborg1,o=myorg','cn=user2,ou=org2,ou=suborg1,o=myorg','cn=user3,ou=org1,ou=suborg1,o=myorg','cn=user4,ou=org1,ou=suborg2,o=myorg' ) In a query, I only want that to be this in Postgres:
'user1','user3' Basically extract value of cn= when the rest of the string is ou=org1,ou=suborg1,o=myorg.
user2 has ou=org2,ou=suborg1,o=myorg which is org2 so it won't match. user4 won't match on suborg2 ,... The variation is unlimited so I like to look for exact match ou=org1,ou=suborg1,o=myorg only.
I know how to do replace but it can't handle unlimited scenarios. Is there a clean way to do this in regexp_replace or regexp_extract?
3 Answers
Answers 1
Probably the cleanest is by using SUBSTRING that can return just the captured substring:
SELECT SUBSTRING(strs FROM 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') FROM tb1; Here, you match cn=, then capture into Group 1 any one or more chars other than , with the negated bracket expression [^,]+ and then match ,ou=org1,ou=suborg1,o=myorg to make sure there is your required right-hand context.
Else, you may try a REGEXP_REPLACE approach, but it will leave the values where no match is found intact:
SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1; It matches any 0+ chars with .*, then cn=, again captures the non-comma chars into Group 1 and then matches ,ou=org1,ou=suborg1,o=myorg and 0+ chars to the end of the string.
See an online PostgreSQL demo:
CREATE TABLE tb1 (strs character varying) ; INSERT INTO tb1 (strs) VALUES ('cn=user1,ou=org1,ou=suborg1,o=myorg'), ('cn=user2,ou=org2,ou=suborg1,o=myorg'), ('cn=user3,ou=org1,ou=suborg1,o=myorg'), ('cn=user4,ou=org1,ou=suborg2,o=myorg') ; SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1; SELECT substring(strs from 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') from tb1; Results:
Note you may leverage a very useful word boundary \y construct (see Table 9.20. Regular Expression Constraint Escapes) if you do not want to match ocn= with cn=,
'.*\ycn=([^,]+),ou=org1,ou=suborg1,o=myorg\y.*' ^^ ^^ Answers 2
You can use regexp_matches() to get all matching cn. Then use string_agg() to build a comma separated list of them.
SELECT string_agg(ldap.cn[1], ',') cn FROM regexp_matches('( ''cn=user1,ou=org1,ou=suborg1,o=myorg'',''cn=user2,ou=org2,ou=suborg1,o=myorg'',''cn=user3,ou=org1,ou=suborg1,o=myorg'',''cn=user4,ou=org1,ou=suborg2,o=myorg'' )', '''cn=([^,]*),ou=org1,ou=suborg1,o=myorg''', 'g') ldap(cn); Answers 3
Try regex: (?<=cn=)\w+(?=,ou=org1,ou=suborg1,o=myorg)

0 comments:
Post a Comment