In my application I have a Job class, outlined below. An instance of this class represents a particular job run. A job can have multiple checkpoints, and each checkpoint can have multiple commands.
Job
- JobName
- [JobCheckpoint]
- StartTime
- EndTime
- Status
- ...
JobCheckpoint
- JobCheckpointName
- [JobCommand]
- StartTime
- EndTime
- Status
- ...
JobCommand
- JobCommandName
- [Command]
- StartTime
- EndTime
- Status
- ...
On any given day about 100k different jobs run. I want to design a user interface in Python for querying these job objects. For example, users want to query:
- Jobs that ran in the interval between x and y.
- Jobs that run command x
- Jobs in failed state.
- All checkpoints/commands of a particular job.
- And many more...
To solve this, I was thinking of providing the following methods in the user interface:
- List getJobs(Filter)
- List getCommands(Job)
- List getCheckpoints(Job)
I am not sure about the following:
- What should the Filter class look like?
- Is returning a list of domain objects correct, or should I return a list of dicts?
- Should I take a dict as input, or instances of defined classes?
- Is this the best design?
1 Answer
These are partially subjective questions. But I'll have a go at answering some of them to the best of my current knowledge and the information available in the question posed.
What should the Filter class look like?
That could depend, for instance, on the storage mechanism. Is the data stored in memory as a bunch of Python objects, or is it first read from an SQL database or perhaps a NoSQL database?
If it's taken from an SQL database you can take advantage of the filtering mechanism of SQL. It's after all a (Structured) Query Language.
In that case your Filter class would be like a translator of field values to a bunch of SQL operators/conditions.
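As a minimal sketch of that translation, assuming a SQLite table named `jobs` with columns `name`, `start_time`, `end_time` and `status` (the table layout, the `JobFilter` class and its field names are all made up for illustration), a filter could turn its populated fields into a parameterised WHERE clause:

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class JobFilter:
    """Hypothetical filter; a field left as None is not constrained."""
    status: Optional[str] = None
    started_after: Optional[str] = None  # ISO-8601 timestamp string
    ended_before: Optional[str] = None   # ISO-8601 timestamp string

    def to_sql(self) -> Tuple[str, List[str]]:
        """Translate the populated fields into a WHERE clause plus parameters."""
        clauses: List[str] = []
        params: List[str] = []
        if self.status is not None:
            clauses.append("status = ?")
            params.append(self.status)
        if self.started_after is not None:
            clauses.append("start_time >= ?")
            params.append(self.started_after)
        if self.ended_before is not None:
            clauses.append("end_time <= ?")
            params.append(self.ended_before)
        # An empty filter selects every row.
        where = " AND ".join(clauses) if clauses else "1 = 1"
        return where, params


def get_jobs(conn: sqlite3.Connection, job_filter: JobFilter) -> List[tuple]:
    where, params = job_filter.to_sql()
    # Only fixed column names end up in the SQL text; values are bound as parameters.
    query = f"SELECT name, status FROM jobs WHERE {where} ORDER BY name"
    return conn.execute(query, params).fetchall()


def demo() -> List[tuple]:
    """Build a tiny in-memory table and query the failed jobs of one day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (name TEXT, start_time TEXT, end_time TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?, ?)",
        [
            ("backup", "2024-01-01T01:00", "2024-01-01T02:00", "FAILED"),
            ("report", "2024-01-01T03:00", "2024-01-01T04:00", "SUCCESS"),
            ("sync", "2024-01-02T01:00", "2024-01-02T02:00", "FAILED"),
        ],
    )
    return get_jobs(conn, JobFilter(status="FAILED", started_after="2024-01-02T00:00"))
```

Binding the values as parameters, rather than formatting them into the query string, keeps user input out of the SQL text.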
If it's a bunch of Python objects without some database mechanism to use for querying your data then you might need to think of your own query/filter methods.
A Filter class might use a Condition class and an Operator class. Operator could be an abstract class with two kinds of subclasses: 'glue' operators that combine conditions (AND/OR), and comparison operators that compare a property of a domain object with a value.
For the latter, even if you are not designing a 'filter language' for it, you could get some inspiration from an API querying format, specified here for Flask-Restless: https://flask-restless.readthedocs.io/en/stable/searchformat.html#query-format
And if you are in fact designing a query interface to, for example, a REST API, Flask-Restless's query format shows one way to tackle the querying.
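To make the Condition/Operator idea a bit more concrete, here is a small sketch for the in-memory case. It is not Flask-Restless's implementation; it only borrows the `name`/`op`/`val` shape and a handful of operator names from that query format. The class names, the `parse` and `apply_filter` helpers, and the `duration` attribute on the sample jobs are assumptions made for the example.

```python
import operator
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Callable, Dict, List

# Small subset of comparison operators, keyed by short names.
COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq, "neq": operator.ne,
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
}


@dataclass
class Condition:
    """Compares one attribute of a domain object with a value."""
    name: str
    op: str
    val: Any

    def evaluate(self, obj: Any) -> bool:
        return COMPARATORS[self.op](getattr(obj, self.name), self.val)


@dataclass
class And:
    """Glue operator: every part must hold."""
    parts: List[Any]

    def evaluate(self, obj: Any) -> bool:
        return all(part.evaluate(obj) for part in self.parts)


@dataclass
class Or:
    """Glue operator: at least one part must hold."""
    parts: List[Any]

    def evaluate(self, obj: Any) -> bool:
        return any(part.evaluate(obj) for part in self.parts)


def parse(spec: Dict[str, Any]) -> Any:
    """Build a Condition/And/Or tree from a nested dict."""
    if "and" in spec:
        return And([parse(sub) for sub in spec["and"]])
    if "or" in spec:
        return Or([parse(sub) for sub in spec["or"]])
    return Condition(spec["name"], spec["op"], spec["val"])


def apply_filter(items: List[Any], spec: Dict[str, Any]) -> List[Any]:
    root = parse(spec)
    return [item for item in items if root.evaluate(item)]


# Stand-ins for real Job objects.
jobs = [
    SimpleNamespace(name="backup", status="FAILED", duration=120),
    SimpleNamespace(name="report", status="SUCCESS", duration=30),
    SimpleNamespace(name="sync", status="RUNNING", duration=600),
]
# Failed jobs, or running jobs that have taken longer than 300 (seconds, say).
spec = {"or": [
    {"name": "status", "op": "eq", "val": "FAILED"},
    {"and": [
        {"name": "status", "op": "eq", "val": "RUNNING"},
        {"name": "duration", "op": "gt", "val": 300},
    ]},
]}
selected = [job.name for job in apply_filter(jobs, spec)]
```

With this split, callers can pass either a dict or a ready-made tree of objects, and the dict form stays easy to serialise if the interface is later exposed over HTTP.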
Is returning a list of domain objects correct, or should I return a list of dicts?
Returning a list of domain objects has at least one advantage: you can use inheritance, so behaviour shared by jobs, checkpoints and commands (such as a matches method) can live in a common base class.
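If some callers do need plain dicts (for JSON output, say), one option is to keep domain objects as the return type and convert at the edge. A small sketch, assuming simplified dataclass versions of Job and JobCommand with only two fields each (the `is_failed` method and the `jobs_as_dicts` function are hypothetical):

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class JobCommand:
    name: str
    status: str


@dataclass
class Job:
    name: str
    status: str
    commands: List[JobCommand] = field(default_factory=list)

    def is_failed(self) -> bool:
        # Behaviour like this is only available on the object, not on a dict.
        return self.status == "FAILED"


def jobs_as_dicts(jobs: List[Job]) -> List[Dict[str, Any]]:
    """Convert domain objects to nested dicts, e.g. just before serialising."""
    return [asdict(job) for job in jobs]


jobs = [
    Job("backup", "FAILED", [JobCommand("tar", "FAILED")]),
    Job("report", "SUCCESS"),
]
failed_names = [job.name for job in jobs if job.is_failed()]
as_dicts = jobs_as_dicts(jobs)
```

Going from objects to dicts is a one-liner here, whereas going the other way means re-validating every field, which is one argument for returning objects by default.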
A rough sketch of certain classes:
    from abc import ABCMeta, abstractmethod
    from typing import List, Optional


    class DomainObjectFieldGlueOperator(metaclass=ABCMeta):
        """Glues the results of several field comparisons into a single bool."""

        @abstractmethod
        def operate(self, conditions: List[bool]) -> bool:
            pass


    class DomainObjectFieldGlueOperatorAnd(DomainObjectFieldGlueOperator):
        def operate(self, conditions: List[bool]) -> bool:
            # True only if all conditions are True.
            return all(conditions)


    class DomainObjectFieldGlueOperatorOr(DomainObjectFieldGlueOperator):
        def operate(self, conditions: List[bool]) -> bool:
            # True if at least one of the conditions is True.
            return any(conditions)


    def field_conditions(candidate, criterium) -> List[bool]:
        """
        Compares only the fields the user did not leave empty (None) in the
        search criterium; returns one bool per compared field.
        """
        condition_results = []
        if criterium.name is not None:
            condition_results.append(criterium.name in candidate.name)
        if criterium.status is not None:
            condition_results.append(criterium.status == candidate.status)
        if criterium.start is not None or criterium.end is not None:
            # Assumption: a missing bound means the broadest possible one, and
            # a candidate matches if it ran entirely within the time range.
            after_start = criterium.start is None or candidate.start >= criterium.start
            before_end = criterium.end is None or candidate.end <= criterium.end
            condition_results.append(after_start and before_end)
        return condition_results


    class DomainObject(metaclass=ABCMeta):
        @abstractmethod
        def matches(self, domain_object: 'DomainObject',
                    field_glue: Optional[DomainObjectFieldGlueOperator] = None) -> bool:
            """
            Returns True if this DomainObject matches the specified DomainObject
            (the search criterium), False otherwise.
            """
            pass

        def excludes(self, domain_object: 'DomainObject') -> bool:
            """Convenience method; the inverse of the matches-method."""
            return not self.matches(domain_object)


    class Job(DomainObject):
        def __init__(self, name=None, start=None, end=None, status=None,
                     job_checkpoints: Optional[List['JobCheckpoint']] = None):
            self.name = name
            self.start = start
            self.end = end
            self.status = status
            self.job_checkpoints = job_checkpoints or []

        def matches(self, domain_object, field_glue=None) -> bool:
            # Fields are glued with AND unless the caller says otherwise.
            field_glue = field_glue or DomainObjectFieldGlueOperatorAnd()
            if isinstance(domain_object, Job):
                # See if the fields specified in the criterium match this job.
                return field_glue.operate(field_conditions(self, domain_object))
            elif isinstance(domain_object, JobCheckpoint):
                # This job matches if one of its own checkpoints matches.
                return any(checkpoint.matches(domain_object, field_glue)
                           for checkpoint in self.job_checkpoints)
            elif isinstance(domain_object, JobCommand):
                # This job matches if one of its own commands matches.
                return any(command.matches(domain_object, field_glue)
                           for checkpoint in self.job_checkpoints
                           for command in checkpoint.job_commands)
            return False


    class JobCheckpoint(DomainObject):
        def __init__(self, name=None, start=None, end=None, status=None,
                     job_commands: Optional[List['JobCommand']] = None,
                     parent_job: Optional[Job] = None):
            self.name = name
            self.start = start
            self.end = end
            self.status = status
            self.job_commands = job_commands or []
            # For easier reference; e.g. when search criteria match this
            # JobCheckpoint, the Job associated with it can be found more easily.
            self.parent_job = parent_job

        def matches(self, domain_object, field_glue=None) -> bool:
            field_glue = field_glue or DomainObjectFieldGlueOperatorAnd()
            if isinstance(domain_object, JobCheckpoint):
                return field_glue.operate(field_conditions(self, domain_object))
            elif isinstance(domain_object, JobCommand):
                return any(command.matches(domain_object, field_glue)
                           for command in self.job_commands)
            return False


    class JobCommand(DomainObject):
        def __init__(self, name=None, start=None, end=None, status=None,
                     parent_checkpoint: Optional[JobCheckpoint] = None,
                     parent_job: Optional[Job] = None):
            self.name = name
            self.start = start
            self.end = end
            self.status = status
            # For easier reference; e.g. when search criteria match this
            # JobCommand, the Job or JobCheckpoint associated with it can be
            # found more easily.
            self.parent_checkpoint = parent_checkpoint
            self.parent_job = parent_job

        def matches(self, domain_object, field_glue=None) -> bool:
            field_glue = field_glue or DomainObjectFieldGlueOperatorAnd()
            if isinstance(domain_object, JobCommand):
                return field_glue.operate(field_conditions(self, domain_object))
            return False


    class DomainObjectOperatorGlue(metaclass=ABCMeta):
        """Glues several search criteria (DomainObjects) together."""

        @abstractmethod
        def operate(self, haystack: List[DomainObject],
                    criteria: List[DomainObject]) -> List[DomainObject]:
            pass


    class DomainObjectOperatorAnd(DomainObjectOperatorGlue):
        def operate(self, haystack: List[DomainObject],
                    criteria: List[DomainObject]) -> List[DomainObject]:
            """
            Returns list of haystackelements or empty list.
            Includes haystackelement if ALL (search) 'criteria' elements
            (DomainObjects) are met for haystackelement (DomainObject).
            """
            result = []
            for haystackelement in haystack:
                # AND operator wants all criteria to be True for haystackelement
                # (Job) to be included in returned search results.
                criteria_all_true_for_haystackelement = True
                for criterium in criteria:
                    if haystackelement.excludes(criterium):
                        criteria_all_true_for_haystackelement = False
                        break
                if criteria_all_true_for_haystackelement:
                    result.append(haystackelement)
            return result


    class DomainObjectOperatorOr(DomainObjectOperatorGlue):
        def operate(self, haystack: List[DomainObject],
                    criteria: List[DomainObject]) -> List[DomainObject]:
            """
            Returns list of haystackelements or empty list.
            Includes haystackelement if AT LEAST ONE of the (search) 'criteria'
            elements (DomainObjects) is met for haystackelement (DomainObject).
            """
            result = []
            for haystackelement in haystack:
                # OR operator wants at least ONE criterium to be True for
                # haystackelement to be included in returned search results.
                at_least_one_criterium_true_for_haystackelement = False
                for criterium in criteria:
                    if haystackelement.matches(criterium):
                        at_least_one_criterium_true_for_haystackelement = True
                        break
                if at_least_one_criterium_true_for_haystackelement:
                    result.append(haystackelement)
            return result


    class DomainObjectFilter:
        def __init__(self, criteria: List[DomainObject],
                     criteria_glue: DomainObjectOperatorGlue):
            self.criteria = criteria
            self.criteria_glue = criteria_glue

        def apply(self, haystack: 'JobsCollection') -> List[DomainObject]:
            """
            Applies filter to given 'haystack' (collection of jobs with their
            sub-objects); returns filtered list of DomainObjects, or an empty
            list if none match the criteria (and criteria glue).
            """
            return self.criteria_glue.operate(haystack.jobs, self.criteria)


    class JobsCollection:
        def __init__(self, jobs: List[Job]):
            self.jobs = jobs

        def get_jobs(self, filter: DomainObjectFilter) -> List[Job]:
            return filter.apply(self)

        def get_commands(self, job: Job) -> List[JobCommand]:
            """Returns all commands for specified job (search criteria)."""
            result = []
            for some_job in self.jobs:
                if some_job.matches(job):
                    for job_checkpoint in some_job.job_checkpoints:
                        result.extend(job_checkpoint.job_commands)
            return result

        def get_checkpoints(self, job: Job) -> List[JobCheckpoint]:
            """Returns all checkpoints for specified job (search criteria)."""
            result = []
            for some_job in self.jobs:
                if some_job.matches(job):
                    result.extend(some_job.job_checkpoints)
            return result