im need to design some database tables but im not sure about the performance impact. In my case its more about the read performance than for saving the data.
The situation
With the help of pattern recognition im finding out how many values of a certain object needs to be saved in my postgresql database. Amount other lets say fixed properties the only difference is if 1, 2 or 3 values of the same type needs to be saved.
Currently im having 3 entities/tables which differ only in having having 1, 2 or 3 not nullable properties of the same type.
For example:
EntityTestOne/TableOne { ... other (same) properties String optionOne; } EntityTestTwo/TableTwo { ... other (same) properties String optionOne; String optionTwo; } EntityTestThree/TableThree { ... other (same) properties String optionOne; String optionTwo; String optionThree; }
I expect to have several million records in production and im thinking what could be the performance impact of this variant and what could be alternatives.
Alternatives
Other options which come into my mind:
- Use only one entity class or table with 3 options (optionTwo and optionThree will be nullable then). If to talk of millions of expected records plus caching im asking myself isn't it a kind of 'waste' to save millions of null values in at least two (caching) layers (database itself and hibernate). In a another answer i read yesterday saving a null value in postgresql need only 1 bit what i think isnt that much if we talk about several millions of records which can contain some nullable properties (link).
- Create another entity/table and use a collection (list or set) relationship instead
For example:
EntityOption { String value; } EntityTest { ... other (same) properties List<EntityOption> options; }
- If to use this relationship: What would give a better performance in case of creating new records: Creating for every new EntityTest new EntityOption's or doing a lookup before and reference a existing EntityOption if exists? What about the read performance while fetching them later and the joins which will be needed then? Compared to the variant with one plain Entity with three options i can imagine it could be slightly slower...
As im not that strong in database design and working with hibernate im interested of the pros and cons of these approaches and if there are even more alternatives. I even would like to ask the question if postgresql is the right choice for this or if should think about using another (free) database.
Thanks!
2 Answers
Answers 1
The case is pretty clear in my opinion: If you have an upper limit of three properties per object, use a single table with nullable attributes.
A NULL value does not take up any space in the database. For every row, PostgreSQL stores a bitmap that contains which attributes are NULL. This bitmap is always stored, except when all attributes are not nullable. See the documentation for details.
So don't worry about storage space in this case.
Using three different tables or storing the attributes in a separate table will probably lead to UNION
s or JOIN
s in your queries, which will make the queries more complicated and slow.
Answers 2
There are many inheritance strategy for creating entity class, I think you should go with single table strategy, where there will be a discriminator column (managed by hibernate itself), and all common filed will be used by each entity and some specific fields will be use by specific entity and remain null for other entity. This will get improved read performance. For your ref. : http://www.thejavageek.com/2014/05/14/jpa-single-table-inheritance-example/
0 comments:
Post a Comment