This document introduces an architecture that makes database tables suitable for also keeping a history of changes. This is done by introducing eight attributes which are named identically throughout all tables.
The design of this architecture gave me the most controversial feedback – see the section 'Pros and Cons of this design' ('Do you get commission from hard-disk manufacturers?') below.
Example from a running application:

This scheme shows the change of two attributes (Sales_Price, Sales_Currency) of a 'Product' over a specific period of time. In addition to the changing attributes and the common attributes, the user-specific key (Brand_Code, Product_Code) is listed. The user-specific key is the unique identifier of an object (a 'Product' in this example) that is known to a user of the application.
Common Attributes

ObjectID (DOUBLE)
Surrogate of an object which is known to a user by the 'user-known unique identifier' (Brand_Code and Product_Code in the above example). The ObjectID stays the same over time and 'ties together' all datasets of that object. Technically, it is the DataSetID of the first dataset that was entered for a certain 'user-known unique identifier'.

DataSetID (DOUBLE)
Primary key for the dataset, as required by the database system. This attribute is generated by a random-number generator, because most database systems reward a widely spread spectrum of primary-key values with better performance.

ValidFrom (DATE)
Date from which on a change of an attribute is valid. At the time of writing this document (July 2006), the design was that a change is valid from midnight on. The exact time (down to 1/1000 of a second) is recorded in the attribute ChangedAt.

ValidTill (DATE)
Date until which the previous value of the changed attribute was valid.

ChangedBy (VARCHAR(254))
User who initiated the change of an attribute.

ChangedAt (TIMESTAMP)
Timestamp of when an attribute was changed.

CreatedBy (VARCHAR(254))
User who entered the first dataset for a certain 'user-known unique identifier'.

CreatedAt (TIMESTAMP)
Timestamp of when the first dataset for a certain 'user-known unique identifier' was entered.

CreatedBy and CreatedAt are redundant, as they could be reproduced from the history of changes. They are carried forward in all datasets because it is sometimes required to know who entered (e.g.) a certain product or customer and when the product or customer was entered. A sketch of a table carrying these attributes follows below.
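As a rough illustration of how these attributes could sit in a table, here is a minimal sketch using the 'Product' example from above, written with SQLite and Python purely for demonstration; the original application may use a different database system, and the helper for the random DataSetID as well as the concrete values are assumptions, not the original code.

    import random
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Product (
            -- the eight common attributes, named identically in every table
            ObjectID   DOUBLE,              -- surrogate, stays the same over time
            DataSetID  DOUBLE PRIMARY KEY,  -- random primary key per dataset
            ValidFrom  DATE,                -- change is valid from midnight of this date
            ValidTill  DATE,
            ChangedBy  VARCHAR(254),
            ChangedAt  TIMESTAMP,
            CreatedBy  VARCHAR(254),
            CreatedAt  TIMESTAMP,
            -- user-known unique identifier of the example
            Brand_Code    VARCHAR(254),
            Product_Code  VARCHAR(254),
            -- changing attributes of the example
            Sales_Price    DOUBLE,
            Sales_Currency VARCHAR(254)
        )
    """)

    def new_dataset_id():
        # Assumed helper: DataSetID is drawn from a random-number generator
        # so that the primary-key values are spread over a wide range.
        return random.uniform(1, 1e15)

    # First dataset for a new product: ObjectID is set to the DataSetID of
    # this first dataset and is carried unchanged into every later dataset.
    first_id = new_dataset_id()
    conn.execute(
        "INSERT INTO Product VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
        (first_id, first_id,
         "2006-07-01", None,                      # ValidFrom, ValidTill (still open)
         "some_user", "2006-07-01 08:00:00.000",  # ChangedBy, ChangedAt
         "some_user", "2006-07-01 08:00:00.000",  # CreatedBy, CreatedAt
         "BRAND_X", "PROD_001",                   # user-known unique identifier
         9.90, "EUR"),                            # Sales_Price, Sales_Currency
    )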
Pros and Cons of this design, or 'Do you get commission from hard-disk manufacturers?'

Pros:
The history is easy to write. As each database table contains its own history, no special design for a log file is required. As the demand for a fault-proof log grows, a database system (and an application with well-designed 'commit' points) is ideal, because the 'commit' logic enforces an 'all or nothing' strategy and helps to avoid inconsistent data.

The history is easy to read. The history of an object can be retrieved easily and quickly, either by a 'select' using the ObjectID or by using the attributes that form the 'user-known unique identifier' (a sketch for both points follows below).

Economic reasons (make or buy). As hard-disk prices decrease rapidly and wages increase (slowly, but they do ;-) ), I chose a solution that was easy to implement, so that I could use my time to push other areas of the project.
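To make the first two points a bit more concrete, here is a continuation of the hypothetical SQLite sketch from the 'Common Attributes' section; whether the previously valid dataset is closed exactly like this (by setting its ValidTill) is my assumption, and all statements and values are illustrative, not the original application's code.

    # Writing a price change as one transaction: the currently valid dataset
    # is closed and a new dataset with the same ObjectID is inserted; the
    # commit logic keeps this 'all or nothing'.
    with conn:  # the sqlite3 connection commits or rolls back as a whole
        new_id = new_dataset_id()
        conn.execute(
            "UPDATE Product SET ValidTill = ? "
            "WHERE ObjectID = ? AND ValidTill IS NULL",
            ("2006-07-15", first_id),
        )
        conn.execute(
            "INSERT INTO Product VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
            (first_id, new_id,
             "2006-07-15", None,                      # new value valid from this date
             "some_user", "2006-07-15 10:23:45.123",  # ChangedBy, ChangedAt
             "some_user", "2006-07-01 08:00:00.000",  # CreatedBy/CreatedAt carried forward
             "BRAND_X", "PROD_001",
             10.90, "EUR"),
        )

    # Reading the history of the object, either by the ObjectID ...
    by_object_id = conn.execute(
        "SELECT ValidFrom, ValidTill, Sales_Price, Sales_Currency "
        "FROM Product WHERE ObjectID = ? ORDER BY ValidFrom",
        (first_id,),
    ).fetchall()

    # ... or by the attributes that form the 'user-known unique identifier'.
    by_user_key = conn.execute(
        "SELECT ValidFrom, ValidTill, Sales_Price, Sales_Currency "
        "FROM Product WHERE Brand_Code = ? AND Product_Code = ? "
        "ORDER BY ValidFrom",
        ("BRAND_X", "PROD_001"),
    ).fetchall()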
Cons:
The space requirement is higher than with a log file. Yes, that is true. But there are several advantages to the above system ;-) .
'Do you get commission from hard-disk manufacturers?', or experiences after 6 years:
The space did not explode. In the first week I monitored the growth of the database very often. I was glad that it did not explode; the growth was more or less caused by the day-to-day business of taking orders and writing delivery notices and invoices.
Master data makes up only 20 % of all data. Master data (for clients, delivery addresses, products) accounts for only about 20 % of all data, so the necessary changes do not contribute much to the growth caused by the strategy of keeping the history of objects in the database tables. Master data changes less than I expected; the changes are limited to an occasional change of a client's address or phone number, correcting a typo, or (most of all) a change of the purchase or sales price of a product.
Dynamic data is very rarely changed. Dynamic data (orders, delivery notices, invoices) makes up the mass of the stored data. As far as I could see, corrections that have to be recorded in the history are very infrequent. These corrections are mainly that a customer cancels an order, or that the quantity of an order position was typed wrong and had to be corrected.
The worst case: Quantity on Stock. The most frequently read and updated database table is the one with the quantity on stock. As there was the requirement to find, for each product, the order, delivery notice and invoice in which it appears, the ObjectID of the order position is stored together with the quantity that went from stock to the order. In exchange, there are no CreatedBy, CreatedAt, ChangedBy and ChangedAt attributes in the 'Quantity on Stock' table. These values have to be retrieved from the 'Order-Position' table when the history of the quantity on stock of a product is requested (a query sketch follows below).
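A query sketch of such a lookup, with all table and column names assumed for illustration (Quantity_on_Stock, Order_Position, OrderPosition_ObjectID, Product_ObjectID and Quantity are not the original schema): the stock table references the order position by its ObjectID, and who changed the stock, and when, is read from the 'Order-Position' table.

    # Query sketch only; the table and column names are illustrative.
    STOCK_HISTORY_SQL = """
        SELECT s.Quantity, s.ValidFrom, s.ValidTill,
               op.ChangedBy, op.ChangedAt, op.CreatedBy, op.CreatedAt
        FROM   Quantity_on_Stock AS s
        JOIN   Order_Position    AS op
               ON op.ObjectID = s.OrderPosition_ObjectID
        WHERE  s.Product_ObjectID = :product_object_id
        ORDER  BY s.ValidFrom
    """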
I have not become rich from the commission of hard-disk manufacturers – unfortunately. But I think the system has proven itself, although there are some areas (e.g. the quantity on stock) where it has to be adapted.
Related Documents: