Database Administrators Asked by pbarney on October 28, 2021
This question is about possible approaches for time stamping every column of every row in a table.
I have a table with a small number of rows (fewer than 200 and not likely to grow much larger). Each row refers to an individual entity and each column corresponds to a piece of data about that entity. The columns will be updated whenever new information becomes available, and I need to record when that happens.
For example, think in terms of a contacts table:
CREATE TABLE `contacts` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(100),
`address` VARCHAR(100),
`phone_number` VARCHAR(100),
`email_address` VARCHAR(100),
`updated` TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
);
With a single `updated` column, if the phone number is updated this month, we might be inclined to believe that the `email_address` was updated as well, even though it may be severely outdated.
I can think of three approaches:
1. Create an additional timestamp column for each column:
CREATE TABLE `contacts` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(100),
`name_updated` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`address` VARCHAR(100),
`address_updated` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`phone_number` VARCHAR(100),
`phone_number_updated` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`email_address` VARCHAR(100),
`email_address_updated` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
);
2. Or, I could create a secondary table that contains the same information:
CREATE TABLE `contacts_updated` (
`id` INT,
`name` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`address` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`phone_number` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`email_address` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
CONSTRAINT `fk_contacts_updated_contacts_id`
FOREIGN KEY (id)
REFERENCES contacts (id));
In which case, is there an easy way to duplicate a table and automatically change the column types to timestamps? And use triggers to update the secondary table?
3. Or, I could create an Entity-Attribute-Value table:
CREATE TABLE `contacts_updated` (
`id` INT,
`column_name` VARCHAR(15),
`updated` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`, `column_name`)
);
…but this seems like it would be more difficult to query the information.
Are there other options that would be better? And I also assume that the answer will be different for a small dataset vs. a large one, so I’d love to know your opinion on both cases.
I love this question because I answered another question like it 7 years ago (June 30, 2013): Get the Time of Last update of a column. You actually wrote the SQL I had suggested. I am impressed.
IMHO I would go with Suggestion #3 because its implementation is as normalized as you are going to get. The retrieval of the deltas for any column would consume the least amount of memory in the InnoDB Buffer Pool, memory within the DB Connection, and the smallest footprint on disk.
The other suggestions would have so much redundant data that retrieving the delta info would most likely push your working dataset out of the InnoDB Buffer Pool frequently.
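As a rough illustration of how Suggestion #3 might be maintained and queried, here is a minimal sketch in MySQL/MariaDB syntax. It assumes that `contacts_updated` is keyed on (`id`, `column_name`) so that each contact can have one row per tracked column. The trigger name `contacts_after_update` is made up for this example.

```sql
-- Sketch only: assumes contacts_updated has PRIMARY KEY (id, column_name).
DELIMITER //

CREATE TRIGGER contacts_after_update
AFTER UPDATE ON contacts
FOR EACH ROW
BEGIN
  -- <=> is the NULL-safe equality operator, so a change from or to NULL
  -- is also detected.
  IF NOT (OLD.name <=> NEW.name) THEN
    INSERT INTO contacts_updated (id, column_name, updated)
    VALUES (NEW.id, 'name', CURRENT_TIMESTAMP)
    ON DUPLICATE KEY UPDATE updated = CURRENT_TIMESTAMP;
  END IF;

  IF NOT (OLD.address <=> NEW.address) THEN
    INSERT INTO contacts_updated (id, column_name, updated)
    VALUES (NEW.id, 'address', CURRENT_TIMESTAMP)
    ON DUPLICATE KEY UPDATE updated = CURRENT_TIMESTAMP;
  END IF;

  IF NOT (OLD.phone_number <=> NEW.phone_number) THEN
    INSERT INTO contacts_updated (id, column_name, updated)
    VALUES (NEW.id, 'phone_number', CURRENT_TIMESTAMP)
    ON DUPLICATE KEY UPDATE updated = CURRENT_TIMESTAMP;
  END IF;

  IF NOT (OLD.email_address <=> NEW.email_address) THEN
    INSERT INTO contacts_updated (id, column_name, updated)
    VALUES (NEW.id, 'email_address', CURRENT_TIMESTAMP)
    ON DUPLICATE KEY UPDATE updated = CURRENT_TIMESTAMP;
  END IF;
END//

DELIMITER ;

-- Reading one column together with the time it last changed.
-- The LEFT JOIN keeps contacts whose phone number has never been updated.
SELECT c.id,
       c.phone_number,
       u.updated AS phone_number_updated
FROM contacts AS c
LEFT JOIN contacts_updated AS u
       ON u.id = c.id
      AND u.column_name = 'phone_number';
```

Because the trigger only writes a row when a value actually changes, an `UPDATE` that touches only `phone_number` leaves the recorded time for `email_address` alone.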
Surprisingly, there is an alternative.
MariaDB has implemented all the above suggestions into a table metadata framework called a Temporal Table. What you are going to implement falls under the category "System-Versioned", which tracks the Change History of all table rows. The best part about "System-Versioned Temporal Tables" is that you do not need to explicitly create and track timestamps. That's done for you.
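For reference, a minimal sketch of what this could look like, assuming MariaDB 10.3 or later and the `contacts` table from the question. The date literal, the `id = 1` filter and the aliases `valid_from`, `prev_phone_number` and `v` are placeholders chosen for this example.

```sql
-- Sketch only: requires MariaDB 10.3+ (system versioning is not in MySQL).
CREATE TABLE contacts (
  id INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(100),
  address VARCHAR(100),
  phone_number VARCHAR(100),
  email_address VARCHAR(100),
  PRIMARY KEY (id)
) WITH SYSTEM VERSIONING;

-- All stored versions of one contact, with the period each was valid.
-- ROW_START and ROW_END are pseudo-columns maintained by the server.
SELECT id, phone_number, email_address, ROW_START, ROW_END
FROM contacts FOR SYSTEM_TIME ALL
WHERE id = 1
ORDER BY ROW_START;

-- The contact as it looked at a given point in time.
SELECT *
FROM contacts FOR SYSTEM_TIME AS OF TIMESTAMP '2021-01-01 00:00:00'
WHERE id = 1;

-- History is kept per row, not per column, so finding when one column
-- last changed means comparing each version with the previous one.
SELECT MAX(v.valid_from) AS phone_number_updated
FROM (
  SELECT ROW_START AS valid_from,
         phone_number,
         LAG(phone_number) OVER (ORDER BY ROW_START) AS prev_phone_number
  FROM contacts FOR SYSTEM_TIME ALL
  WHERE id = 1
) AS v
WHERE NOT (v.phone_number <=> v.prev_phone_number);
```

An existing table can be converted with `ALTER TABLE contacts ADD SYSTEM VERSIONING;` instead of recreating it.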
The only drawbacks I can think of are the following:
Answered by RolandoMySQLDBA on October 28, 2021