Home > Web Tips, webdeveloper > MySQL: tracking row creation and row change events with timestamps

MySQL: tracking row creation and row change events with timestamps

The case for separate Creation and Change properties

 
 

Sometimes when modeling a database entity – a table – you might wish to automatically keep track of when the tuple – the row – was created. In theory this information is rather technical in nature, being synthetically produced by the coder to aid in housekeeping tasks such as:

 
 

  1. Knowing if you are trying to store the latest version of a row in concurrent write operations. Optimistic concurrency control, in which you always assume you are writing the latest version – unless the current timestamp is newer – is useful here. Actually known as “backward oriented”.
  2. Knowing which of several rows was written to the database first. Usually – if you use an AUTO_INCREMENT integer for your primary key, you can use that to know which row predates another row. If you are using a natural key for your primary key then a column “creation_time” or “created_at” could be useful. Perhaps you want to compare Dubletten and use the creation time to help judge which of the doubles to discard.

 
 

 
 

Ways to store event times in MySQL

MySQL offers several data type for you to record when something happened:

 
 

  1. You can store the Unix Time in a signed integer column.
  2. You can store the system time in a DateTime column.
  3.  
     

This article will talk about using the timestamp data type.

 
 

A word of warning about the supported date range

 
 

The MySQL timestamp is a strange beast indeed. It can only store DateTime values within the Unix epoch:

 
 

The TIMESTAMP data type has a range of ‘1970-01-01 00:00:01′ UTC to ‘2038-01-19 03:14:07′ UTC. It has varying properties, depending on the MySQL version and the SQL mode the server is running in.

 
 

From <http://dev.mysql.com/doc/refman/5.1/en/datetime.html>

 
 

Whereas for the DateTime data type:

 
 

supported range is ‘1000-01-01 00:00:00′ to ‘9999-12-31 23:59:59′.

 
 

From <http://dev.mysql.com/doc/refman/5.1/en/datetime.html>

 
 

And some good ON UPDATE news

 
 

But it does have some advantages: it offers automatic initialization and is automatically updated when *any* of the columns in the table are updated.

 
 

The SQL Syntax to create a table with a timestamp which takes the current time as its default value, and is auto updated when any of the fields in the tuple change is:

 
 

CREATE TABLE t (changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP);

 
 

From <http://dev.mysql.com/doc/refman/5.1/en/timestamp.html>

 
 

The way to get two columns, with the first one recording the creation time and the second column recording the rows last updated time is to define two timestamp columns, but to set the first one to explicitly record ‘0000-00-00 00:00:00′ (YYYY-MM-DD HH:MM:SS).

 
 

created_at timestamp NOT NULL default ‘0000-00-00 00:00:00′

changed_at timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP

 
 

From <http://dev.mysql.com/doc/refman/5.1/en/timestamp.html>

 
 

But “DEFAULT CURRENT_TIMESTAMPis a gotcha too!

 
 

Here you might come across the idea of setting the default value for the column created_at to CURRENT_TIMESTAMP too:

 
 

created_at timestamp NOT NULL default CURRENT_TIMESTAMP

changed_at timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP

 
 

but that will cause the following error:

 
 

Incorrect table definition; there can be only one TIMESTAMP column with CURRENT_TIMESTAMP in DEFAULT or ON UPDATE clause

 
 

The manual says why, in a round about kind of way:

 
 

For one TIMESTAMP column in a table, you can assign the current timestamp as the default value and the auto-update value.

 
 

From <http://dev.mysql.com/doc/refman/5.1/en/timestamp.html>

 
 

A working solution, but still room for improvement

 
 

Never mind. If you don’t mind inserting a NULL explicitly into the created_at field, this is the code which works, setting the creation time on creation only and setting the changed timestamp at every row update:

Create a database and a table with the timestamp definitions as discussed above:

 
 

DROP DATABASE IF EXISTS `rq_mysqltest2`;

CREATE DATABASE IF NOT EXISTS `rq_mysqltest2` CHARACTER SET utf8 ;

USE `rq_mysqltest2`;

 
 

CREATE TABLE IF NOT EXISTS `rq_mysqltest2`.`ts`(

`something` varchar(255) NOT NULL

,`created_at` timestamp NOT NULL DEFAULT ‘0000-00-00 00:00:00′

,`changed_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

)

;

 
 

Now insert some rows, trying different ways of setting some fields explicitly to null, or ignoring them:

 
 

USE `rq_mysqltest2`.`rq_mysqltest2`;

TRUNCATE TABLE `ts`;

INSERT INTO `ts`

(`something`, `created_at`,`changed_at`)

VALUES

(’st1′,null,null)

,(’st2′,null,null)

;

 
 

INSERT INTO `ts`

(`something`, `created_at`)

VALUES

(’st3′,null)

,(’st4′,now())

,(’st5′,null)

;

 
 

INSERT INTO `ts`

(`something`, `changed_at`)

VALUES

(’st6′,null)

,(’st7′,now())

;

INSERT INTO `ts`

(`something`)

VALUES

(’st8′)

,(’st9′)

,(’st10′)

;

 
 

This gives us the following result set:

 
 

SELECT * FROM rq_mysqltest2.ts t;

something created_at changed_at
st1

13.10.2009 11:26

13.10.2009 11:26

st2

13.10.2009 11:26

13.10.2009 11:26

st3

13.10.2009 11:26

13.10.2009 11:26

st4

13.10.2009 11:26

13.10.2009 11:26

st5

13.10.2009 11:26

13.10.2009 11:26

st6 0000-00-00 00:00:00

13.10.2009 11:26

st7 0000-00-00 00:00:00

13.10.2009 11:26

st8 0000-00-00 00:00:00

13.10.2009 11:26

st9 0000-00-00 00:00:00

13.10.2009 11:26

st10 0000-00-00 00:00:00

13.10.2009 11:26

 
 

Which shows quite clearly:

 
 

  1. You don’t need to set a value for the changed_at column, omitting it from the insert clause will still timestamp the row.
  2. You must set a value – even NULL – for the created_at column, otherwise the default of 0000 is inserted. Which, by the way, is an illegal value.

 
 

An improvement using triggers

 
 

If I were a database admin, or the development lead on a lamp project, this scenario would not meet my approval. There is just too much opportunity for error here. I am likely to want to rely on the accuracy of the data in these fields – and running into rows where the created_at field is set to 0 because a dev forgot to explicitly set the column to NULL during an insert is just all too easy.

 
 

A solution which does not require the insert statement to explicitly set the field to null can be achieved by using a trigger. Triggers are just what you might guess. A row level event – an Insert, a Delete or an Update, causes the appropriate trigger to fire.

 
 

Here is the second version, using a trigger to set the created_at field no matter whether or not you referenced it in your insert statement:

 
 

DROP DATABASE IF EXISTS `rq_mysqltest2`;

CREATE DATABASE IF NOT EXISTS `rq_mysqltest2` CHARACTER SET utf8 ;

USE `rq_mysqltest2`;

 
 

CREATE TABLE IF NOT EXISTS `ts`(

`something` varchar(255) NOT NULL

,`created_at` timestamp NOT NULL DEFAULT ‘0000-00-00 00:00:00′

,`changed_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

)

;

 
 

DROP TRIGGER IF EXISTS `t_ts_b_ins`;

DELIMITER $$

create trigger `t_ts_b_ins` before insert on `ts`

for each row begin

set NEW.created_at = CURRENT_TIMESTAMP;

end

$$

DELIMITER ;

 
 

So this trigger, defined on our table, fires before a new row is inserted, and sets the value of the created_at field to the current_timestamp – regardless of what you choose to use in the insert statement.

 
 

Let’s use the following insert statements this time:

 
 

TRUNCATE TABLE `ts`;

INSERT INTO `ts`

(`something`, `created_at`,`changed_at`)

VALUES


(’st1′,’1976-12-25 12:25:48′,null)

,(’st2′,null,null)

;

 
 

INSERT INTO `ts`

(`something`, `created_at`)

VALUES

(’st3′,null)

,(’st4′,now())

,(’st5′,null)

;

 
 

INSERT INTO `ts`

(`something`, `changed_at`)

VALUES

(’st6′,null)

,(’st7′,now())

;

INSERT INTO `ts`

(`something`)

VALUES

(’st8′)

,(’st9′)

,(’st10′)

;

 
 

And we get the following result set:

 

something created_at changed_at
st1 2009-10-13 11:44:26 2009-10-13 11:44:26
st2 2009-10-13 11:44:26 2009-10-13 11:44:26
st3 2009-10-13 11:44:26 2009-10-13 11:44:26
st4 2009-10-13 11:44:26 2009-10-13 11:44:26
st5 2009-10-13 11:44:26 2009-10-13 11:44:26
st6 2009-10-13 11:44:26 2009-10-13 11:44:26
st7 2009-10-13 11:44:26 2009-10-13 11:44:26
st8 2009-10-13 11:44:26 2009-10-13 11:44:26
st9 2009-10-13 11:44:26 2009-10-13 11:44:26
st10 2009-10-13 11:44:26 2009-10-13 11:44:26

 
 

As you can see, now the created_at field is always set regardless of how the insert statements are coded. Updating the something column with the following statement:

 
 

update rq_mysqltest2.ts

set something = CONCAT(something, ‘-new’)

;

 
 

works as expected, giving us a result set with an automatically updated changed_at column:

 
 

something created_at changed_at
st1-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st2-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st3-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st4-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st5-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st6-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st7-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st8-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st9-new 2009-10-13 11:44:26 2009-10-13 11:53:32
st10-new 2009-10-13 11:44:26 2009-10-13 11:53:32

 
 

As you can see, the changed_at column reflects a newer timestamp than the created_at column, indicating that these rows have been changed.

 
 

But that still leaves us with an unsresolved problem.

 
 

The final version, accounting for attempted vandalism

 
 

If we were running some time critical application, perhaps wanting to prevent malicious users or devious programmers from manipulating our data, we would have a small problem.

 
 

The following SQL statement resets the creation time to the current timestamp:

 
 

update rq_mysqltest2.ts

set created_at = CURRENT_TIMESTAMP

;

something created_at changed_at
st1-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st2-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st3-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st4-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st5-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st6-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st7-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st8-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st9-new 2009-10-13 12:02:40 2009-10-13 12:02:40
st10-new 2009-10-13 12:02:40 2009-10-13 12:02:40

 
 

Very bad indeed.

 
 

One solution would be to deny all applications and users, except the trusted, the right to update rows in the table. Another solution would be to add a further trigger, giving it the job of negating any updates to the created_at column:

 
 

DROP DATABASE IF EXISTS `rq_mysqltest2`;

CREATE DATABASE IF NOT EXISTS `rq_mysqltest2` CHARACTER SET utf8 ;

USE `rq_mysqltest2`;

 
 

CREATE TABLE IF NOT EXISTS `ts`(

`something` varchar(255) NOT NULL

,`created_at` timestamp NOT NULL DEFAULT ‘0000-00-00 00:00:00′

,`changed_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

)

;

 
 

DROP TRIGGER IF EXISTS `t_ts_b_ins`;

DELIMITER $$

create trigger `t_ts_b_ins` before insert on `ts`

for each row begin

set NEW.created_at = CURRENT_TIMESTAMP;

end

$$

DELIMITER ;

 
 

DROP TRIGGER IF EXISTS `t_ts_b_up`;

DELIMITER $$

create trigger `t_ts_b_up` BEFORE update on `ts`

for each row begin

set NEW.created_at = OLD.`created_at`;

set NEW.changed_at = CURRENT_TIMESTAMP;

end

$$

DELIMITER ;

 
 

TRUNCATE TABLE `ts`;

INSERT INTO `ts`

(`something`, `created_at`,`changed_at`)

VALUES

(’st1′,’1976-12-25 12:25:48′,null)

,(’st2′,null,null)

;

 
 

INSERT INTO `ts`

(`something`, `created_at`)

VALUES

(’st3′,null)

,(’st4′,now())

,(’st5′,null)

;

 
 

INSERT INTO `ts`

(`something`, `changed_at`)

VALUES

(’st6′,null)

,(’st7′,now())

;

INSERT INTO `ts`

(`something`)

VALUES

(’st8′)

,(’st9′)

,(’st10′)

;

 
 

This time the explicit update clause on the created_at column has no effect, and yet we still see the timestamp for the attempted modification:

 
 

 
 

The following SQL statement resets the creation time to the current timestamp:

 
 

update rq_mysqltest2.ts

set created_at = CURRENT_TIMESTAMP

;

 
 

And as you can see, the attempted vandalism is unsuccessful, the created_at column retains its initial value:

 
 

something created_at changed_at
st1 2009-10-13 12:07:40 2009-10-13 12:10:15
st2 2009-10-13 12:07:40 2009-10-13 12:10:15
st3 2009-10-13 12:07:40 2009-10-13 12:10:15
st4 2009-10-13 12:07:40 2009-10-13 12:10:15
st5 2009-10-13 12:07:40 2009-10-13 12:10:15
st6 2009-10-13 12:07:40 2009-10-13 12:10:15
st7 2009-10-13 12:07:40 2009-10-13 12:10:15
st8 2009-10-13 12:07:40 2009-10-13 12:10:15
st9 2009-10-13 12:07:40 2009-10-13 12:10:15
st10 2009-10-13 12:07:40 2009-10-13 12:10:15

 
 

Conclusion

 
 

Wanting to store two timestamps in a single MySQL table to capture the times of the events “row creation time” and “row update time” is not as easy as one might hope.

 
 

MySQL will only automatically update a single timestamp column.

MySQL will also only set a sensible default value if

You define a CURRENT_TIMESTAMP as default

Or if you define the default to be ‘0000-00-00 00:00:00′ AND you explicitly set this field to NULL during an insert.

 
 

Robust solutions can be created by using triggers. Additional data security can be built in using further triggers, preventing malicious modification of the created_at values.

 
 

But: there is still even more room for improvement:

  • We could ditch the timestamp fields altogether, instead using triggers and DateTime columns.
  • We could elect to update the changed_at column only if the “something” column was updated.
  • We could track the number of changes and the nature of each change to a shadow table.
  • We could move the extra baggage of the tracking columns out of the main table and into a tracking table.

 
 

And some more still!

 

  • Share/Bookmark
KategorienWeb Tips, webdeveloper Tags:
  1. Bisher keine Kommentare
  1. Bisher keine Trackbacks