Blog ala Vidar

SQL, AppFrame and other cool technologies

Monthly Archives: December 2007

Replication in SQL Server

Leave a comment Posted by Vidar Nordnes on December 1, 2007

Tuesday we had pizza & learning in Ølen. I held the session, and the topic was replication in SQL Server. We even had guests from a company called Quantum Solutions. Some of you might know Roar or Erik from before.
Replication is a BIG topic. We used about three hours on it, but we could have used three years on it. It’s HUGE! In addition to replication, I added Database Snapshots, Database Mirroring and Log Shipping, because I feel this is very relevant in Omega.

Database Snapshots
This is a “point in time” snapshot of a database. The purpose of this is to have a read only database to run reports, or for quick backup. Say, for instance you are going to make significant changed in a table. Delete 100000 rows or similar. For insurance you take a backup of your database before you run the query. If something fails, you have to restore the database. If we’re talking about AppFrameR2 (Omegas main database), backup plus restore takes about 1.5 hours. For 1 delete statement?! Instead of this, you could create a snapshot before you run the query, and if the query fails, you can just revert to the snapshot. When you make changes to a database with a snapshot attached, the changes will only be done on the main database. The original pages of the changes will be copied to the snapshot. This is why it’s called a point in time snapshot. If you’re going to revert, you can only have ONE snapshot. If not, you can have several.

Database Mirroring
Database mirroring is exactly what the name says: You have one database with a mirror database. The mirror database is not accessible before you change the roles of the databases. There are three different roles in database mirroring; principal, mirror and witness. Principal is the main database, where all users are connected and changes are done. Mirror is a “slave” database which gets transactions from the principal. The purpose of this is failover. If you have a witness server, you can change the roles. The principal can then be mirror, and vice versa. You can then take down the old principal (now mirror), run upgrades, add hardware etc, without the database to actually go down. When the database is up and running again, it will be synchronized with the running principal (old mirror), and when that’s done, the old mirror will be set as mirror again, and the old principal will be set to principal again. To be able to read from a mirror, you have to make a snapshot of it. It is not possible to access a mirror. Not even in management studio, right-clicking and properties.
The witness server is used for automatically failover. If the principal is not accessible (hardware crash or similar), witness will set mirror as principal till principal is up and running again. To use database mirroring you must have two standard (or enterprise) editions of SQL Server. Developer does not work. The witness may run as express edition though. For communication between the servers, database mirroring uses endpoints.
There are three modes in database mirroring: High Availability, High Performance and High Protection.

High Availability commits synchronous transactions. If it commits on the mirror, it commits on the principal AFTER it commits on the mirror. If it rolls back on the mirror, it doesn’t run the transaction on the principal. This of course takes a bit more time than running a transaction on only one database.
High Performance does the opposite. It runs the transaction on the principal, and sends it asynchronous to the mirror if it commits on the principal.
High Protection is the same as High Availability except that it does not use a witness server. For failover, you must do it manually.

Log Shipping
Log shipping is sort of like database mirroring, except it’s not running every transaction over to the mirror. In log shipping the roles are called primary, secondary and monitor. The primary server takes a backup of the transaction log, copies it to a network share. Then the secondary (you can have multiple secondary servers) copies it from the network share to a local folder, and then restores the transaction log. The monitor’s responsibility is to keep track of the process and raise alerts when the process fails.

Replication
Now over to what Microsoft calls replication. There are three different types of roles; Publisher, Distributor and Subscriber. The publisher is in most cases the only place you can do data modifications. This is the primary server. The distributor is a server which takes care of distributing changes to the subscribers, and a subscriber is the database that receives copies from the publisher (via the distributor). In some cases the subscriber is read-only. The reason for this is that many companies have the need of several read-only databases for reporting etc.
There are two types of subscription types. Push and pull. Push should be used when there’s a stable and permanent connection. The distributor copies data TO the subscriber. Pull subscription should be used when there’s not a stable connection. The subscriber then copies the data FROM the distributor.

There are four types of replication, Snapshot, Transactional, Peer-to-Peer and Merge.

Snapshot replication copies datasets and does not use the transaction log. This should only be used on small amounts and static data.
Transactional replication uses the transaction log and sends “bulks” of the transaction log.
Peer-to-Peer is the same as transactional except it all servers are publishers, distributors AND subscribers. This means that if you have 4 servers, all of these will send transactions to the other three servers.
Merge replication is like snapshot replication, except that all servers allows modifications. All servers are then both publisher and subscriber.