Hi Everybody,
We use database mirroring a lot in our product solutions, and we have recently seen strange behaviour in our failover tests with SQL Server 2008 R2.
We have two servers running Windows Server 2008 R2 Standard and SQL Server 2008 R2 Standard SP2 (let's call them DB1 and DB2).
We also have a witness workstation running SQL Server 2008 Express on Windows 7.
A database on DB1 is mirrored to DB2 in "safety full" mode, with a witness. At this stage, the database is principal on DB1 and mirror on DB2.
To test automatic failover, we first restart the DB1 server, which hosts the principal database.
After a few seconds, the database on DB2 becomes principal, which is exactly what we want.
After a few minutes, DB1 comes back online and its database takes the mirror role (still OK). At this stage, the database is principal on DB2 and mirror on DB1.
When the monitoring application shows that the mirror is synchronized and that both servers are connected to the witness, we restart DB2 to trigger an automatic failover to DB1.
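For reference, the conditions checked before restarting DB2 can be written down as a small sketch. It assumes the values are read from the sys.database_mirroring catalog view on each partner; the helper name failover_blockers and the dict-shaped row are purely illustrative and not part of our actual monitoring application.

```python
# Query to run on each partner. The column names come from the
# sys.database_mirroring catalog view; 'Test123' is the database in this test.
STATE_QUERY = """
SELECT mirroring_role_desc,
       mirroring_state_desc,
       mirroring_safety_level_desc,
       mirroring_witness_state_desc
FROM sys.database_mirroring
WHERE database_id = DB_ID('Test123');
"""


def failover_blockers(row):
    """Return the reasons why automatic failover would not be expected.

    `row` is a hypothetical dict holding one result row of STATE_QUERY.
    An empty list means the three documented preconditions are met:
    full safety, synchronized database, witness connected.
    """
    blockers = []
    if row.get("mirroring_safety_level_desc") != "FULL":
        blockers.append("safety level is not FULL")
    if row.get("mirroring_state_desc") != "SYNCHRONIZED":
        blockers.append("database is not SYNCHRONIZED")
    if row.get("mirroring_witness_state_desc") != "CONNECTED":
        blockers.append("witness is not CONNECTED")
    return blockers
```

Each partner reports its own view of the witness connection, so the row from DB1 and the row from DB2 both need to come back with an empty list before the restart.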
What we see is that DB1 never takes the principal role and the database stays in the mirror role.
In the DB1 error log, I only see these two lines when DB2 disappears, with no other message related to the mirroring session:
2014-01-22 08:57:26.91 spid43s Starting up database 'Test123'.
2014-01-22 08:57:26.95 spid43s Bypassing recovery for database 'Test123' because it is marked as a mirror database, which cannot be recovered. This is an informational message only. No user action is required.
When DB2 comes back online, its database keeps the principal role and the database on DB1 stays mirror.
What is really strange is that if I restart DB2 once again, directly after that, DB1 fails over normally and the database on DB1 takes the principal role after a few seconds, without any configuration change between the two restarts.
The DB1 error log then shows:
2014-01-22 09:00:37.53 spid29s Error: 1474, Severity: 16, State: 1.
2014-01-22 09:00:37.53 spid29s Database mirroring connection error 4 'An error occurred while receiving data: '64(The specified network name is no longer available.)'.' for 'TCP://DB2:5022'.
2014-01-22 09:00:37.53 spid18s Database mirroring is inactive for database 'Test123'. This is an informational message only. No user action is required.
2014-01-22 09:00:42.37 spid32s The mirrored database "Test123" is changing roles from "MIRROR" to "PRINCIPAL" due to Auto Failover.
2014-01-22 09:00:42.39 spid32s Recovery is writing a checkpoint in database 'Test123' (7). This is an informational message only. No user action is required.
2014-01-22 09:00:42.39 spid32s Recovery completed for database Test123 (database ID 7) in 78 second(s) (analysis 0 ms, redo 0 ms, undo 7 ms.) This is an informational message only. No user action is required.
So, if I summarize,
- a first failover from DB1 to DB2 always works
- then, a restart of DB2 never fails over to DB1
- a second restart of DB2 always fails over to DB1
This is pretty much systematic on one of our server pairs.
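To compare the DB1 error log from the failed attempt with the one from the successful attempt, a small script along these lines can pull out the mirroring-related entries. It is only a sketch: the function names are my own, and it assumes entries begin with a timestamp followed by a spid token, as in the excerpts above (entries logged by other sources are not split out).

```python
import re

# Start of an errorlog entry as seen in the excerpts above,
# e.g. "2014-01-22 09:00:37.53 spid29s".
ENTRY_START = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2} spid\d+s?")


def split_entries(text):
    """Split errorlog text into entries, even when several share one line."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def mirroring_entries(text):
    """Keep only the entries that mention mirroring (case-insensitive)."""
    return [entry for entry in split_entries(text) if "mirror" in entry.lower()]
```

Running mirroring_entries on the log text captured around each DB2 restart gives two short lists that can be compared side by side.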
Does anyone have an explanation for this, or any idea where I can look to find the cause of this strange behaviour?
Thanks a lot for your help
Seb