Showing posts with label Synchronized. Show all posts
Showing posts with label Synchronized. Show all posts

Automatic Page Repair With SQL Server Database Mirroring - Part II

Please have a look to the Part I for better understanding.

In our previous part, there was lots of steps to configure the Mirroring and Manual Failover thing. I hope you would have enjoyed that. Thanks for your time on that!!


Here we will be discussing about - How the corrupted Page will be repaired automatically by the Mirroring


Just recap about the environments participated on Mirroring. We have failover it again to being back to the below state







Let me make sure the Mirroring environments are in Sync

















Everything is in Sync


1. let me corrupt some Data Page of Principal Database "PrimaryDB1" (In PANDIAN\SQL2012)


2. Here is the Data Page we trying to corrupt

:CONNECT PANDIAN\SQL2012
SELECT %%LockRes%% [Page Allocation],* FROM PrimaryDB1.dbo.TABLE1

Ok. I try to corrupt the page id "232" (232 x 8192 = 1900544)

Disclaimer: Please don't try to corrupt the data page manually in any of the PRODUCTIONS Or other live Databases environments. It may lead to corrupt and non-operational the whole database If something goes wrong!!

3. let me use XVI32.exe tool to corrupt the Page

Actually, The plan is to open the data file(.mdf) in this tool in administrative mode to make the change on it's internal allocation bits


Let me get the File path of the Database

:CONNECT PANDIAN\SQL2012
USE PrimaryDB1
GO
Exec sp_helpfile





Normally, The database should be in OFFLINE to access the underlying Data/Log files.


But, We can't do that right now since the Database is in Mirroring. let's try once


:CONNECT PANDIAN\SQL2012

USE [Master]
GO
ALTER DATABASE PrimaryDB1 SET OFFLINE

Connecting to PANDIAN\SQL2012...

Msg 1468, Level 16, State 1, Line 10
The operation cannot be performed on database "PrimaryDB1" because it is involved in a database mirroring session or an availability group. Some operations are not allowed on a database that is participating in a database mirroring session or in an availability group.
Msg 5069, Level 16, State 1, Line 10
ALTER DATABASE statement failed.
Disconnecting connection from PANDIAN\SQL2012...

So, What to we do now ?


let me Stop the Mirror Now

:CONNECT PANDIAN\SQL2012
ALTER DATABASE PrimaryDB1 SET PARTNER SUSPEND



















Let me STOP the SQL Service of the Primary database Instance(PANDIAN\SQL2012)

4. Let's corrupt the page id (232) using the Tool (XVI32.exe)

Open the Tool in Administrative mode

File --> Open --> Locate the data file (.mdf)


Address --> Goto --> 1900544


Change some of the Bits


Save the File. Its' done. We just corrupted the Data page


START the SQL Service


5. Let me access the Table


:CONNECT PANDIAN\SQL2012

SELECT %%LockRes%% [Page Allocation],* FROM PrimaryDB1.dbo.TABLE1

Yeah. See, The Page ID: 232 got corrupted as shown below

Connecting to PANDIAN\SQL2012...
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xefd30b4e; actual: 0x67db0b46). It occurred during a read of page (1:232) in database ID 5 at offset 0x000000001d0000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\PrimaryDB1.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

let me check at the suspect pages system table 

:CONNECT PANDIAN\SQL2012
SELECT * FROM msdb.dbo.suspect_pages





See, The event_type column has 2 (Bad Checksum)

For detailed Info on msdb..suspect_pages

How do we know, The Mirroring repaired the corrupted page automatically Or Not ?

We have a system VIEW sys.dm_db_mirroring_auto_page_repair which will have a row for every automatic page-repair attempt on any mirrored database on the server instance

:CONNECT PANDIAN\SQL2012
SELECT * FROM sys.dm_db_mirroring_auto_page_repair






There is no records in this system View yet which means - Mirroring not started to repair the page yet!

Let me start the Mirror now and see...

:CONNECT PANDIAN\SQL2012
ALTER DATABASE PrimaryDB1 SET PARTNER RESUME

Wait for some time and Let the Mirroring repair the corrupted page in Principal Database

6. Let me access the corrupted Table again in Principal Database

:CONNECT PANDIAN\SQL2012
SELECT %%LockRes%% [Page Allocation],* FROM PrimaryDB1.dbo.TABLE1















It's done!!! - The corrupted page got repaired and The table back to operational

Let's make sure

:CONNECT PANDIAN\SQL2012
SELECT FROM msdb.dbo.suspect_pages





See, The event_type column has 5 now (Repaired)

:CONNECT PANDIAN\SQL2012

See, The page_status column has 5 (5 = Automatic page repair succeeded and the page should be usable)

What just happening when the page corrupted at Principal source ?

According to MSDN

a) When a read error occurs on a data page in the principal/primary database, the principal/primary inserts a row in the suspect_pages table with the appropriate error status. the principal then requests a copy of the page from the mirror.

b) The request specifies the page ID and the LSN that is currently at the end of the flushed log.

c) The page is marked as restore pending. This makes it inaccessible during the automatic page-repair attempt. Attempts to access this page during the repair attempt will fail with error 829 (restore pending)

d) After receiving the page request, the mirror/secondary waits until it has redone the log up to the LSN specified in the request. Then, the mirror/secondary tries to access the page in its copy of the database. If the page can be accessed, the mirror/secondary sends the copy of the page to the principal/primary. Otherwise, the mirror/secondary returns an error to the principal/primary, and the automatic page-repair attempt fails.

7. What will happen, If the Principal and Mirror versions are different ?

If I use SQL Server 2012 for Principal and SQL Server 2014 for Mirror


It can be able to failover from Principal to Mirror (2012 to 2014)


But, 


If I try to failover back 2014 to 2012, There will be a problem with below Err 


Error: 948, Severity: 20, State: 2.

The database 'PrimaryDB1' cannot be opened because it is version 782. This server supports version 706 and earlier. A downgrade path is not supported.

Here are the list of SQL Server Database Versions


Automatic Page Repair With SQL Server Database Mirroring - Part I

Do we need to worry about the page corruption ?

May be Yes! 

As you may know - Data are stored and maintained by the SQL Server as Page(s). Right ?

A page can have up to 8 KB of data (8192 Bytes)


When I try to fetch the data from a table, The SQL Server database engine traverse through the pages what are allocated to the tables data and flush the appropriate records to the client (Management Studio or request end-point). But,

If something goes wrong at internal structure which the system may not be able to traverse through the pages we requested because of the logical inconsistent among the pages - May end with an Error called Logical Inconstant with I/O operations!

But, The page corruption occurs in the various levels as follows
Data Pages
Index Pages
Text/Image
File header page (Page ID 0)
Page 9 (Database Boot Page)
Allocation Pages (GAM, SGAM, PFS, IAM, BCM and DCM)

For Detailed info., on each page types:
https://docs.microsoft.com/en-us/sql/relational-databases/pages-and-extents-architecture-guide?view=sql-server-ver15

The page corruptions can be repaired automatically using Mirroring or Always-On Availability Group

But, Some of the page corruptions can't be repaired automatically by Mirroring or Always-On Availability Group either

Ok. Let's enable Mirroring, try with Failover and corrupt some of the Data Page(s) in Principal database and see whether the automatic page repair works!

let's begin...

My Test Environments


Environment Details





I am going to use SQL Server 2012 Instance "PANDIAN\SQL2012" as Principal, SQL Server 2012 Instance "PANDIAN\SQL2012_1" as Mirror and SQL Server 2016 Instance "PANDIAN\SQL2016" as Witness.

Why I am using Same Version of SQL Server for Principal and Mirror ? let me tell you later

Let's change the SQL management studio session in "SQLCMD Mode" (Query menu --> SQLCMD Mode)

1. To Create a Sample database in Principal Instance
:CONNECT PANDIAN\SQL2012
CREATE DATABASE PrimaryDB1
GO

Connecting to PANDIAN\SQL2012...
Disconnecting connection from PANDIAN\SQL2012...

2. Creating sample table and records
:CONNECT PANDIAN\SQL2012
USE PrimaryDB1
GO
CREATE TABLE TABLE1(ID INT IDENTITY(1,1), NAMES VARCHAR(10))
GO
INSERT TABLE1(NAMES) VALUES('SQL'),('Server'),('SQL Server')
INSERT TABLE1(NAMES) VALUES('Mirroring'),('Failover'),('Testing')
INSERT TABLE1(NAMES) VALUES('Automatic'),('Page'),('Corruption'),('Manual')
GO

Connecting to PANDIAN\SQL2012...
(3 row(s) affected)
(3 row(s) affected)
(4 row(s) affected)
Disconnecting connection from PANDIAN\SQL2012...

3. Sample records from the Table
:CONNECT PANDIAN\SQL2012
USE PrimaryDB1
GO
SELECT * FROM TABLE1












4. Taking FULL Backup from Principal Instance
:CONNECT PANDIAN\SQL2012
BACKUP DATABASE [PrimaryDB1] TO  DISK = N'\\PANDIAN\Personal\MirroringPath\PrimaryDB1.Bak' 
WITH NOFORMAT, NOINIT, SKIP, NOREWIND, NOUNLOAD,  STATS = 10
GO

The shared path should be accessed by Mirror Instance

5. Taking LOG Backup from Principal Instance 
:CONNECT PANDIAN\SQL2012
BACKUP LOG [PrimaryDB1] TO  DISK = N'\\PANDIAN\Personal\MirroringPath\PrimaryDB1.Trn' 
WITH NOFORMAT, NOINIT,  SKIP, NOREWIND, NOUNLOADSTATS = 10
GO

6. Restore the FULL Backup in Mirror Instance
:CONNECT PANDIAN\SQL2012_1
USE [Master]
GO
RESTORE DATABASE [PrimaryDB1] FROM  DISK = N'C:\Personal\MirroringPath\PrimaryDB1.Bak' 
WITH  FILE = 1,  
MOVE N'PrimaryDB1' TO N'C:\Personal\DATA\PrimaryDB1.mdf',  
MOVE N'PrimaryDB1_log' TO N'C:\Personal\LOG\PrimaryDB1_log.ldf',  
NORECOVERY,  NOUNLOADSTATS = 5
GO

The restore should be in NORECOVERY state

7. Restore LOG Backup in Mirror Instance
:CONNECT PANDIAN\SQL2012_1
USE [Master]
GO
RESTORE LOG [PrimaryDB1] FROM  DISK = N'C:\Personal\MirroringPath\PrimaryDB1.Trn' 
WITH  FILE = 1,  NORECOVERY,  NOUNLOADSTATS = 10
GO

The LOG backup restore also should be in NORECOVERY state

8. Create ENDPOINT for the Principal Instance
:CONNECT PANDIAN\SQL2012
CREATE ENDPOINT Mirroring
AS TCP (LISTENER_PORT=5022)
FOR DATA_MIRRORING (ROLE=PARTNER,ENCRYPTION=REQUIRED ALGORITHM AES);

9. Give CONNECT Permission to the ENDPOINT for the service account in Principal Instance
:CONNECT PANDIAN\SQL2012
GRANT CONNECT ON ENDPOINT::Mirroring TO [PANDIAN\SQLServerBuddy]

Make sure the SQL Server runs under this service account

10. To make sure the ENDPOINT created
:CONNECT PANDIAN\SQL2012
SELECT serverproperty('servername') [Servername], e.name [Endpoint], e.protocol_desc [Protocol], e.type_desc [Type], e.state_desc [State], e.role_desc [Role], e.encryption_algorithm_desc [ENCRYPTION ALGORITHM],ep.Port [Endpoint Port] FROM SYS.DATABASE_MIRRORING_ENDPOINTS e join SYS.TCP_ENDPOINTS ep ON (e.endpoint_id = ep.endpoint_id) 





See, The ENDPOINT not started yet

11. Let me START the ENDPOINT
:CONNECT PANDIAN\SQL2012
ALTER ENDPOINT Mirroring STATE=STARTED;

12. To make sure the ENDPOINT started
:CONNECT PANDIAN\SQL2012
SELECT serverproperty('servername') [Servername], e.name [Endpoint], e.protocol_desc [Protocol], e.type_desc [Type], e.state_desc [State], e.role_desc [Role], e.encryption_algorithm_desc [ENCRYPTION ALGORITHM],ep.Port [Endpoint Port] FROM SYS.DATABASE_MIRRORING_ENDPOINTS e join SYS.TCP_ENDPOINTS ep ON (e.endpoint_id = ep.endpoint_id)





See, The ENDPOINT is started!

13. Create ENDPOINT for the Mirror Instance
:CONNECT PANDIAN\SQL2012_1
CREATE ENDPOINT Mirroring
AS TCP (LISTENER_PORT=5023FOR DATA_MIRRORING (ROLE=PARTNER,ENCRYPTION=REQUIRED ALGORITHM AES);

14. Give CONNECT Permission to the ENDPOINT for the service account in Mirror Instance
:CONNECT PANDIAN\SQL2012_1
GRANT CONNECT ON ENDPOINT::Mirroring TO [PANDIAN\SQLServerBuddy]

Make sure the SQL Server runs under this service account

15. Let me START the ENDPOINT
:CONNECT PANDIAN\SQL2012_1
ALTER ENDPOINT Mirroring STATE=STARTED;

16. To make sure the ENDPOINT created and started
:CONNECT PANDIAN\SQL2012_1
SELECT serverproperty('servername') [Servername], e.name [Endpoint], e.protocol_desc [Protocol], e.type_desc [Type], e.state_desc [State], e.role_desc [Role], e.encryption_algorithm_desc [ENCRYPTION ALGORITHM],ep.Port [Endpoint Port] FROM SYS.DATABASE_MIRRORING_ENDPOINTS e join SYS.TCP_ENDPOINTS ep ON (e.endpoint_id = ep.endpoint_id)





17. Create ENDPOINT for the Witness Instance
:CONNECT PANDIAN\SQL2016
CREATE ENDPOINT Mirroring
AS TCP (LISTENER_PORT=5024FOR DATA_MIRRORING (ROLE=WITNESS,ENCRYPTION=REQUIRED ALGORITHM AES);

18. Give CONNECT Permission to the ENDPOINT for the service account in Witness Instance
:CONNECT PANDIAN\SQL2016
GRANT CONNECT ON ENDPOINT::Mirroring TO [PANDIAN\SQLServerBuddy]

Make sure the SQL Server runs under this service account

19. Let me START the ENDPOINT
:CONNECT PANDIAN\SQL2016
ALTER ENDPOINT Mirroring STATE=STARTED;

20. To make sure the ENDPOINT created and started
:CONNECT PANDIAN\SQL2016
SELECT serverproperty('servername') [Servername], e.name [Endpoint], e.protocol_desc [Protocol], e.type_desc [Type], e.state_desc [State], e.role_desc [Role], e.encryption_algorithm_desc [ENCRYPTION ALGORITHM],ep.Port [Endpoint Port] FROM SYS.DATABASE_MIRRORING_ENDPOINTS e join SYS.TCP_ENDPOINTS ep ON (e.endpoint_id = ep.endpoint_id)





21. To make Principal Instance as a Partner of Mirror Instance
:CONNECT PANDIAN\SQL2012_1
ALTER DATABASE [PrimaryDB1] SET PARTNER ='TCP://PANDIAN:5022'

Connecting to PANDIAN\SQL2012_1...
Disconnecting connection from PANDIAN\SQL2012_1...

22. To make Mirror Instance as a Partner of Principal Instance
:CONNECT PANDIAN\SQL2012
ALTER DATABASE [PrimaryDB1] SET PARTNER ='TCP://PANDIAN:5023'

Connecting to PANDIAN\SQL2012...
Disconnecting connection from PANDIAN\SQL2012...

23. To make Witness Instance as a Witness of Principal Instance
:CONNECT PANDIAN\SQL2012
ALTER DATABASE [PrimaryDB1] SET WITNESS ='TCP://PANDIAN:5024'

Connecting to PANDIAN\SQL2012...
Disconnecting connection from PANDIAN\SQL2012...

It's done.....

24. Let me check the Principal Database and Is in sync









25. Let me check the Mirror Database and Is in sync








So, Both the Databases - Principal and Mirror are in Sync now! Everything will be propagated from the Principal Database to Mirror databases. 

26. Let me check the Database Mirroring Monitor and see how it goes..
Expand Principal "Databases" --> Right Click on the PrimaryDB1 --> Click Tasks --> Click "Launch Database Mirroring Monitor"


















Everything looks Good!

See, 
PANDIAN\SQL2012 acts as a Principal
PANDIAN\SQL2012_1 acts as a Mirror

and, Witness is watching from "TCP://PANDIAN:5024" which means "PANDIAN\SWL2016"

What will happen If anything goes wrong with Principal "PANDIAN\SQL2012" ? Or We manually failover it ?

In that case, Mirror instance should become as Principal and Current Principal will be act as a Mirror. Right ?

Let me Failover the Principal manually now and see

27. Manually failover the Principal
:CONNECT PANDIAN\SQL2012
USE [Master]
GO
ALTER DATABASE PrimaryDB1 SET PARTNER FAILOVER

Connecting to PANDIAN\SQL2012...
Disconnecting connection from PANDIAN\SQL2012...

28. Let me check the Database Mirroring Monitor now

















As per MSDN - On the former principal, clients are disconnected from the database and in-flight transactions are rolled back

See,

PANDIAN\SQL2012_1 acts as a Principal
PANDIAN\SQL2012 becomes as Mirror

Continues...