Matthias
Guest
Posts: n/a
|
Hello all,

Yesterday one of our cluster systems performed an unexpected cluster switch.

System information:

HW: ProLiant DL585 G1 / 2x AMD Opteron 2.2 GHz / 16 GB RAM
OS: Microsoft Windows Server 2003 Enterprise x64 Edition
OS Version: 5.2.3790 Service Pack 2 Build 3790

HP ProLiant Support Pack 7.90

Attached to a SAN via FC

Main software: SAP CRM 5.0 SP15 on MS SQL Server 2005
Support software: DataProtector / McAfee (Enterprise 8.0.0 Patch 15) / SNARE 3.0.0

MSCS configuration:

User LAN (teaming)
Server LAN (no team)
Private LAN (crossover)

Cluster Group / MSDTC Group / SAP Group / SQL Group

The cluster log:

0000098c.00000a64::2008/05/15-15:16:43.912 INFO [DM] DmpGetSnapShotCb: DmpGetDatabase returned 0x00000000
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsGetTempFileName Q:\MSCS\, chkpt, 8011 => Q:\MSCS\chk1F4B.tmp, status 0
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] DmpGetSnapshotCb: Checkpoint file name=Q:\MSCS\chk1F4B.tmp Seq#=8011
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsMoveFileEx Q:\MSCS\chk619F.tmp=>Q:\MSCS\chk1F4B.tmp
0000098c.00000a64::2008/05/15-15:16:43.912 WARN [LM] DmpGetSnapShotCb: Failed to move the temp file to checkpoint file, TempFileName=Q:\MSCS\chk619F.tmp, ChkPtFileName=Q:\MSCS\chk1F4B.tmp, Error=0x00000005
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsDeleteFile Q:\MSCS\chk619F.tmp, status 0
0000098c.00000a64::2008/05/15-15:16:43.912 WARN [LM] LogCheckPoint: Callback failed to return a checkpoint
0000098c.00000a64::2008/05/15-15:16:43.912 WARN [LM] LogpReset:: Callback failed to return a checkpoint, error=5
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] LogClose : Entry LogFile=0x02ad7df0
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] LogFlush : pLog=0x02ad7df0 writing the 1024 bytes for active page at offset 0x00000400
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] WriteFile 99c (....) 1024, status 0 (0=>0)
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsFlushBuffers 99c, status 0
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsCloseHandle 99c, status 0
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] LogClose : Exit returning success
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsDeleteFile Q:\MSCS\tqu619E.tmp, status 0
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] LogpReset exit, returning 0x00000005
0000098c.00000a64::2008/05/15-15:16:43.912 INFO [LM] LogReset exit, returning 0x00000005
0000098c.00000a64::2008/05/15-15:16:43.912 ERR [DM]DmpCheckpointTimerCb - Failed to reset log, error=5
0000098c.00000a64::2008/05/15-15:16:44.005 ERR Cluster service suffered an unexpected fatal error at line 2324 of source module d:\nt\base\cluster\service\dm\dmlog.c. The error code was 5.
00000f58.00000f5c::2008/05/15-15:16:45.004 WARN [RM] Going away, Status = 1, Shutdown = 0.
00000f58.00000f5c::2008/05/15-15:16:45.004 ERR [RM] Active Resource = 00000000
00000f58.00000f5c::2008/05/15-15:16:45.004 ERR [RM] Resource State is 1, ""
00000f58.00000f5c::2008/05/15-15:16:45.004 INFO [RM] Posting shutdown notification.
00000f38.00000f3c::2008/05/15-15:16:45.004 WARN [RM] Going away, Status = 1, Shutdown = 0.
00000f38.00000f3c::2008/05/15-15:16:45.004 ERR [RM] Active Resource = 00000000
00000f38.00000f3c::2008/05/15-15:16:45.004 ERR [RM] Resource State is 1, ""
00000f38.00000f3c::2008/05/15-15:16:45.004 INFO [RM] Posting shutdown notification.
00000f18.00000f1c::2008/05/15-15:16:45.004 WARN [RM] Going away, Status = 1, Shutdown = 0.
00000f18.00000f1c::2008/05/15-15:16:45.004 ERR [RM] Active Resource = 00000000
00000f18.00000f1c::2008/05/15-15:16:45.004 ERR [RM] Resource State is 1, ""
00000f18.00000f1c::2008/05/15-15:16:45.004 INFO [RM] Posting shutdown notification.
00000b70.00000b74::2008/05/15-15:16:45.004 WARN [RM] Going away, Status = 1, Shutdown = 0.
00000b70.00000b74::2008/05/15-15:16:45.004 ERR [RM] Active Resource = 00000000
00000b70.00000b74::2008/05/15-15:16:45.004 ERR [RM] Resource State is 1, ""
00000b70.00000b74::2008/05/15-15:16:45.004 INFO [RM] Posting shutdown notification.
00000b70.00000b74::2008/05/15-15:16:45.004 INFO SAP Resource <SAP CPR 00 Instance>: ResourceControl request.
00000f18.00000f34::2008/05/15-15:16:45.019 INFO [RM] NotifyChanges shutting down.
00000f38.00000f54::2008/05/15-15:16:45.019 INFO [RM] NotifyChanges shutting down.
00000f58.00000f74::2008/05/15-15:16:45.019 INFO [RM] NotifyChanges shutting down.
00000b70.00000f08::2008/05/15-15:16:45.035 INFO [RM] NotifyChanges shutting down.
00000b70.00000f10::2008/05/15-15:16:45.050 INFO Physical Disk <Disk H:>: [DiskArb] CompletionRoutine, status 0.
00000b70.00000f10::2008/05/15-15:16:45.050 INFO Physical Disk <Disk H:>:

There are also errors in the event log:

Event Type: Error
Event Source: ClusSvc
Event Category: Log Mgr
Event ID: 1016
Date: 15.05.2008
Time: 17:16:43
User: N/A
Computer: NODE1
Description:
Cluster service failed to obtain a checkpoint from the server cluster database for log file Q:\MSCS\tqu619E.tmp.

Next:

Event Type: Error
Event Source: ClusSvc
Event Category: Database Mgr
Event ID: 1000
Date: 15.05.2008
Time: 17:16:43
User: N/A
Computer: NODE1
Description:
Cluster service suffered an unexpected fatal error at line 2324 of source module d:\nt\base\cluster\service\dm\dmlog.c. The error code was 5.

A lot of:

Event Type: Warning
Event Source: Ftdisk
Event Category: Disk
Event ID: 57
Date: 15.05.2008
Time: 17:16:45
User: N/A
Computer: NODE1
Description:
The system failed to flush data to the transaction log. Corruption may occur.

And:

Event Type: Error
Event Source: Service Control Manager
Event Category: None
Event ID: 7031
Date: 15.05.2008
Time: 17:16:45
User: N/A
Computer: NODE1
Description:
The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

I found the KB article http://support.microsoft.com/kb/321531/en-us, but I cannot believe that our virus scanner is the cause, because we exclude all recommended drives and files (e.g. quorum drive, database drives, database log drives, SQL executables, pagefile, C:\Windows\Cluster, ..\NTDS, ..ntfsr, ..SYSVOL, *.chk, *.ebd, *.ldf, *.log, *.mdf, *.ndf, *.stm) from read and write scans.

Does anyone have an idea?

br, Matthias
____________________________________________
Matthias Schweifer - Austria
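For readers who want to pull the warnings and errors out of a long cluster log like the one in this post, here is a minimal Python sketch. The line layout (process ID, thread ID, timestamp, level, message) is inferred from the excerpt above rather than from an official format description, and the names `LINE_RE`, `Entry`, `parse_line` and `problems` are made up for this example.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout inferred from the excerpt in this thread (an assumption):
# <pid>.<tid>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+(?P<message>.*)$"
)


class Entry(NamedTuple):
    pid: str
    tid: str
    stamp: str
    level: str
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Split one log line into its fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return Entry(**match.groupdict()) if match else None


def problems(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield only WARN and ERR entries, in file order."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry.level in ("WARN", "ERR"):
            yield entry


# Three lines copied from the log in this post.
sample = [
    r"0000098c.00000a64::2008/05/15-15:16:43.912 INFO [Qfs] QfsMoveFileEx Q:\MSCS\chk619F.tmp=>Q:\MSCS\chk1F4B.tmp",
    r"0000098c.00000a64::2008/05/15-15:16:43.912 WARN [LM] LogpReset:: Callback failed to return a checkpoint, error=5",
    r"0000098c.00000a64::2008/05/15-15:16:43.912 ERR [DM]DmpCheckpointTimerCb - Failed to reset log, error=5",
]

for entry in problems(sample):
    print(entry.stamp, entry.level, entry.message)
```

To scan a whole file, pass an open file object to `problems()` instead of the `sample` list.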
|
Jeff Hughes [MSFT]
Guest
Posts: n/a
|
Error 5 is "access denied", and it occurred when we were checkpointing the cluster registry to the quorum drive. Check and make sure the cluster service account has both the 'Back up files and directories' and 'Restore files and directories' user rights. Also, make sure your antivirus is NOT scanning the quorum. If it was scanning a quorum file at the time of a checkpoint, that may explain the error 5.

--
Jeff Hughes, MCSE
Support Escalation Engineer
Microsoft Enterprise Platforms Support (Server Core/Cluster)

"Matthias" <> wrote in message news:8D5AFD2A-3EBD-4297-B8A3-...
> [...]
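One detail that may be worth checking when reviewing antivirus exclusions: the checkpoint files in the failing move are named `chk619F.tmp` and `chk1F4B.tmp`, so an extension pattern such as `*.chk` would not cover them by name; only a drive or folder exclusion would. The sketch below illustrates this with Python's `fnmatch` as a stand-in for the scanner's matching. That stand-in is an assumption, since real products may match differently, and `quolog.log` is an illustrative file name that does not appear in the thread.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath
from typing import Iterable, Optional

# File-name patterns as listed in the original post (copied verbatim).
PATTERNS = ["*.chk", "*.ebd", "*.ldf", "*.log", "*.mdf", "*.ndf", "*.stm"]


def matching_pattern(path: str, patterns: Iterable[str] = PATTERNS) -> Optional[str]:
    """Return the first pattern that matches the file name, or None."""
    name = PureWindowsPath(path).name.lower()
    for pattern in patterns:
        if fnmatchcase(name, pattern.lower()):
            return pattern
    return None


def under_folder(path: str, folder: str) -> bool:
    """True if path lies inside folder, compared with Windows path rules."""
    p, f = PureWindowsPath(path), PureWindowsPath(folder)
    return f == p or f in p.parents


# The first two names come from the cluster log; the third is illustrative.
for candidate in (r"Q:\MSCS\chk619F.tmp", r"Q:\MSCS\chk1F4B.tmp", r"Q:\MSCS\quolog.log"):
    print(candidate, matching_pattern(candidate), under_folder(candidate, "Q:\\"))
```

In this sketch the two checkpoint files are covered only by the `Q:\` folder rule, which is why confirming the drive-level exclusion matters more than the extension list.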
|
Jeff Hughes [MSFT]
Guest
Posts: n/a
|
Yes, if the quorum files were being backed up at the time, that is quite possibly why you got an error 5. You do not need to back up the quorum, and it should be excluded from your scheduled backups. There's nothing there you'd ever need to recover, since all the quorum is used for is maintaining a copy of the cluster database and any checkpointed registry keys, and you can always recreate those files if needed.

--
Jeff Hughes, MCSE
Support Escalation Engineer
Microsoft Enterprise Platforms Support (Server Core/Cluster)

"Matthias" <> wrote in message news:1D75AFAD-DA76-4A13-B925-...
> I am not the backup administrator in our company, but as further information
> I note that there was a file-system full backup on both nodes (with HP
> DataProtector); the physical quorum disk was also backed up.
> Begin: 17:15
>
> Is that a possible reason for the error 5?
> Should we exclude the quorum disk from the backup set?
> (Is a system state backup sufficient?)
>
> br, matthias
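To line up the backup start with the failure, note that the cluster log entries in this thread are stamped 15:16:43 while the matching event log entries show 17:16:43, which is consistent with the cluster log being written in UTC and the event log in local time. The following Python sketch does the conversion. It assumes a fixed UTC+2 offset (Central European Summer Time for Austria in mid-May) and that the 17:15 backup start refers to the same day in local time; neither is stated explicitly in the thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time on 2008-05-15 was CEST (UTC+2).
CEST = timezone(timedelta(hours=2), "CEST")


def cluster_log_time(stamp: str) -> datetime:
    """Parse a cluster log timestamp, treating it as UTC (an assumption)."""
    parsed = datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


# Timestamp of the failed checkpoint move, copied from the cluster log.
failure = cluster_log_time("2008/05/15-15:16:43.912")

# "Begin: 17:15" from the follow-up post, assumed local time on the same day.
backup_start = datetime(2008, 5, 15, 17, 15, tzinfo=CEST)

print("failure in local time:", failure.astimezone(CEST).strftime("%H:%M:%S"))
print("time after backup start:", failure - backup_start)
```

Under these assumptions the checkpoint failed less than two minutes after the backup job started, which fits the explanation given in the reply.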
|
steffen busch
Guest
Posts: n/a
|
Hello,

I got nearly the same messages as described above, but my error code is 2.

Event Type: Error
Event Source: ClusSvc
Event Category: Database Mgr
Event ID: 1000
Date: 06.06.2008
Time: 14:34:44
User: N/A
Computer: SVREHDWHCLN1
Description:
Cluster service suffered an unexpected fatal error at line 2236 of source module d:\nt\base\cluster\service\dm\dmlog.c. The error code was 2.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Then I got several messages:

The system failed to flush data to the transaction log. Corruption may occur.

After that, only this message appears:

Cluster service is requesting a bus reset for device \Device\ClusDisk0.

The Cluster service did not start any more:

Server specific error code 5086

The cluster failed over properly and is running on the other node, but the first node died.

Any ideas? I do not want to evict the node or set up the machine again.

Config:

FSC Blade BX630
Win2k3 64 bit
SQL 2005 SP2
IBM SVC SAN, FC connected

Thanks for your help
|
John Toner [MVP]
Guest
Posts: n/a
|
Not enough info here to figure out the problem, but it looks like you might have lost connectivity to your quorum disk.

Regards,
John

Visit my blog: http://msmvps.com/blogs/jtoner

<steffen busch> wrote in message news:...
> [...]
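The numeric codes quoted in this thread are standard Win32 system error codes, and looking them up supports the replies: 5 is access denied, 2 is file not found, and 5086 says the quorum disk could not be located. The Python sketch below hard-codes just those three entries so that it runs on any platform. The regular expression is a rough guess at the phrasings seen in these logs, and `WIN32_ERRORS`, `CODE_RE` and `explain` are names invented for this example.

```python
import re

# Only the three codes that appear in this thread.
WIN32_ERRORS = {
    2: ("ERROR_FILE_NOT_FOUND", "The system cannot find the file specified."),
    5: ("ERROR_ACCESS_DENIED", "Access is denied."),
    5086: ("ERROR_QUORUM_DISK_NOT_FOUND",
           "The quorum disk could not be located by the cluster service."),
}

# Matches phrasings such as "error=5", "Error=0x00000005",
# "The error code was 2." and "returning 0x00000005".
CODE_RE = re.compile(
    r"(?:error(?: code)?(?: was)?\s*=?\s*|returning\s+)(0x[0-9a-fA-F]+|\d+)",
    re.IGNORECASE,
)


def explain(message: str) -> str:
    """Find the first error code in a log message and describe it."""
    match = CODE_RE.search(message)
    if not match:
        return "no error code found"
    token = match.group(1)
    code = int(token, 16) if token.lower().startswith("0x") else int(token)
    name, text = WIN32_ERRORS.get(code, ("unknown", "not in this table"))
    return f"{code}: {name} - {text}"


for line in (
    "Failed to move the temp file to checkpoint file, Error=0x00000005",
    "The error code was 2.",
    "Server specific error code 5086",
):
    print(explain(line))
```

On a Windows machine, `net helpmsg 5086` prints the same description without any script.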
|
praveen
Guest
Posts: n/a
|
Hi Jeff,

It would be very helpful if you could provide a solution for an issue I am facing with the same error 5. I see this error in a Majority Node Set cluster that runs Exchange 2007.

Cluster service could not write to a file (C:\DOCUME~1\XXX~1\LOCALS~1\Temp\CLS1348.tmp).

From the cluster log:

00000de8.00002fa0::2011/03/17-02:45:19.673 WARN [CP] CppCheckpoint failed to get registry database SYSTEM\CurrentControlSet\Services\MSExchangeIS\ahe xclex1 to file C:\DOCUME~1\XXXAHC~1\LOCALS~1\Temp\CLS2D86.tmp error 5
00000de8.00002fa0::2011/03/17-02:45:19.673 WARN [CP] CppRegNotifyThread CppNotifyCheckpoint due to timer failed, reset the timer.

So basically error 5 means "access denied". We have a Majority Node Set, and I have excluded C:\DOCUME~1\XXXAHC~1\LOCALS~1\Temp and C:\Windows\Cluster from antivirus scanning, but the error still persists.

Please help me understand the possible cause of error 5 in this case.

> On Friday, May 16, 2008 7:07 AM Matthia wrote:
> [...]
>> On Friday, May 16, 2008 8:27 AM Jeff Hughes [MSFT] wrote:
>> [...]
>>> On Friday, May 16, 2008 8:37 AM Matthia wrote:
>>> [...]
>>>> On Tuesday, May 20, 2008 10:35 AM Jeff Hughes [MSFT] wrote:
>>>> [...]
>>>>> On Monday, June 09, 2008 12:51 PM steffen busch wrote:
>>>>> [...]