Random Event 1135

 
 
D1Artagnan
04-17-2009
I'm looking for advice on how to troubleshoot Event 1135.

Scenario:
2-node Windows Server 2008 SP1 x64 Failover Cluster (Node and File Share Majority)
Exchange 2007 SP1 CCR
Cluster nodes and witness run on VMware 3.5, connected to an FC SAN
Additional software: McAfee GroupShield 7 SP1 for Exchange, SCOM 2007
client, SMS 2003 Advanced Client and ARC Server Backup Agent for Exchange
ver 12.1

Problem description:
Event 1135: Cluster node 'STLAKLMB01' was removed from the active failover
cluster membership....

This event is logged on both Active and Passive cluster nodes. In addition
the Passive node reports
Event 1069: Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
clustered service or application 'Cluster Group' failed
and
Event 1564: File share witness resource 'File Share Witness
(\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
'\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum'
exists and is accessible by the cluster.

This has happened twice in the last week (at 11:30 PM and 1:06 AM). Downtime
in both cases was about 2 minutes, after which the Passive node reconnected
and the cluster recovered. The impact was that 4 of the 6 Exchange 2007
storage groups (2 of 6 in the first case) failed to resume replication after
the failure, and my only option was to re-seed them in the morning.

The strange thing is that there are no events that suggest a network
failure. Furthermore, the failed (passive) node keeps reporting that both
networks, Public and Heartbeat, are up. No other servers or infrastructure
components registered any network outages at the time of the events.

Q1: How do I troubleshoot this failure - are there any additional logs or
tools I could use to capture more information?

Q2: How can I configure the Failover Cluster to delay shutting down the
cluster? All current settings are at their defaults.

Your help is much appreciated.

John Toner [MVP]
04-17-2009
1) If you go to a command line and issue a "cluster log /g" command, this
will generate a cluster.log file in the c:\windows\cluster\reports folder
that might provide additional information. Also, you can check the Failover
Cluster Operational logs for network-related messages. The operational logs
are under Diagnostics > Event Viewer > Applications and Services Logs >
Microsoft > Windows > FailoverClustering.

2) You cannot delay the shutdown of the cluster, but you can lengthen the
time it takes before a node is declared unavailable by adjusting the
heartbeat settings.

By default, a heartbeat signal is sent once every second (1000
milliseconds), and when a node misses 5 consecutive heartbeats, another node
will initiate failover. You can adjust these values in Windows 2008 clusters
by using the following commands:

cluster /prop SameSubnetDelay=<value>
cluster /prop SameSubnetThreshold=<value>

If your cluster nodes are on separate subnets, you would adjust the
following values instead:

cluster /prop CrossSubnetDelay=<value>
cluster /prop CrossSubnetThreshold=<value>

You can type cluster /prop to see your current settings.
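
As a rough illustration of how the two heartbeat properties combine, the
Python sketch below multiplies the delay by the threshold to estimate how
long a node can stay silent before it is declared down, and builds the
matching cluster /prop command lines. The function names and the example
values 2000 and 10 are illustrative assumptions, not recommendations; check
the permitted ranges for your Windows version before changing anything.

```python
def detection_window_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time without heartbeats before a node is considered down."""
    return delay_ms * threshold


def heartbeat_commands(delay_ms: int, threshold: int, cross_subnet: bool = False) -> list:
    """Build the cluster /prop command lines quoted in the post above."""
    prefix = "CrossSubnet" if cross_subnet else "SameSubnet"
    return [
        f"cluster /prop {prefix}Delay={delay_ms}",
        f"cluster /prop {prefix}Threshold={threshold}",
    ]


# Defaults described above: 1000 ms delay, 5 missed heartbeats.
print(detection_window_ms(1000, 5))

# Hypothetical, more tolerant values (assumption, for illustration only).
print(detection_window_ms(2000, 10))
for line in heartbeat_commands(2000, 10):
    print(line)
```

With the defaults this gives a window of about 5 seconds; the hypothetical
values would stretch it to about 20 seconds.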

Regards,
John

Visit my blog: http://msmvps.com/blogs/jtoner

D1Artagnan
04-22-2009
Hi John,

Thank you for your help

The cluster.log files on both nodes were not very useful. The log on the
active node has no events between 1 April and 22 April. The log on the
passive node has some events logged on 4 and 8 April. Neither log has any
events for the time of the failures.

The Failover Cluster Operational log also appears to have missed some
periods of time, although not as large: no events were logged between
1:08 AM on 17 April and 3:29 PM on 20 April. The first timestamp coincides
with the time when the cluster recovered from a failure; the second is when
the backup started.

The Windows System event log seems to be the most useful. I'm not sure
whether the cluster service crashed and that caused the disconnection from
the active node, or the node lost connectivity to the quorum and that caused
the cluster service to terminate. There also seems to be a pattern in the
time of the fault. Occurrences in the last 2 weeks are:

23 April - from 1:05:18 AM to 1:07:51 AM
17 April - from 1:06:18 AM to 1:07:55 AM
14 April - from 11:30:37 PM to 11:33:09 PM
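
To put numbers on the three occurrences listed above, a small Python sketch
like the following computes the length of each outage from its start and end
times. The function name is made up for illustration, and it assumes that no
outage crosses midnight, which holds for the three entries above.

```python
from datetime import datetime

# Start and end times of each occurrence, copied from the list above.
OCCURRENCES = [
    ("1:05:18 AM", "1:07:51 AM"),
    ("1:06:18 AM", "1:07:55 AM"),
    ("11:30:37 PM", "11:33:09 PM"),
]


def outage_seconds(start: str, end: str) -> int:
    """Length of one outage in seconds (start and end on the same day)."""
    fmt = "%I:%M:%S %p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


durations = [outage_seconds(start, end) for start, end in OCCURRENCES]
print(durations)  # [153, 97, 152]
```

So each outage lasted between roughly one and a half and two and a half
minutes.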

Regards,
Ilian

Windows System Log
-----------------------------------------------------------------------------------
Level | Date and Time | Source | Event ID | Task Category
Information | 23/04/2009 1:07:54 a.m. | Microsoft-Windows-Time-Service | 37 | None
    The time provider NtpClient is currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au (ntp.d|0.0.0.0:123->152.153.40.60:123).
Information | 23/04/2009 1:07:51 a.m. | Tcpip | 4201 | None
    The system detected that network adapter Local Area Connection* 12 was connected to the network, and has initiated normal operation.
Information | 23/04/2009 1:07:51 a.m. | Tcpip | 4201 | None
    The system detected that network adapter Local Area Connection* 12 was connected to the network, and has initiated normal operation.
Information | 23/04/2009 1:07:52 a.m. | Microsoft-Windows-Time-Service | 37 | None
    The time provider NtpClient is currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au (ntp.d|0.0.0.0:123->152.153.40.60:123).
Information | 23/04/2009 1:07:51 a.m. | Service Control Manager | 7036 | None
    The Cluster Service service entered the running state.
Warning | 23/04/2009 1:07:01 a.m. | Microsoft-Windows-Time-Service | 131 | None
    NtpClient was unable to set a domain peer to use as a time source because of DNS resolution error on 'nzsakldc01.nzsakl.bhp.com.au'. NtpClient will try again in 15 minutes and double the reattempt interval thereafter. The error was: No such host is known. (0x80072AF9).
Critical | 23/04/2009 1:06:55 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource '' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:06:56 a.m. | Service Control Manager | 7031 | None
    The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.
Error | 23/04/2009 1:06:56 a.m. | Service Control Manager | 7024 | None
    The Cluster Service service terminated with service-specific error 5925 (0x1725).
Information | 23/04/2009 1:06:55 a.m. | Service Control Manager | 7036 | None
    The Cluster Service service entered the stopped state.
Critical | 23/04/2009 1:06:49 a.m. | Microsoft-Windows-FailoverClustering | 1177 | None
    "The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges."
Error | 23/04/2009 1:06:48 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:06:47 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Critical | 23/04/2009 1:06:40 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:06:40 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Error | 23/04/2009 1:06:32 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:06:32 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Critical | 23/04/2009 1:06:24 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:06:24 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Error | 23/04/2009 1:06:15 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:06:15 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:06:08 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:06:07 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:05:59 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:05:59 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:05:51 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:05:51 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:05:44 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:05:44 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:05:37 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:05:37 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Error | 23/04/2009 1:05:31 a.m. | Microsoft-Windows-FailoverClustering | 1069 | Resource Control Manager
    Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in clustered service or application 'Cluster Group' failed.
Critical | 23/04/2009 1:05:31 a.m. | Microsoft-Windows-FailoverClustering | 1564 | File Share Witness Resource
    File share witness resource 'File Share Witness (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share '\\STLAKLXCH03\Quorum'. Please ensure that file share '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
Information | 23/04/2009 1:05:31 a.m. | Service Control Manager | 7036 | None
    The Windows Modules Installer service entered the running state.
Information | 23/04/2009 1:05:21 a.m. | Tcpip | 4201 | None
    The system detected that network adapter Local Area Connection* 12 was connected to the network, and has initiated normal operation.
Information | 23/04/2009 1:05:21 a.m. | Tcpip | 4201 | None
    The system detected that network adapter Local Area Connection* 12 was connected to the network, and has initiated normal operation.
Information | 23/04/2009 1:05:22 a.m. | Microsoft-Windows-Time-Service | 37 | None
    The time provider NtpClient is currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au (ntp.d|0.0.0.0:123->152.153.40.60:123).
Critical | 23/04/2009 1:05:18 a.m. | Microsoft-Windows-FailoverClustering | 1135 | None
    Cluster node 'STLAKLMB01' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
-----------------------------------------------------------------------------------

John Toner [MVP]
04-23-2009
The cluster log is supposed to contain detailed logs for everything that is
happening in the cluster. Unfortunately, I have also seen cases where the
2008 cluster logs are missing data from the time an event occurred. I
suspect that network issues might be affecting cluster logging in 2008.

Also, FYI, the cluster.log timestamps are in GMT, so you may need to
compensate for this when looking at the times in this log file.
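
As a minimal sketch of that compensation, the Python snippet below converts
a GMT timestamp to local time so it can be lined up with the System event
log. Two things here are assumptions, not facts from this thread: the
timestamp layout "YYYY/MM/DD-HH:MM:SS.mmm", and the UTC+12 offset used in
the example. Substitute the format and offset that apply to your own
servers.

```python
from datetime import datetime, timedelta, timezone

# Assumed layout of a cluster.log timestamp (not confirmed in this thread).
CLUSTER_LOG_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def cluster_log_to_local(stamp: str, utc_offset_hours: float) -> datetime:
    """Interpret a cluster.log timestamp as GMT and shift it to local time."""
    utc_time = datetime.strptime(stamp, CLUSTER_LOG_FORMAT).replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local_zone)


# Hypothetical example: a GMT entry viewed from a server running at UTC+12.
local = cluster_log_to_local("2009/04/22-13:05:18.000", 12)
print(local.isoformat())  # 2009-04-23T01:05:18+12:00
```

Under those assumptions, an event that the System log shows at 1:05:18 a.m.
on 23/04 would appear in cluster.log at 13:05:18 on the previous day.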

Regards,
John

Visit my blog: http://msmvps.com/blogs/jtoner

"D1Artagnan" <> wrote in message
news:7F3175B2-2385-4A4F-BC78-...
> Hi John,
>
> Thank you for your help
>
> Cluster.log on both nodes were not very useful. The log on the active node
> has not logged events between 1.04 and 22.04. The log on the passive node
> has
> some events logged on 4th and 8th April. Both logs have no events logged
> for
> the time of the failures.
>
> Failover Cluster Operational Log also appears to have missed some periods
> of
> time although not that large - no events were logged between 1:08 AM on
> 17.04
> and 3:29 PM on 20.04. The first time stamp coincides with the time when
> the
> cluster recovered from a failure, the second timestamp is when the backup
> started
>
> Windows System Event log seems to be the most useful. I'm not sure if the
> cluster service has crashed and that caused the disconnection to the
> active
> node, or the node has lost connectivity to the quorum and that caused the
> cluster service to terminate. It also looks like there is some pattern in
> the
> time of the fault: Occurrences in the last 2 weeks are
>
> 23.04 - From 1:05:18 AM to 1:07:51 AM
> 17.04 - From 1:06:18 AM to 1:07:55 AM
> 14.04 - From 11:30:37 PM to 11:33:09 PM
>
> Regards,
> Ilian
>
> Windows System Log
> -----------------------------------------------------------------------------------
> Level Date and Time Source Event ID Task Category
> Information 23/04/2009 1:07:54
> a.m. Microsoft-Windows-Time-Service 37 None The time provider NtpClient is
> currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au
> (ntp.d|0.0.0.0:123->152.153.40.60:123).
> Information 23/04/2009 1:07:51 a.m. Tcpip 4201 None The system detected
> that
> network adapter Local Area Connection* 12 was connected to the network,
> and
> has initiated normal operation.
> Information 23/04/2009 1:07:51 a.m. Tcpip 4201 None The system detected
> that
> network adapter Local Area Connection* 12 was connected to the network,
> and
> has initiated normal operation.
> Information 23/04/2009 1:07:52
> a.m. Microsoft-Windows-Time-Service 37 None The time provider NtpClient is
> currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au
> (ntp.d|0.0.0.0:123->152.153.40.60:123).
> Information 23/04/2009 1:07:51 a.m. Service Control Manager 7036 None The
> Cluster Service service entered the running state.
> Warning 23/04/2009 1:07:01
> a.m. Microsoft-Windows-Time-Service 131 None NtpClient was unable to set a
> domain peer to use as a time source because of DNS resolution error on
> 'nzsakldc01.nzsakl.bhp.com.au'. NtpClient will try again in 15 minutes and
> double the reattempt interval thereafter. The error was: No such host is
> known. (0x80072AF9).
> Critical 23/04/2009 1:06:55
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource '' failed to arbitrate for the file
> share '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum' exists and is accessible by the cluster.
> Error 23/04/2009 1:06:56 a.m. Service Control Manager 7031 None The
> Cluster
> Service service terminated unexpectedly. It has done this 1 time(s). The
> following corrective action will be taken in 60000 milliseconds: Restart
> the
> service.
> Error 23/04/2009 1:06:56 a.m. Service Control Manager 7024 None The
> Cluster
> Service service terminated with service-specific error 5925 (0x1725).
> Information 23/04/2009 1:06:55 a.m. Service Control Manager 7036 None The
> Cluster Service service entered the stopped state.
> Critical 23/04/2009 1:06:49
> a.m. Microsoft-Windows-FailoverClustering 1177 None "The Cluster service
> is
> shutting down because quorum was lost. This could be due to the loss of
> network connectivity between some or all nodes in the cluster, or a
> failover
> of the witness disk.
> Run the Validate a Configuration wizard to check your network
> configuration.
> If the condition persists, check for hardware or software errors related
> to
> the network adapter. Also check for failures in any other network
> components
> to which the node is connected such as hubs, switches, or bridges."
> Error 23/04/2009 1:06:48
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:06:47
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Critical 23/04/2009 1:06:40
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:06:40
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Error 23/04/2009 1:06:32
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:06:32
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Critical 23/04/2009 1:06:24
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:06:24
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Error 23/04/2009 1:06:15
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:06:15
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:06:08
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:06:07
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:05:59
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:05:59
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:05:51
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:05:51
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:05:44
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:05:44
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:05:37
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:05:37
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Error 23/04/2009 1:05:31
> a.m. Microsoft-Windows-FailoverClustering 1069 Resource Control
> Manager Cluster resource 'File Share Witness (\\STLAKLXCH03\Quorum)' in
> clustered service or application 'Cluster Group' failed.
> Critical 23/04/2009 1:05:31
> a.m. Microsoft-Windows-FailoverClustering 1564 File Share Witness
> Resource File share witness resource 'File Share Witness
> (\\STLAKLXCH03\Quorum)' failed to arbitrate for the file share
> '\\STLAKLXCH03\Quorum'. Please ensure that file share
> '\\STLAKLXCH03\Quorum'
> exists and is accessible by the cluster.
> Information 23/04/2009 1:05:31 a.m. Service Control Manager 7036 None The
> Windows Modules Installer service entered the running state.
> Information 23/04/2009 1:05:21 a.m. Tcpip 4201 None The system detected
> that
> network adapter Local Area Connection* 12 was connected to the network,
> and
> has initiated normal operation.
> Information 23/04/2009 1:05:21 a.m. Tcpip 4201 None The system detected
> that
> network adapter Local Area Connection* 12 was connected to the network,
> and
> has initiated normal operation.
> Information 23/04/2009 1:05:22
> a.m. Microsoft-Windows-Time-Service 37 None The time provider NtpClient is
> currently receiving valid time data from nzsakldc01.nzsakl.bhp.com.au
> (ntp.d|0.0.0.0:123->152.153.40.60:123).
> Critical 23/04/2009 1:05:18
> a.m. Microsoft-Windows-FailoverClustering 1135 None Cluster node
> 'STLAKLMB01'
> was removed from the active failover cluster membership. The Cluster
> service
> on this node may have stopped. This could also be due to the node having
> lost
> communication with other active nodes in the failover cluster. Run the
> Validate a Configuration wizard to check your network configuration. If
> the
> condition persists, check for hardware or software errors related to the
> network adapters on this node. Also check for failures in any other
> network
> components to which the node is connected such as hubs, switches, or
> bridges.
> -----------------------------------------------------------------------------------
>
> "John Toner [MVP]" wrote:
>
>> 1) If you run "cluster log /g" from a command prompt, it will generate
>> a cluster.log file in the c:\windows\cluster\reports folder that might
>> provide additional information. You can also check the Failover Cluster
>> operational logs for network-related messages. The operational logs are
>> under Diagnostics > Applications and Services Logs > Microsoft >
>> Windows > FailoverClustering
>>
>> 2) You cannot delay the shutdown of the cluster, but you can lengthen
>> the time it takes before a node is determined to be unavailable by
>> adjusting the heartbeat settings.
>>
>> By default, a heartbeat signal is sent once every second (1000
>> milliseconds), and when a node misses a series of 5 heartbeats, another
>> node will initiate failover. You can adjust these values in Windows 2008
>> clusters by using the following commands:
>>
>> cluster /prop SameSubnetDelay=<value>
>> cluster /prop SameSubnetThreshold=<value>
>>
>> If your cluster nodes are on separate subnets, you would adjust the
>> following values instead:
>>
>> cluster /prop CrossSubnetDelay=<value>
>> cluster /prop CrossSubnetThreshold=<value>
>>
>> You can type cluster /prop to see your current settings.
>>
>> Regards,
>> John
>>
>> Visit my blog: http://msmvps.com/blogs/jtoner
>>
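The heartbeat arithmetic in John's reply (one heartbeat every 1000 ms, failover after 5 missed) can be sketched as below. This is only a rough illustration, assuming the detection window is simply delay multiplied by threshold; the function name and the non-default values are hypothetical and are not recommended settings.

```python
def detection_window_ms(delay_ms: int = 1000, threshold: int = 5) -> int:
    """Approximate time a node can stay silent before it is declared down.

    delay_ms  -- interval between heartbeats (SameSubnetDelay / CrossSubnetDelay)
    threshold -- missed heartbeats tolerated (SameSubnetThreshold / CrossSubnetThreshold)

    Assumption: the window is modelled as a plain product of the two values.
    """
    if delay_ms <= 0 or threshold <= 0:
        raise ValueError("delay_ms and threshold must be positive")
    return delay_ms * threshold


if __name__ == "__main__":
    # Defaults quoted in the thread: 1000 ms x 5 missed heartbeats.
    print(detection_window_ms())
    # Hypothetical relaxed values, for illustration only.
    print(detection_window_ms(delay_ms=2000, threshold=10))
```

With the defaults from the thread this gives a window of about 5 seconds, which would explain why a brief switch interruption is enough to trigger Event 1135.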



 
 
WayCoolKennel
Guest
Posts: n/a

 
      06-05-2009
John,

This was VERY useful. I think you are correct, as we had a switch burp
and this is EXACTLY what we saw, except that the cluster.log was definitive.
It showed that the node had lost all communication with the other nodes in
the cluster.

Thanks for the good info here!

Cheers,

--Steve
 