Frank
Guest
Posts: n/a
I tried this solution:
There was a NIC team configured for "load balancing and failover". I read somewhere that Microsoft does not support load balancing, so now the NICs are teamed for failover only, with no load balancing. Could this have been the cause of the problem? Please advise.

"Frank" wrote:
> On 30.04.2009, during the night when nobody works and only the backup is
> running, the 3 cluster nodes became completely unavailable.
> The network people say that the switches were OK, the network was OK, and DNS and WINS
> were also OK, so the network was up and running fine for all our 30 servers.
> Only the cluster suffered from this situation.
>
> The Windows system event log says:
> The node lost communication with cluster node 'SRV0002A' on network 'CLS00002_Public'.
> The node lost communication with cluster node 'SRV0002B' on network 'CLS00002_Public'.
> The node lost communication with cluster node 'SRV0002C' on network 'CLS00002_Public'.
> and then:
> The interface for cluster node 'SRV0002C' on network 'CLS00002_Public' is unreachable by at least one other cluster node attached to the network. The server cluster was not able to determine the location of the failure. Look for additional entries in the system event log indicating which other nodes have lost communication with node SRV0002C. If the condition persists, check the cable connecting the node to the network. Next, check for hardware or software errors in the node's network adapter. Finally, check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
>
> then:
> Cluster network 'CLS00002_Public' is partitioned. Some attached server cluster nodes cannot communicate with each other over the network. The server cluster was not able to determine the location of the failure. Look for additional entries in the system event log indicating which nodes have lost communication.
> If the condition persists, check for failures in any network components to which the nodes are connected such as hubs, switches, or bridges. Also check for hardware or software errors in the adapters that attach the nodes to the network.
> then:
> The interface for cluster node 'SRV0002C' on network 'CLS00002_Public' is operational (up). The node can communicate with all other available cluster nodes on the network.
> The interface for cluster node 'SRV0002A' on network 'CLS00002_Public' failed. If the condition persists, check the cable connecting the node to the network. Next, check for hardware or software errors in node's network adapter. Finally, check for failures in any network components to which the node is connected such as hubs, switches, or bridges.
> The interface for cluster node 'SRV0002B' on network 'CLS00002_Public' is operational (up). The node can communicate with all other available cluster nodes on the network.
> Cluster network 'CLS00002_Public' is operational (up). All available server cluster nodes attached to the network can communicate using it.
>
> and finally the first RED ERROR:
> The TCP/IP interface for Cluster IP Address 'SQL IP Address1(VSRVSQL)' has failed.
>
> after 1 minute:
> This computer was not able to set up a secure session with a domain controller in domain LUGANO due to the following:
> There are currently no logon servers available to service the logon request.
> This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.
> and:
>
> The master browser has received a server announcement from the computer SRV0AD01 that believes that it is the master browser for the domain on transport NetBT_Tcpip_{8E9F3304-6211-4472-. The master browser is stopping or an election is being forced.
>
> Here is a part of the cluster log:
>
> 00000848.000009c0::2009/04/30-00:34:09.222 INFO [ClMsg] Received interface unreachable event for node 1 network 1
> 00000848.00000944::2009/04/30-00:34:09.222 WARN [NM] Communication was lost with interface 4358ed8c-0534-4a6b-b396-2fb47e46baad (node: SRV0002A, network: CLS00002_Public)
> 00000848.00000fb4::2009/04/30-00:34:12.035 WARN [NM] Interface 0d770674-50a5-4dd6-9d40-ee7d643fe932 is unreachable (node: SRV0002C, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:12.035 WARN [NM] Interface 4358ed8c-0534-4a6b-b396-2fb47e46baad is unreachable (node: SRV0002A, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:12.035 WARN [NM] Interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a is unreachable (node: SRV0002B, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:12.035 WARN [NM] Network c120d86b-291f-45b7-bed6-39eda87acc33 (CLS00002_Public) is partitioned.
> 00000848.00000fb4::2009/04/30-00:34:12.035 INFO [GUM] s_GumUpdateNode: completed update seq 225906 type 2 context 15
> 00000a08.00000cb0::2009/04/30-00:34:12.035 WARN IP Address <Cluster IP Address>: WorkerThread: NetInterface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a changed to state 2.
> 00000a08.00000cb0::2009/04/30-00:34:12.035 WARN IP Address <IP Address 10.1.0.153>: WorkerThread: NetInterface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a changed to state 2.
> 00000848.00000fb4::2009/04/30-00:34:15.785 INFO [GUM] s_GumUpdateNode: dispatching seq 225907 type 2 context 15
> 00000848.00000fb4::2009/04/30-00:34:15.785 INFO [NM] Received update to set state for network c120d86b-291f-45b7-bed6-39eda87acc33.
> 00000848.00000fb4::2009/04/30-00:34:15.785 INFO [NM] Interface 0d770674-50a5-4dd6-9d40-ee7d643fe932 is up (node: SRV0002C, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:15.785 WARN [NM] Interface 4358ed8c-0534-4a6b-b396-2fb47e46baad failed (node: SRV0002A, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:15.785 INFO [NM] Interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a is up (node: SRV0002B, network: CLS00002_Public).
> 00000848.00000fb4::2009/04/30-00:34:15.785 WARN [NM] Network c120d86b-291f-45b7-bed6-39eda87acc33 (CLS00002_Public) is up.
> 00000848.00000fb4::2009/04/30-00:34:15.785 INFO [GUM] s_GumUpdateNode: completed update seq 225907 type 2 context 15
> 00000a08.00000cb0::2009/04/30-00:34:15.785 WARN IP Address <Cluster IP Address>: WorkerThread: NetInterface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a changed to state 3.
> 00000a08.00000cb0::2009/04/30-00:34:15.785 WARN IP Address <IP Address 10.1.0.153>: WorkerThread: NetInterface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a changed to state 3.
> 00000848.00000944::2009/04/30-05:51:55.747 WARN [NM] Communication was lost with interface 0d770674-50a5-4dd6-9d40-ee7d643fe932 (node: SRV0002C, network: CLS00002_Public)
> 00000848.000009bc::2009/04/30-05:51:55.747 INFO [RGP] Node 2: RGP Unicast: 0x2, 0x0, 0x0, 0x0.
> 00000848.000009bc::2009/04/30-05:51:55.747 INFO [RGP] Node 2: RGP Incoming pkt: 0x3fff, 0x44, 0x3, 0x2.
> 00000848.000009bc::2009/04/30-05:51:55.747 INFO [RGP] Node 2: RGP recv pkt : 0x440003, 0xc000c000, 0xc0000000, 0x1.
> 00000848.000009bc::2009/04/30-05:51:55.747 INFO [RGP] Node 2: RGP Unicast: 0x3, 0x0, 0x0, 0x0.
> 00000848.00000944::2009/04/30-05:51:55.747 INFO [NM] Started connectivity report timer (600ms) for network c120d86b-291f-45b7-bed6-39eda87acc33 (CLS00002_Public)
> 00000848.00000944::2009/04/30-05:51:55.747 WARN [NM] Communication was lost with interface 8fd55b06-7512-4a1f-a231-fe6f7c406c26 (node: SRV0002C, network: Private)
> 00000848.000009c8::2009/04/30-05:52:00.497 WARN [EVT] EvtBroadcaster: EvPropEvents for node 3 failed. status 1818
> 00000848.00000644::2009/04/30-05:52:01.622 WARN [ClNet] Tcpip is not bound to adapter 2ADFF0E1-B750-4142-8C0E-9FEACE63A57D.
> 00000848.00000644::2009/04/30-05:52:01.622 WARN [ClNet] Tcpip is not bound to adapter 7BFD012D-60C5-4CEF-8E9F-04809ADFADBF.
> 00000848.00000644::2009/04/30-05:52:01.622 WARN [ClNet] Tcpip is not bound to adapter 45B8DC3C-D526-4467-8191-A1F877E01FC4.
> 00000848.00000ef8::2009/04/30-05:52:01.638 WARN [NM] Interface 8fd55b06-7512-4a1f-a231-fe6f7c406c26 is unavailable (node: SRV0002C, network: Private).
> 00000848.00000ef8::2009/04/30-05:52:01.638 INFO [GUM] s_GumUpdateNode: completed update seq 225916 type 2 context 15
> 00000848.00000ef8::2009/04/30-05:52:01.653 INFO [NM] Received request to get ping address enum for interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a.
> 00000848.00000ef8::2009/04/30-05:52:01.669 INFO [NM] Received request to ping targets for interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a.
> 00000848.00000ef8::2009/04/30-05:52:01.669 INFO [NM] Pinging targets for interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a.
> 00000848.00000ef8::2009/04/30-05:52:01.669 INFO [NM] Pinging host 10.1.0.1
> 00000848.00000ef8::2009/04/30-05:52:01.669 INFO [NM] Ping of host 10.1.0.1 succeeded.
> 00000848.00000ef8::2009/04/30-05:52:01.669 INFO [NM] Finished pinging targets for interface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a.
> 00000848.00000ef8::2009/04/30-05:52:05.419 INFO [GUM] s_GumUpdateNode: dispatching seq 225917 type 0 context 18
> 00000848.00000ef8::2009/04/30-05:52:05.419 INFO [FM] FmpUpdateUseRandomizedNodeListForGroups: 3 node down has been processed already...
> 00000848.00000ef8::2009/04/30-05:52:05.419 INFO [GUM] s_GumUpdateNode: completed update seq 225917 type 0 context 18
> 00000848.00000ef8::2009/04/30-05:52:17.419 INFO [GUM] s_GumUpdateNode: dispatching seq 225918 type 2 context 15
> 00000848.00000ef8::2009/04/30-05:52:17.419 INFO [NM] Received update to set state for network c120d86b-291f-45b7-bed6-39eda87acc33.
> 00000848.00000ef8::2009/04/30-05:52:17.419 WARN [NM] Interface 0d770674-50a5-4dd6-9d40-ee7d643fe932 is unavailable (node: SRV0002C, network: CLS00002_Public).
> 00000848.00000ef8::2009/04/30-05:52:17.419 INFO [GUM] s_GumUpdateNode: completed update seq 225918 type 2 context 15
> 00000848.00000980::2009/04/30-05:55:00.190 INFO [Qfs] GetDiskFreeSpaceEx Z:\MSCS\, status 0
> 00000848.0000084c::2009/04/30-05:59:19.448 INFO [CS] Received service shutdown command
> 00000848.0000092c::2009/04/30-05:59:19.464 WARN [INIT] The cluster service is shutting down.
> 00000848.0000092c::2009/04/30-05:59:19.464 INFO [EVT] EvShutdown
> 00000848.0000092c::2009/04/30-05:59:19.464 WARN [FM] Shutdown: Failover Manager requested to shutdown groups.
> 00000848.0000092c::2009/04/30-05:59:19.464 INFO [FM] FmpCleanupGroups: Entry
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpCleanupGroupsWorker: Entry
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpCleanupGroupsPhase1: Entry, Group = 887f74f8-9777-4ace-97d9-dddfb6b15b4a
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpNotifyGroupStateChangeReason: Notifying group Cluster Group [887f74f8-9777-4ace-97d9-dddfb6b15b4a] of state change reason 4...
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpOfflineGroup, Group=887f74f8-9777-4ace-97d9-dddfb6b15b4a
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpOfflineResource: Cluster Name depends on Cluster IP Address. Shut down first.
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpRmOfflineResource: InterlockedIncrement on gdwQuoBlockingResources for resource 866483a6-4c2d-49c7-b143-2d26d08b86b7
> 00000a08.00000a24::2009/04/30-05:59:19.464 INFO Network Name <Cluster Name>: Taking resource offline...
> 00000a08.00000b40::2009/04/30-05:59:19.464 INFO Network Name <Cluster Name>: Offline of resource continuing...
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [FM] FmpRmOfflineResource: RmOffline() for 866483a6-4c2d-49c7-b143-2d26d08b86b7 returned error 997
> 00000a08.00000c34::2009/04/30-05:59:19.464 INFO Network Name: time until next DNS reg: 2009/04/30-15:07:48 (128855776686664899)
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [GUM] GumSendUpdate: queuing update type 0 context 8
> 00000a08.00000b40::2009/04/30-05:59:19.464 WARN Network Name <Cluster Name>: Failed to delete server name CLS00002, status 5.
> 00000a08.00000b40::2009/04/30-05:59:19.464 WARN Network Name <Cluster Name>: Failed to delete server name CLS00002, status 5.
> 00000a08.00000b40::2009/04/30-05:59:19.464 INFO Network Name <Cluster Name>: Deleted workstation name CLS00002 from transport 0.
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [GUM] GumSendUpdate: Dispatching seq 225919 type 0 context 8 to node 2
> 00000848.00000408::2009/04/30-05:59:19.464 INFO [GUM] GumSendUpdate: completed update seq 225919 type 0 context 8
>
> What could be the cause of such a situation?
> Where do you suggest searching for a cause?
>
> Please advise.
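
For anyone trying to narrow down where to look, a log like the one quoted above is easier to read once the "Communication was lost" warnings are pulled out with their timestamps. The following is only a minimal Python sketch, assuming each log entry has first been joined back onto a single line and that the layout is `pid.tid::timestamp LEVEL [component] message`, which is inferred from this excerpt alone. The names `Entry`, `parse_line`, `parse_log`, `lost_links` and `SAMPLE` are made up for illustration and are not part of any cluster tooling.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    pid: str                  # hex process id, e.g. "00000848"
    tid: str                  # hex thread id
    when: datetime            # timestamp exactly as written in the log
    level: str                # INFO, WARN, ...
    component: Optional[str]  # "NM", "GUM", ...; None when no [tag] is present
    message: str


# Layout assumed from the excerpt: pid.tid::YYYY/MM/DD-HH:MM:SS.mmm LEVEL [comp] text
ENTRY_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?:\[(?P<component>\w+)\]\s*)?"
    r"(?P<message>.*)$"
)
NODE_NET_RE = re.compile(r"\(node: (?P<node>\w+), network: (?P<network>\w+)\)")

# Four entries copied from the quoted log, each on one line.
SAMPLE = """\
00000848.00000944::2009/04/30-00:34:09.222 WARN [NM] Communication was lost with interface 4358ed8c-0534-4a6b-b396-2fb47e46baad (node: SRV0002A, network: CLS00002_Public)
00000848.00000fb4::2009/04/30-00:34:12.035 WARN [NM] Network c120d86b-291f-45b7-bed6-39eda87acc33 (CLS00002_Public) is partitioned.
00000a08.00000cb0::2009/04/30-00:34:12.035 WARN IP Address <Cluster IP Address>: WorkerThread: NetInterface 1acfb388-0ad4-4b78-8cb2-39c04a6b888a changed to state 2.
00000848.00000944::2009/04/30-05:51:55.747 WARN [NM] Communication was lost with interface 8fd55b06-7512-4a1f-a231-fe6f7c406c26 (node: SRV0002C, network: Private)
"""


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return Entry(m["pid"], m["tid"], when, m["level"], m["component"], m["message"])


def parse_log(text: str) -> list:
    """Parse every recognisable line of a log excerpt, skipping the rest."""
    return [e for e in map(parse_line, text.splitlines()) if e is not None]


def lost_links(entries) -> list:
    """Return (time, node, network) for each NM 'Communication was lost' warning."""
    found = []
    for e in entries:
        if e.component == "NM" and e.message.startswith("Communication was lost"):
            m = NODE_NET_RE.search(e.message)
            if m:
                found.append((e.when, m["node"], m["network"]))
    return found


if __name__ == "__main__":
    for when, node, network in lost_links(parse_log(SAMPLE)):
        print(when.isoformat(sep=" ", timespec="milliseconds"), node, network)
```

Run against the four sample entries, this lists two lost links: SRV0002A on CLS00002_Public at 00:34:09.222 and SRV0002C on Private at 05:51:55.747. The timestamps are printed as they appear in the log; cluster.log times are usually recorded in GMT, so they may need shifting before being compared with the event log or the backup schedule.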