The error that caused your cluster service to shut down most likely appears
just before the set of messages you've posted.
After the cluster service terminates, it restarts but is unable to rejoin
the cluster, which suggests a possible network issue. It then attempts to
form a cluster on its own, and that fails, I assume because the other node
of the cluster is actively holding the reservation on the disks.
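
To make that reading easier to check, here is a minimal sketch that pulls the numeric status codes out of a cluster.log excerpt and labels them with their standard Win32 names. The function name explain_statuses and the regular expression are illustrative assumptions, not part of any cluster tooling, and the table only covers the codes that show up in the log quoted below. The join failures (53, 1722, 1727) are network/RPC errors, and the arbitration failures (170, then 5086) are consistent with the quorum disk being held by someone else.

```python
import re

# Standard Win32 names (winerror.h) for the status codes seen in the quoted log.
WIN32_STATUS = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    6: "ERROR_INVALID_HANDLE",
    53: "ERROR_BAD_NETPATH (network path not found)",
    170: "ERROR_BUSY (requested resource is in use)",
    995: "ERROR_OPERATION_ABORTED",
    1722: "RPC_S_SERVER_UNAVAILABLE",
    1727: "RPC_S_CALL_FAILED_DNE",
    1753: "EPT_S_NOT_REGISTERED",
    5086: "ERROR_QUORUM_DISK_NOT_FOUND",
}

# Assumed pattern: a keyword, an optional "=", then the number. \s* also spans
# the line breaks introduced when the log was wrapped for posting.
STATUS_RE = re.compile(r"\b(?:status|error|exit\s+code)\s*=?\s*(\d+)", re.IGNORECASE)


def explain_statuses(log_text):
    """Return sorted (code, occurrences, name) tuples for every code found."""
    counts = {}
    for match in STATUS_RE.finditer(log_text):
        code = int(match.group(1))
        counts[code] = counts.get(code, 0) + 1
    return [
        (code, counts[code], WIN32_STATUS.get(code, "unknown"))
        for code in sorted(counts)
    ]


if __name__ == "__main__":
    # Hypothetical file name; point it at a saved copy of the log.
    with open("cluster.log", encoding="utf-8", errors="replace") as handle:
        for code, seen, name in explain_statuses(handle.read()):
            print(f"{code:>5}  x{seen:<3} {name}")
```

Codes that are not in the table are reported as "unknown", since some values in the log (for example the Resource Monitor's "Status = 1") are internal states and not necessarily Win32 errors.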
Regards,
John
Visit my blog:
http://msmvps.com/blogs/jtoner
"Juanma" <> wrote in message
news:ef960ed0-466f-4e87-81b2-...
Hi all
I'm having issues with a 2-node cluster. From time to time the cluster
goes down.
I've attached the log file for the node with all the errors; I can't see
any significant issues in the log file for the other node. Any help will
be highly appreciated.
Thanks so much
node 2:
f34:ffc.07/21[11:38:36.875](197601) WARN Physical Disk <Disco S:>:
[DiskArb] disk reservation thread canceled, status 995.
f34:ffc.07/21[11:38:36.875](197601) WARN Physical Disk <Disco S:>:
[DiskArb] CompletionRoutine: reservation lost! Status 995
f34:f38.07/21[11:38:36.890](197601) WARN [RM] Going away, Status = 1,
Shutdown = 0.
f34:f38.07/21[11:38:36.890](197601) ERR [RM] Active Resource =
00000000
f34:f38.07/21[11:38:36.890](197601) ERR [RM] Resource State is 1, ""
f34:104.07/21[11:38:37.390](197601) WARN Physical Disk <Disco F:>:
[PnP] RemoveDisk: disk 7d53900c not found or previously removed
f34:f38.07/21[11:38:37.390](197601) INFO Physical Disk <Disco F:>:
DiskCleanup returning final error 0
f34:104.07/21[11:38:37.390](197601) WARN Physical Disk <Disco M:>:
[PnP] RemoveDisk: disk 40dde501 not found or previously removed
f34:f38.07/21[11:38:37.390](197601) INFO Physical Disk <Disco M:>:
DiskCleanup returning final error 0
f34:104.07/21[11:38:37.390](197601) WARN Physical Disk <Disco Q:>:
[PnP] RemoveDisk: disk 7d53900f not found or previously removed
f34:f38.07/21[11:38:37.390](197601) INFO Physical Disk <Disco Q:>:
DiskCleanup returning final error 0
f34:f38.07/21[11:38:38.265](197601) INFO Microsoft Search Service
Instance <SQL Server Fulltext>: Se cerró la instancia <> con éxito
f34:f38.07/21[11:38:41.343](197601) WARN IP Address <Dirección IP de
SQL1>: Failed to delete NBT interface information from database,
status 1722.
f34:f38.07/21[11:38:41.343](197601) WARN IP Address <Dirección IP de
SQL1>: Failed to delete IP interface information from database, status
1722.
f34:e88.07/21[11:38:41.343](197601) ERR IP Address <Dirección IP de
SQL1>: WorkerThread: GetClusterNotify failed with status 6.
f34:f38.07/21[11:38:41.343](197601) ERR Network Name <Nombre SQL1>:
Unable to open handle to cluster, status 1753.
f34:f38.07/21[11:38:41.343](197601) WARN Physical Disk <Disco S:>:
Terminate, error dismounting volume \Device\Harddisk1\Partition1.
Error 5.
f34:f38.07/21[11:38:41.390](197601) INFO Physical Disk <Disco S:>:
DiskCleanup returning final error 0
f34:f38.07/21[11:38:41.390](197601) INFO Physical Disk <Disco S:>:
Terminate, Returning final error 0.
f34:104.07/21[11:38:41.390](197601) WARN Physical Disk <Disco S:>:
[PnP] RemoveDisk: WatchedList is empty
f34:f38.07/21[11:38:41.390](197601) INFO Physical Disk <Disco S:>:
DiskCleanup returning final error 0
7e0:748.07/21[11:39:37.062](197601) INFO [CS] Cluster Service started
- Cluster Node Version 4.3790
7e0:304.07/21[11:39:37.062](197601) ERR [DM] DmInitialize: The hive
was loaded- rollback, unload and reload again
7e0:304.07/21[11:39:37.328](197601) WARN [NM] Failed to open cluster
parameters key, status 2.
7e0:304.07/21[11:39:37.343](197601) INFO [NM] Setting cluster service
media sense policy to enabled.
7e0:3a0.07/21[11:40:09.343](197601) WARN [JOIN] JoinVersion data for
sponsor 1.1.1.1 is invalid, status 1722.
7e0:7ac.07/21[11:40:10.343](197601) WARN [JOIN] JoinVersion data for
sponsor 192.168.0.1 is invalid, status 1722.
7e0:730.07/21[11:40:12.359](197601) WARN [JOIN] Unable to resolve
JoinVersion endpoint for sponsor 192.168.0.250, status 1727.
7e0:6a0.07/21[11:40:12.359](197601) WARN [JOIN] Unable to resolve
JoinVersion endpoint for sponsor N1C1, status 1727.
7e0:730.07/21[11:40:12.359](197601) WARN [JOIN] JoinVersion data for
sponsor 192.168.0.250 is invalid, status 1727.
7e0:6a0.07/21[11:40:12.359](197601) WARN [JOIN] JoinVersion data for
sponsor N1C1 is invalid, status 1727.
7e0:304.07/21[11:40:12.359](197601) ERR [JOIN] Unable to connect to
any sponsor node.
7e0:304.07/21[11:40:12.359](197601) WARN [INIT] Failed to join
cluster, status 53
7e0:304.07/21[11:40:12.359](197601) INFO [FM] Group 01177c8d-da50-4fe9-
a88c-50eaebe98130 preferred owner 1.
7e0:304.07/21[11:40:12.359](197601) INFO [FM] Group 01177c8d-da50-4fe9-
a88c-50eaebe98130 preferred owner 2.
7e0:304.07/21[11:40:12.359](197601) INFO [FM]
FmpAddPossibleNodeToList: Warning, node 1 not found
7e0:304.07/21[11:40:12.359](197601) INFO [FM]
FmpAddPossibleNodeToList: Warning, node 1 not found
7e0:304.07/21[11:40:12.359](197601) INFO [FM]
FmpAddPossibleNodeToList: Warning, node 1 not found
f90:210.07/21[11:40:19.578](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:20.078](197601) WARN Physical Disk <Disco Q:>:
[DiskArb] Retry arbitration, 4 attempts left
f90:210.07/21[11:40:20.078](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:22.078](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to write (sector 12), error 170.
f90:210.07/21[11:40:22.578](197601) WARN Physical Disk <Disco Q:>:
[DiskArb] Retry arbitration, 3 attempts left
f90:210.07/21[11:40:22.578](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:29.593](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:30.093](197601) WARN Physical Disk <Disco Q:>:
[DiskArb] Retry arbitration, 2 attempts left
f90:210.07/21[11:40:30.093](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:32.093](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to write (sector 12), error 170.
f90:210.07/21[11:40:32.593](197601) WARN Physical Disk <Disco Q:>:
[DiskArb] Retry arbitration, 1 attempts left
f90:210.07/21[11:40:32.593](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to read (sector 12), error 170.
f90:210.07/21[11:40:34.593](197601) ERR Physical Disk <Disco Q:>:
[DiskArb] Failed to write (sector 12), error 170.
7e0:304.07/21[11:40:34.593](197601) ERR [FM] FmGetQuorumResource
failed, error 170.
7e0:304.07/21[11:40:34.593](197601) ERR [INIT] ClusterForm: Could not
get quorum resource. No fixup attempted. Status = 5086
7e0:304.07/21[11:40:34.593](197601) ERR [INIT] Failed to form
cluster, status 5086.
7e0:304.07/21[11:40:34.609](197601) ERR [CS] ClusterInitialize failed
5086
7e0:304.07/21[11:40:34.609](197601) WARN [INIT] The cluster service is
shutting down.
7e0:304.07/21[11:40:34.609](197601) WARN [FM] Shutdown: Failover
Manager requested to shutdown groups.
7e0:304.07/21[11:40:34.609](197601) WARN [MM] MMLeave is called when
rgp=NULL.
7e0:304.07/21[11:40:34.609](197601) ERR [CS] Service Stopped. exit
code = 5086
f90:090.07/21[11:40:34.828](197601) WARN [RM] Going away, Status = 1,
Shutdown = 0.
f90:090.07/21[11:40:34.828](197601) ERR [RM] Active Resource =
00000000
f90:090.07/21[11:40:34.828](197601) ERR [RM] Resource State is 1, ""
f90:978.07/21[11:40:35.328](197601) WARN Physical Disk <Disco Q:>:
[PnP] RemoveDisk: WatchedList is empty
f90:090.07/21[11:40:35.328](197601) INFO Physical Disk <Disco Q:>:
DiskCleanup returning final error 0
138:260.07/21[11:42:35.000](197601) INFO [CS] Cluster Service started
- Cluster Node Version 4.3790
138:53c.07/21[11:42:35.000](197601) ERR [DM] DmInitialize: The hive
was loaded- rollback, unload and reload again