Channel: High Availability (Clustering) forum

Scale Out File Server SMB redirection locking up CSVs


Problem - Physical hosts run Hyper-V with a vhdx located on a SOFS CSV (the Hyper-V hosts are separate machines from the SOFS cluster nodes). The CSV locks up during VM startup when SMB redirection occurs, or when trying to move a CSV with an active SMB connection between cluster nodes.

All physical hosts and VMs are Windows 2012 R2 with updates to ~July 2016
All physical hosts are Cisco C220s with latest OS updates and 1 update behind on firmware
SOFS is a two physical node cluster with SAS connected JBOD
4 CSVs exist, all exhibiting the same issue
SOFS cluster nodes have the below networks:
Mgmt - teamed 10G - no cluster use
cluster0 - single 10G nic - cluster only
cluster1 - single 10G nic - cluster only
SOFS0 - single 10G nic - cluster/client
SOFS1 - single 10G nic - cluster/client (currently set to none for troubleshooting)
Backup - Teamed 10G - no cluster use
LiveMigration - Teamed 10G no cluster use/only network for live migrations
Cluster validation runs clean
When nothing is connected to the CSV shares I can fail over CSVs and the SOFS role without any errors
Currently each CSV is used by a single HyperV server and has a single vhdx in it.

HyperV host networks
SOFS0 - single 10g nic
SOFS1 - single 10g nic
Backup Team
Mgmt Team
Customer Network Team

I believe both problems are related:
Problem 1)
CSV share is owned by SOFSA
When I boot a VM with a secondary vhdx located on the SOFS (the OS is on a local RAID disk), checking the SMBClient logs on the Hyper-V host and the SMBServer logs on the SOFS hosts I can see:
The Hyper-V host hits SOFSB.
The Hyper-V host connects and the share is seen as asymmetric/continuously available. Witness registration completes.
SOFSB issues a redirect to SOFSA.
The Hyper-V host gets the redirection request and establishes a connection to SOFSA (4 event log messages: SMB client reconnect, session reconnect, share reconnect and witness registration).
In the same second as the previous 4 SMB reconnect messages, but last in sequence (so the 5th message), a message is received to redirect to another cluster node.
The Hyper-V host loses the session and share during the reconnect; the SMB client reports it successfully moved, but there are no messages for session or share reconnect.
After 59 seconds, on SOFSA I have errors that the re-open failed (event ID 1016), client session expired.
After 60 seconds the Hyper-V host registers a request timeout due to no response from the server. The server is responding to TCP but not SMB (event ID 30809).
The Hyper-V host then immediately registers a connection to SOFSB for the share and goes through the same redirection sequence to SOFSA (which owns the share): SMB client, session reconnect, share reconnect, witness registration successful.
2 seconds later on SOFSA I have a re-open failed, the file is temporarily unavailable (event ID 1016). I can see the source/destination/share that matches what is occurring. The error just repeats every 5 seconds.
If I go and try to 'inspect' the drive from Hyper-V it times out, and on SOFSA I get a warning (event ID 30805) that the client lost its session - Error {Network Name Not Found} - The specified share name cannot be found, share name \SOFSClusterName\$IPC
Now the errors just repeat: client established session to server, lost session to server (network name not found) for server \SOFSClusterName - the same session ID appears in each connect/disconnect pair.
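
For reference, this is roughly how I pull those events on each side to line up the timestamps (a sketch; the log channel names are the standard SMB client/server channels, and the event IDs are the ones mentioned above):

# On the Hyper-V host: SMB client connectivity events (30809 = request timeout, 30805 = lost session)
Get-WinEvent -LogName 'Microsoft-Windows-SmbClient/Connectivity' -MaxEvents 100 |
    Where-Object { $_.Id -in 30805, 30809 } |
    Format-List TimeCreated, Id, Message

# On each SOFS node: SMB server operational events (1016 = re-open failed)
Get-WinEvent -LogName 'Microsoft-Windows-SmbServer/Operational' -MaxEvents 100 |
    Where-Object { $_.Id -eq 1016 } |
    Format-List TimeCreated, Id, Message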

Now the great part - 
If I go into the failover cluster (FOC) and try to move the CSV to the other node, the CSV gets stuck in offline pending.  After a few minutes any other CSVs owned by the same node go into offline pending and hang.  I can reboot and wait 10 minutes for it to finally die and fail over, or wait 20 for the FOC to completely die on both nodes of the cluster.  In the cluster logs, the SOFS node never fully releases the CSV to move.  The last message you will see related to the volume is:
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 4 to 2.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 2. Reason 7; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 2 to 1.

Normally you see:
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 4 to 2.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 2. Reason 7; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 2 to 1.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 1. Reason 5; Status 0x0.
Volume4; Volume target path \??\GLOBALROOT\Device\Harddisk39\ClusterPartition1; File System target path \??\GLOBALROOT\Device\Harddisk39\ClusterPartition1.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 1 to SetDownlevel. Local true; Flags 0x1; CountersName
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 3. Reason 3; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 3 to 4.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 4. Reason 4; Status 0x0.

The issue is consistent across all 4 CSVs I have. I believe the issue has always existed. If I get the Hyper-V hosts lined up right to initially hit the SOFS server that owns the CSV, everything boots up fine. When it doesn't, the VMs and the FOC hang, I have to go through reboots, and the VMs lose their drives so I have to reboot those as well. It is only when a host gets redirected to a different SOFS server that the issue comes up, which leads me to the next problem.
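
For anyone else hitting this, the workaround I use to line things up is roughly the following (a sketch; the CSV name "Cluster Disk 1" is a placeholder):

# On the Hyper-V host: which SOFS node did the SMB session actually land on?
Get-SmbConnection | Format-Table ServerName, ShareName, Dialect, NumOpens

# On a SOFS node: which node currently owns each CSV behind the shares?
Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State

# Workaround until the redirection behaves: move the CSV to the node the host connected to
Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node SOFSA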

Problem 2:
Assume all the VMs connected to the right SOFS CSV owner on boot and everything has been running/working fine for days/weeks/months (yes, this has been sitting around for a while as an unresolved problem). If I try to move a CSV for SOFS maintenance purposes, the CSV hangs in offline pending. Eventually the FOC hangs and I have to spend 2 hours getting things lined up right (after I do whatever I was planning on doing) so the VMs boot.

Things done/verified
Windows firewall is off
I've turned off IPv6
Removed teaming from all nodes for the SOFS0/1 and cluster0/1 networks (they used to be a Windows team rather than individual networks)
Turned off client/network access on the SOFS1 network (interface check sketch below)
Turned off the CSV balancer (in hindsight it doesn't apply anyway, due to the redirection of CSVs caused by the asymmetric storage)
Updated permissions on the SOFS shares to include the Hyper-V hosts and SOFS cluster nodes - didn't make any difference / never see access denied errors
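
The interface check I use for the SOFS0/1 items above is roughly this (a sketch using the standard SMB cmdlets):

# On the SOFS nodes: which interfaces/IPs does the SMB server advertise to clients?
Get-SmbServerNetworkInterface | Format-Table ScopeName, InterfaceIndex, RssCapable, IpAddress

# On the Hyper-V host: which interfaces does the SMB client see, and which pairs did multichannel pick?
Get-SmbClientNetworkInterface | Format-Table FriendlyName, LinkSpeed, IpAddresses
Get-SmbMultichannelConnection | Format-Table ServerName, ClientIpAddress, ServerIpAddress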

One item I don't understand: on the SOFS cluster nodes, in the SMBClient/Connectivity logs, I see that the network connection failed to the cluster addresses:

The network connection failed.
Error: {Device Timeout}
The specified I/O operation on %hs was not completed before the time-out period expired.
Server name: fe80::98f9:c138:xxxxx%32
Server address: x.x.x.x:445
Connection type: Wsk
Guidance:
This indicates a problem with the underlying network or transport, such as with TCP/IP, and not with SMB. A firewall that blocks port 445 or 5445 can also cause this issue.

The server name is the 'Tunnel adapter Local Area Connection* 12' on the other SOFS cluster node. So SOFSA generates errors connecting to SOFSB, and SOFSB generates errors connecting to SOFSA. This was occurring both before and after the cluster0/1 network interfaces were teamed.
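
Since that address is an IPv6 link-local one on a tunnel adapter, and IPv6 is otherwise disabled, these are the checks I still want to run (a sketch; the netsh lines assume the tunnel adapters are the usual ISATAP/Teredo ones):

# Which networks does the cluster know about, and what role does each have?
# (Role 0 = not used by cluster, 1 = cluster only, 3 = cluster and client)
Get-ClusterNetwork | Format-Table Name, Role, Metric, Address, AddressMask

# With IPv6 disabled, make sure the tunnel interfaces are fully off on each SOFS node
netsh interface isatap set state disabled
netsh interface teredo set state disabled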



Thanks-









"Unable to successfully cleanup" Error while configuring cluster in Windows 2012 R2 server.


I am configuring a three-node Windows 2012 R2 failover cluster. Cluster validation completes without any errors or warnings. Please help me fix this error.

Cluster: hcledu
Nodes: clstr-1.hcltrg.com, clstr-3.hcltrg.com, clstr-2.hcltrg.com
IP Address: 192.168.10.200
Started: 10/19/2016 4:10:38 PM
Completed: 10/19/2016 4:14:16 PM
Beginning to configure the cluster hcledu.
Initializing Cluster hcledu.
Validating cluster state on node clstr-1.hcltrg.com.
Find a suitable domain controller for node clstr-1.hcltrg.com.
Searching the domain for computer object 'hcledu'.
Bind to domain controller \\HCLTRG.hcltrg.com.
Check whether the computer object hcledu for node clstr-1.hcltrg.com exists in the domain. Domain controller \\HCLTRG.hcltrg.com.
Computer object for node clstr-1.hcltrg.com exists in the domain.
Verifying computer object 'hcledu' in the domain.
Checking for account information for the computer object in the 'UserAccountControl' flag for CN=hcledu,CN=Computers,DC=hcltrg,DC=com.
Enable computer object hcledu on domain controller \\HCLTRG.hcltrg.com.
Configuring computer object 'hcledu in organizational unit CN=Computers,DC=hcltrg,DC=com' as cluster name object.
Get GUID of computer object with FQDN: CN=hcledu,CN=Computers,DC=hcltrg,DC=com
Validating installation of the Network FT Driver on node clstr-1.hcltrg.com.
Validating installation of the Cluster Disk Driver on node clstr-1.hcltrg.com.
Configuring Cluster Service on node clstr-1.hcltrg.com.
Validating installation of the Network FT Driver on node clstr-3.hcltrg.com.
Validating installation of the Cluster Disk Driver on node clstr-3.hcltrg.com.
Configuring Cluster Service on node clstr-3.hcltrg.com.
Validating installation of the Network FT Driver on node clstr-2.hcltrg.com.
Validating installation of the Cluster Disk Driver on node clstr-2.hcltrg.com.
Configuring Cluster Service on node clstr-2.hcltrg.com.
Waiting for notification that Cluster service on node clstr-1.hcltrg.com has started.
Forming cluster 'hcledu'.
Unable to successfully cleanup.
An error occurred while creating the cluster and the nodes will be cleaned up. Please wait...
An error occurred while creating the cluster and the nodes will be cleaned up. Please wait...
There was an error cleaning up the cluster nodes. Use Clear-ClusterNode to manually clean up the nodes.
There was an error cleaning up the cluster nodes. Use Clear-ClusterNode to manually clean up the nodes.
There was an error cleaning up the cluster nodes. Use Clear-ClusterNode to manually clean up the nodes.
An error occurred while creating the cluster.
An error occurred creating cluster 'hcledu'.

This operation returned because the timeout period expired
To troubleshoot cluster creation problems, run the Validate a Configuration wizard on the servers you want to cluster.
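
Per the wizard's own suggestion, a possible next step is to clean the nodes up and retry the creation from PowerShell (a sketch using the names from the report above; -NoStorage keeps disks out of the initial create so a storage problem cannot stall it):

# Clean up the partially configured cluster state on each node
Clear-ClusterNode -Name clstr-1.hcltrg.com -Force
Clear-ClusterNode -Name clstr-2.hcltrg.com -Force
Clear-ClusterNode -Name clstr-3.hcltrg.com -Force

# Retry the cluster creation without touching storage
New-Cluster -Name hcledu -Node clstr-1.hcltrg.com, clstr-2.hcltrg.com, clstr-3.hcltrg.com `
    -StaticAddress 192.168.10.200 -NoStorage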

HyperV Cluster MAC Address static or dynamic


Hi,

I have a Hyper-V cluster with 9 Windows 2012 R2 nodes and roughly 200 VMs running 2012 R2 or 2008 R2.
All VMs are connected to 2 virtual switches. At the moment I am using dynamic MAC addresses.

Which is the better approach? Should I change to static MAC addresses?
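
If it helps frame the question: converting a VM's currently assigned dynamic address into a static one could look roughly like this (a sketch with a hypothetical VM name; the VM should be shut down first, and with multiple adapters each one would be handled individually):

Stop-VM -Name "VM01"

# Pin the currently assigned MAC as static so it stays the same across cluster nodes
$adapter = Get-VMNetworkAdapter -VMName "VM01" | Select-Object -First 1
Set-VMNetworkAdapter -VMName "VM01" -Name $adapter.Name -StaticMacAddress $adapter.MacAddress

Start-VM -Name "VM01"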

best regards
Thomas 


Thomas Lauer

SMB Access denied for Cluster Role Resource


Dear All,

I have a Windows 2008 R2 File Server failover cluster in production. As part of a DR failover test I have created a standalone Windows 2008 R2 server with the File Server role enabled.

Currently the file server disk is replicated to DR with a 3rd-party product; during failover the production cluster role is taken offline and the production disk is attached to the DR standalone machine.

Once the disk is attached to the DR host, I change the cluster role's DNS "A" record so its IP address points to the DR server.

When users try to access their home folder or a shared folder they get an access denied error. I tried both the \\DNS name and the FQDN (same access denied error).

When I log in to any workstation or server with a local administrator account and access the same SMB share using the \\DNS name or FQDN, it works fine.
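
In case it points anyone in the right direction: since local accounts work but domain accounts don't, my suspicion would be Kerberos, because the DNS name now points at a computer that does not hold the matching service principal name. A rough check (the names here are placeholders for the cluster role name users connect to):

# Which account currently holds the SPN for the name users connect to?
setspn -Q cifs/FileServerRoleName
setspn -Q cifs/FileServerRoleName.domain.local

# On a client that gets access denied: what ticket (if any) was issued for that name?
klist get cifs/FileServerRoleName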

Any idea? 

Unable to connect to cluster


Hi Team,

I am not able to connect to the cluster on a Windows 2012 server; I am getting the error message below.




Sivakumar Thayumanavan

Unable to set NodeAndFileShareMajority quorum setting


Set-ClusterQuorum : There was an error configuring the file share witness '\\server\SharedFolder

Unable to save property changes for 'File Share Witness'.
    The user name or password is incorrect
At line:1 char:1
+ Set-ClusterQuorum -NodeAndFileShareMajority "\\server\SharedFolder"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Set-ClusterQuorum], ClusterCmdletException
    + FullyQualifiedErrorId : Set-ClusterQuorum,Microsoft.FailoverClusters.PowerShell.SetClusterQuorumCommand

On this SharedFolder, I have given full permission to ClusterName and cluster nodes.

Let me know if you have any query.

Thanks for your help in advance!!!
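
For reference, what usually matters for a file share witness is that the cluster's own computer account (the cluster name with a trailing $) has access, not just the node accounts. A sketch of how that could be granted, assuming the share host runs Windows 2012 or later and using placeholder names/paths:

# On the server hosting the witness share: grant the cluster name account (note the trailing $)
Grant-SmbShareAccess -Name SharedFolder -AccountName 'DOMAIN\ClusterName$' -AccessRight Full -Force

# Mirror the permission on the NTFS ACL of the shared folder
$acl = Get-Acl C:\Shares\SharedFolder
$rule = New-Object System.Security.AccessControl.FileSystemAccessRule('DOMAIN\ClusterName$', "Modify", "ContainerInherit,ObjectInherit", "None", "Allow")
$acl.AddAccessRule($rule)
Set-Acl -Path C:\Shares\SharedFolder -AclObject $acl

# Then retry from a cluster node
Set-ClusterQuorum -NodeAndFileShareMajority "\\server\SharedFolder"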

disappearing iSCSI Targets


I'm new to clustering.

I have two identical computers.  Same box, same model, each with 2 NICs, same hard drives, same memory, etc.

I installed Windows Server 2012 R2 Standard on each box, then set up identical iSCSI drives and targets on each: one 50 GB iSCSI drive for the witness and one 500 GB drive for data - identical drives on each box.  These servers are NOT set up as virtual machines.

The iSCSI Initiator on each box finds its iSCSI drives, and those on the other box. 

Computer Management/Disk Management on each box sees all four drives.

The Failover Cluster Validation passes every single test.  The "Create a Failover Cluster from the Tested Hardware" is checked and run, and the "Add all available storage" is checked.

After the cluster is created, all iSCSI targets and their virtual drives are gone!  They don't show up anywhere - except within the iSCSI Virtual Disks folder on the primordial disk.
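
From what I understand this is expected: when "Add all available storage" is selected, the disks are claimed by the cluster as Physical Disk resources (placed in the Available Storage group, and kept offline/reserved on the non-owning node), which is why they vanish from the usual views. A quick way to see where they went (a sketch):

# The disks should now exist as cluster Physical Disk resources in Available Storage
Get-ClusterGroup "Available Storage" | Get-ClusterResource |
    Format-Table Name, State, OwnerNode

# Or list every Physical Disk resource in the cluster
Get-ClusterResource | Where-Object { $_.ResourceType.Name -eq "Physical Disk" } |
    Format-Table Name, State, OwnerGroup, OwnerNode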

Does anyone have suggestions???

Lew

Node Disk Management


We have a 3 TB iSCSI LUN assigned to our Exchange server in our cluster, and in the node's Disk Management it shows as two pieces: a 2048 GB CSVFS primary partition and 1024 GB of unallocated space, and we cannot make any changes to the unallocated space.

But on the Exchange server that is running as a VM on this LUN, we have 3 partitions. What I am concerned about is what will happen when we exceed 2 TB on one of the partitions. Is the node's Disk Management display of 2 partitions something I can ignore, or should I look at moving my VM off of this LUN, creating a new one, verifying that it shows all 3 TB, and then moving it back?
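
For what it's worth, a 3 TB LUN showing up as 2048 GB plus unusable unallocated space is the classic sign of an MBR partition table (which tops out at 2 TB); GPT is needed to address the full 3 TB. A quick check from the node that owns the disk (a sketch):

# PartitionStyle should be GPT for a LUN larger than 2 TB; MBR stops at 2048 GB
Get-Disk | Format-Table Number, FriendlyName, PartitionStyle,
    @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB) } }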

 


drive letter missing

 

I have a two-node 2008 SP1 Failover Cluster.

I got a problem today:

Drive letter of a cluster volume is lost when I move the cluster disk to another node.

 

“Cluster Disk 1” had one volume with drive letter “R:”. I assigned another drive letter “F:” to it on node 1.

Then I moved Cluster Disk 1 to node 2.

Its drive letter was gone!

I tried to assign a new drive letter to it. But it was gone every time I moved it to another node.

 

It never happened before.

Could anyone please tell me why it happens and how to fix it?

 

By the way, I installed latest windows updates yesterday.
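
One thing that may be worth checking (just a guess) is whether the other node still holds a stale drive-letter mapping for that volume in its mount manager database; mountvol on each node can show and clear that:

# List the volume GUIDs and the drive letters/paths this node has recorded for them
mountvol

# If the other node still maps F: (or the old R:) to a stale volume entry, remove it there
mountvol F: /D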

Guest VM simultaneous failover


Hi,

It is a requirement within our environment for certain guest VMs to always be located on the same node of a cluster as each other, so if one is migrated off, the other moves with it. Essentially they need to be "paired".

Can someone please advise on how I can do this?

Regards

Leon

Cluster VMs sometime fail while doing an export-vm


I'm using a PowerShell script to export some clustered (2012 R2 Hyper-V) VMs through Task Scheduler.

Every now and then a VM is restarted by the cluster during the Export-VM. The errors found in Event Viewer are at the end of this message. I cannot see a proper cause for the failure; is there any way to debug this problem more deeply?

I would also like to know if there is some switch I could set on the cluster resource while doing the Export-VM, to prevent the cluster from trying to restart the VM even if it is not responding for a while during the export.

The powershell script used:

$VMs = Get-VM -Name VM1, VM2, VM3 -ErrorAction SilentlyContinue
foreach ($VMName in $VMs.VMName) {
    # Remove any previous export of this VM so Export-VM does not fail on an existing folder
    Remove-Item -Path "\\fileserver\HyperVexport\$VMName" -Recurse -Force -ErrorAction SilentlyContinue
    Export-VM -Name $VMName -Path \\fileserver\HyperVexport
    if (-not $?) {
        $date = Get-Date -Format s
        "$date $VMName Export failed" | Out-File -FilePath C:\hyper-v\scripts\ExportVMs.log -Append
        Send-MailMessage -From "xxx@xx.xx" -To "xxx@xx.xx" -Subject "Export of $VMName in $env:COMPUTERNAME failed" -SmtpServer mailserver
    }
} #close foreach

Event logs from the Hyper-V host where the VM is running at the time of the failure:

Time     | Log                                                                              | Event ID | Description
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Online to state ProcessingFailure.
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state ProcessingFailure to state WaitingToTerminate. Cluster resource 'Virtual Machine VM1' is waiting on the following resources: .
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state WaitingToTerminate to state Terminating.
22:56:07 | Windows Logs/System | 1069 | Cluster resource 'Virtual Machine VM1' of type 'Virtual Machine' in clustered role 'VM1' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
22:57:07 | Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin | 21128 | 'Virtual Machine VM1' failed to shutdown the virtual machine during the resource termination. The virtual machine will be forcefully stopped.
22:57:07 | Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin | 21119 | 'Virtual Machine VM1' successfully started the virtual machine during the resource termination. The virtual machine.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Terminating to state DelayRestartingResource.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state DelayRestartingResource to state OnlineCallIssued.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlineCallIssued to state OnlinePending.
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 14070 | Virtual machine 'VM1' (ID=9510686F-BE3C-4CAA-99A5-EB756ED8DED1) has quit unexpectedly.
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 15190 | 'VM1' failed to take a checkpoint. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 15140 | 'VM1' failed to turn off. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 18350 | Export failed for virtual machine 'VM1' (9510686F-BE3C-4CAA-99A5-EB756ED8DED1) with error 'The process terminated unexpectedly.' (0x8007042B).
22:57:17 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlinePending to state Online.
22:57:17 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1201 | The Cluster service successfully brought the clustered role 'VM1' online.


In the VM1 event viewer I can only see "The previous system shutdown at ... was unexpected", so it was forcefully shut down, as can be seen from the logs above.
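
I have not found a documented "pause monitoring" switch, but one idea I am considering (an untested sketch) is to temporarily tell the cluster not to restart the VM resource on failure while the export runs, and restore the policy afterwards:

# Before the export: remember the current policy and disable restart-on-failure
$res = Get-ClusterResource "Virtual Machine VM1"
$oldAction = $res.RestartAction
$res.RestartAction = 0    # 0 = ClusterResourceDontRestart

Export-VM -Name VM1 -Path \\fileserver\HyperVexport

# After the export: put the original restart policy back
$res.RestartAction = $oldAction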
 

How can I change a clustered disk Volume Id


Hi,

Prior to clustering I have been rapidly provisioning virtual machines using SAN disk cloning tools.

As all the disks are cloned from the same master disk, the volumes on the disk share the same VolumeID. This wasn't a problem, as Windows would automatically detect the duplicate disk and leave it offline. When the disk was brought online, the VolumeID would be changed automatically.

However, with a cluster, the clashing volume might be owned by another node in the cluster, and Windows will not detect the duplicate VolumeID.

The duplicate is picked up, however, when I try to import the disk into the cluster, and the disk remains in a "failed" state.

The only way I have found to fix it is to locate the node that owns the volume with the identical VolumeID and bring the new disk online on that node prior to importing it into the cluster. This forces the VolumeID change.

I have tried to change the Win32_Volume DeviceID property in PowerShell, but it's read-only.

How can I change the VolumeID using PowerShell? There is a tool from Mark Russinovich to change the VolumeID, but it needs the disk to have a drive letter, and I don't mount the volumes.
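
I am not aware of a pure PowerShell way to rewrite the serial, but a possible workaround (a sketch with placeholder disk/partition numbers and serial; it assumes the Sysinternals Volumeid tool has been downloaded and the disk is online on this node) is to expose the volume under a temporary letter just long enough to run the tool:

# Temporarily expose the volume under a spare drive letter
Get-Partition -DiskNumber 5 -PartitionNumber 2 | Add-PartitionAccessPath -AccessPath "X:\"

# Rewrite the volume serial number with the Sysinternals tool
.\Volumeid64.exe X: 1234-ABCD

# Remove the temporary drive letter again
Get-Partition -DiskNumber 5 -PartitionNumber 2 | Remove-PartitionAccessPath -AccessPath "X:\"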

Thanks

Daniel

 

 

 

 

WS2012r2 two node failover cluster


WS2012r2 two node failover cluster
Node01 and Node02 have seven 40 GB dynamic Hyper-V SCSI-controller hard disks (vhdx) with virtual hard disk sharing enabled, saved to F:\VHDs. This location is a storage pool volume, HostClusterVD, assigned drive letter F, and both Node01 and Node02 share the seven disks through it. Cluster validation was run with a domain user and passed. In Failover Cluster Manager the cluster name was given and the cluster was created. I can see the disks in Disk Management and in File and Storage Services, but in the Failover Cluster Manager new storage pool wizard the disks are not visible for Scale-Out File Server. I have read a lot of documents but found no solution. What is the solution to this particular problem?
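
If it helps with troubleshooting: the new storage pool wizard in Failover Cluster Manager only offers disks the cluster considers poolable (blank, not already in a pool, and accessible to every node). A quick way to see how each disk is judged (a sketch):

# CanPool must be True for a disk to appear in the cluster's new storage pool wizard;
# CannotPoolReason explains why a disk is excluded (e.g. already in a pool, contains partitions)
Get-PhysicalDisk | Format-Table FriendlyName, BusType, Size, CanPool, CannotPoolReason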

Thank you


rejoining to domain clustered node


Hi,

Could I rejoin a clustered node to the domain?

thanks,

Performance AlwaysOn SQL 2014 in synchronous mode


Hello,

I have found a great article about AlwaysOn performance in synchronous mode:

https://blogs.msdn.microsoft.com/sambetts/2014/08/21/alwayson-asynchronous-vs-synchronous-commit-performance-for-sharepoint/

The short version of this article is: updates are about 2x slower than standalone for a 2-node AlwaysOn cluster; reading data performs about the same (which makes sense).

As we are currently creating a SQL 2014 AlwaysOn cluster with one synchronous replica, I have also done some performance tests. For this I used a script which inserts 20,000 rows. Please find the script at the end.

My results (script below):
Single database (standalone, not in an Availability Group): 9 seconds
Synchronized database (AlwaysOn, synchronous mode): 19 seconds

It would be very interesting for me to know whether you get similar results and what you think about this: is this high latency really normal behavior, i.e. by design?

CREATE TABLE dbo.TestTableSize
(
 MyKeyField VARCHAR(10) NOT NULL,
 MyDate1 DATETIME NOT NULL,
 MyDate2 DATETIME NOT NULL,
 MyDate3 DATETIME NOT NULL,
 MyDate4 DATETIME NOT NULL,
 MyDate5 DATETIME NOT NULL
)

DECLARE @RowCount INT
DECLARE @RowString VARCHAR(10)
DECLARE @Random INT
DECLARE @Upper INT
DECLARE @Lower INT
DECLARE @InsertDate DATETIME

SET @Lower = -730
SET @Upper = -1
SET @RowCount = 0

WHILE @RowCount < 20000
BEGIN
 SET @RowString = CAST(@RowCount AS VARCHAR(10))
 SELECT @Random = ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0)
 SET @InsertDate = DATEADD(dd, @Random, GETDATE())
 
 INSERT INTO TestTableSize
  (MyKeyField
  ,MyDate1
  ,MyDate2
  ,MyDate3
  ,MyDate4
  ,MyDate5)
 VALUES
  (REPLICATE('0', 10 - DATALENGTH(@RowString)) + @RowString
  , @InsertDate
  ,DATEADD(dd, 1, @InsertDate)
  ,DATEADD(dd, 2, @InsertDate)
  ,DATEADD(dd, 3, @InsertDate)
  ,DATEADD(dd, 4, @InsertDate))

 SET @RowCount = @RowCount + 1
END
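
If anyone wants to repeat the comparison, a simple way to time the same insert script against both databases (a sketch; it assumes the SQLPS/SqlServer module is available and uses placeholder instance, database and file names):

# Time the 20,000-row insert against the standalone database and the synchronized one
$standalone = Measure-Command {
    Invoke-Sqlcmd -ServerInstance "SQLNODE1" -Database "TestStandalone" -InputFile .\Insert20000Rows.sql -QueryTimeout 300
}
$synced = Measure-Command {
    Invoke-Sqlcmd -ServerInstance "AGListener" -Database "TestSynced" -InputFile .\Insert20000Rows.sql -QueryTimeout 300
}
"Standalone: $($standalone.TotalSeconds) s; synchronous AG: $($synced.TotalSeconds) s"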



Question about Cluster 3-Nodes


Hi All

We are creating a cluster for a customer so we can set up SQL Availability Groups. My scenario is this:

Site 1 = 2 nodes

Site 2 = 1 node, but I will set this up with no vote because it's a DR site and I don't want it to become primary unless I manually fail over.

Site 3 = Witness file share

My questions are:

1) Do I need the witness in my current configuration of 3 nodes if I set the site 2 node to have no vote?

2) Should site 2 have only 1 node, or should I have 2 nodes, both with no votes?
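
For reference, removing the DR node's vote and pointing at the witness can be scripted roughly like this (node and share names are placeholders):

# Remove the vote from the DR-site node so it can never decide quorum on its own
(Get-ClusterNode -Name "DRNode1").NodeWeight = 0

# File share witness in site 3, so the two voting nodes in site 1 survive the loss of either one
Set-ClusterQuorum -NodeAndFileShareMajority "\\site3server\ClusterWitness"

# Verify the resulting vote assignment
Get-ClusterNode | Format-Table Name, State, NodeWeight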

Thanks - any other recommendations or tips would be greatly appreciated.

Exchange 2013 (On 2008 Server R2) - Cluster issues when quorum lost


9 Exchange 2013 mailbox servers, all running on Windows Server 2008 R2. All are part of the same DAG (so just one cluster).

We inadvertently powered off 5 of these mailbox servers - bad idea. By my reckoning, due to the odd number of members, quorum should rely on a majority count of members. Since the cluster no longer had a majority, the databases should have been dismounted on the remaining servers.

This didn't happen. The Exchange databases stayed up - although the CAS servers went to 100%.

We then powered the 5 servers back up, but the cluster did not recover on its own. On each of the 4 servers that did not power down, the cluster status still showed as down. A restart of the Windows Cluster service on each of the 4 servers sorted this out.

Questions:

1. Any idea why Exchange did not dismount the databases when quorum was lost?

2. Why did the cluster not recover when the servers came back?

Storage Spaces Direct Windows Server 2016 Lab Testing physical disks issue


Alright folks, first of all thanks for the help if I get an answer.

We are always on the lookout to test new stuff, and this time we want to test Storage Spaces Direct;

in case it proves to be good, we can implement it in production.

For testing purposes I use 2 HP DL380 Gen 9 servers. I had some problems getting the disks to be recognized as SAS in Windows Server, but after a bit of Google time I found out that after deleting all of the RAID configs on the HP P440ar and putting it into HBA mode, it is possible.

Then, after upgrading the servers with the latest SPP, I am now able to see the drives come up as SAS, so I finally thought I would be able to start using S2D - but no, that's not the case.

At this point the 2 servers are identical, and they both have 2 SAS HDD 10K 146 GB drives and 4 SAS SSD 460 GB drives.

I installed Windows Server 2016 in UEFI mode on one of the 460 GB drives. You could ask me why, but I had some issues installing and I just wanted a working system to test out S2D; I didn't care about the storage at that point because after the lab I'm erasing the configuration.

At this point, all I am trying to do is execute the command Enable-ClusterStorageSpacesDirect, but I get an error straight away saying S2D is not supported on my system. I verified the drives are shown as SAS in Server Manager, but when I run the cluster validation test, and specifically the Storage Spaces Direct tests, I get the following error on one of the drives:

Disk is a boot volume. Disk is a system volume. Disk is used for paging files. Disk partition style is GPT. Disk has a System Partition. Cannot cluster a disk with a System Partition. Disk has a Microsoft Reserved Partition. Disk has a Basic Data Partition. Cannot cluster a disk with a Basic Data Partition. Disk has a Microsoft Recovery Partition. Disk type is DYNAMIC.

This error makes sense, since I had to install my OS on one of the drives, but what does that mean at the point of configuration? Did I make a mistake, or can I exclude one of the disks? I could use some help understanding this a bit better.
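
My understanding (worth verifying) is that Enable-ClusterStorageSpacesDirect only claims disks that are completely empty, so the disk carrying the OS partitions is expected to fail that check and should simply be left out. A sketch of how to see what S2D will consider eligible:

# CanPool should be True for the disks S2D can claim; the OS disk will legitimately
# show False because it carries the system/boot/recovery partitions
Get-PhysicalDisk | Sort-Object CanPool -Descending |
    Format-Table FriendlyName, SerialNumber, MediaType, BusType, CanPool, CannotPoolReason

# Once the remaining disks are clean, run this from one of the cluster nodes
Enable-ClusterStorageSpacesDirect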

Sql Server 2014

Can we configure active/passive clustering on SQL Server 2014 without any licence?

Validate Network Communication Error on UDP port 3343


The firewall is off on both sides, but I still get this error. Server1 is Windows Server 2016 and Server2 is Windows Server 2012 R2. I even added an allow-connection rule in the advanced firewall settings, both outbound and inbound.

    "Network interfaces Server1 - vEthernet (Live) and Server2 - vEthernet (Live) are on the same cluster network, yet address 172.16.0.1 is not reachable from 172.16.0.5 using UDP on port 3343."
