Channel: High Availability (Clustering) forum

Intermittent Live Migration failure generating Event ID 21502, 22038, 21111, 21024


We have a multi-node Hyper-V cluster that has recently developed an issue with intermittent live migration failures.

We noticed this when one of our CAU runs failed because it could not place the hosts into maintenance mode or successfully drain all the roles from them.

Scenario:

Place any node into Maintenance mode/drain roles.

Most VMs will drain and live migrate across onto other nodes.  Randomly, one or a few will refuse to move (which VM is affected, and which node it is moving to or from, always varies).  The live migration ends with a failure generating event IDs 21502, 22038, 21111 and 21024.  If you run the drain again, the remaining VMs will migrate, or if you live migrate them manually they move just fine.  A manual live migration can hit the same intermittent error, but rerunning it succeeds after one or two attempts, or after simply waiting a couple of minutes.

This occurs on all nodes in the cluster and can happen with seemingly any VM in the private cloud.

The pertinent content of the events is:

Event 21502
Live migration of 'VM' failed.

Virtual machine migration operation for 'VM' failed at migration source 'NodeName'. (Virtual machine ID xxx)

Failed to send data for a Virtual Machine migration: The process cannot access the file because it is being used by another process. (0x80070020).

Event 22038
Failed to send data for a Virtual Machine migration: The process cannot access the file because it is being used by another process. (0x80070020).

From this it would appear that something is locking the files, or that permissions are not being handed over properly. However, all access to the back-end SOFS is uniform across all the nodes, and the failure is intermittent rather than consistently happening on one node.
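
For reference, the drain and the follow-up event check can be scripted; a minimal sketch (the node name HOST01 is a placeholder, and the exact Hyper-V-High-Availability log channel name is an assumption to confirm):

# Sketch only: drain a node, pull the related live migration events, then resume it.
Suspend-ClusterNode -Name HOST01 -Drain -Wait           # pause the node and live migrate roles off it
Get-WinEvent -ListLog *Hyper-V-High-Availability*       # confirm the admin channel name first
Get-WinEvent -ComputerName HOST01 -FilterHashtable @{
    LogName = 'Microsoft-Windows-Hyper-V-High-Availability-Admin'   # assumed channel name
    Id      = 21502, 22038, 21111, 21024
} -MaxEvents 20
Resume-ClusterNode -Name HOST01 -Failback Immediate     # resume the node and fail roles back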

Thanks in advance!


Error 13 from ResourceControl for resource Disk Drive while adding cluster disk


Hi,

I have a drive mounted at C:\mountpoint\Kdrive. C:\ is not a cluster disk. I am trying to use the Cluster API to add this disk to the cluster, but it fails with the following errors:

00000928.00000ce8::2016/03/10-04:59:51.637 INFO  [RCM] rcm::RcmApi::CreateResource: (SQL Server (MSSQLSERVER), Disk Drive C:\mountpoint\KDrive\, 8836dfef-fa51-419d-960f-75965fed6cfd, Physical Disk)
00000928.00000ce8::2016/03/10-04:59:51.637 INFO  [RCM] rcm::RcmGum::CreateResource(Disk Drive C:\mountpoint\KDrive\,8836dfef-fa51-419d-960f-75965fed6cfd,SQL Server (MSSQLSERVER))
00000304.00000554::2016/03/10-04:59:51.678 ERR   [RES] Physical Disk <Disk Drive C:\mountpoint\KDrive\>: Open: Unable to get disk identifier. Error: 5023.
00000928.00000dc8::2016/03/10-04:59:51.678 INFO  [RCM] HandleMonitorReply: OPENRESOURCE for 'Disk Drive C:\mountpoint\KDrive\', gen(0) result 0.
00000304.00000554::2016/03/10-05:00:12.208 ERR   [RHS] Error 13 from ResourceControl for resource Disk Drive C:\mountpoint\KDrive\.
00000928.00000ce8::2016/03/10-05:00:12.208 WARN  [RCM] ResourceControl(SET_PRIVATE_PROPERTIES) to Disk Drive C:\mountpoint\KDrive\ returned 13

I have tried various syntaxes for the disk drive path (single \ and double \\), but nothing works. If I execute the same code with a path like K:\, it works fine.

Code snippet:

try
 {
  // Create the resource.  The resource name is "Disk Drive @:"
  // where @ is the drive letter of a disk partition.
  bstr_t bstr;
  UTIL_Utf8ToWideChar (szDiskPath.data(), bstr);
  int length = bstr.length ();
  lpstrDiskPathW = new WCHAR[length + 1];
  wcsncpy (lpstrDiskPathW, (const wchar_t*)bstr, length);
  lpstrDiskPathW[length] = L'\0';

  String strResName = "Disk Drive " + szDiskPath;
  UTIL_Utf8ToWideChar(strResName.data(), bstr);
  length = bstr.length ();
  lpstrResourceNameW = new WCHAR[length + 1];
  wcsncpy (lpstrResourceNameW, (const wchar_t*)bstr, length);
  lpstrResourceNameW[length] = L'\0';

  hResource = m_funcCreateClusterResource(hClusterGroup,
   (LPCWSTR)lpstrResourceNameW,
   L"Physical Disk",
   0);

  if( hResource == NULL )
  {
   m_log.error("CreateDiskResource: failed to create disk resource %s", strResName);
   throw -1;
  }
  else
  {
   m_log.info("CreateDiskResource: created disk resource %s", strResName);
  }

  // Set the diskpath private property
  // Begin property list used to set the DiskPath private property.
  WCHAR szPropName[] = CLUSREG_NAME_PHYSDISK_DISKPATH;

  typedef struct _DiskPathControl
  {
   DWORD dwPropCount;
   CLUSPROP_PROPERTY_NAME_DECLARE(PropName,sizeof(szPropName)/sizeof(WCHAR));
   CLUSPROP_SZ_DECLARE(DiskPathValue, sizeof(lpstrDiskPathW)/sizeof(WCHAR));
   CLUSPROP_SYNTAX Endmark;
  } DiskPathControl;

  DiskPathControl DPC;

  //  Property Count
  DPC.dwPropCount = 1;

  //  Property Name
  DPC.PropName.Syntax.dw  = CLUSPROP_SYNTAX_NAME;
  DPC.PropName.cbLength   = sizeof( szPropName );
  wcsncpy (DPC.PropName.sz, (const wchar_t*)szPropName, DPC.PropName.cbLength);

  //  Property Value
  DPC.DiskPathValue.Syntax.dw = CLUSPROP_SYNTAX_LIST_VALUE_SZ;
  DPC.DiskPathValue.cbLength  = sizeof( lpstrDiskPathW );
  wcsncpy (DPC.DiskPathValue.sz, (const wchar_t*)lpstrDiskPathW, DPC.DiskPathValue.cbLength);

  //  Endmark
  DPC.Endmark.dw = CLUSPROP_SYNTAX_ENDMARK;

  DWORD cbSize = sizeof( DiskPathControl );

  //  End property list creation

  // Set the diskpath private property
  dwRC = m_funcClusterResourceControl( hResource,
   NULL,
   CLUSCTL_RESOURCE_SET_PRIVATE_PROPERTIES,
   ( void* ) &DPC,
   cbSize,
   NULL,
   0,
   NULL );

  if( dwRC != ERROR_SUCCESS )
  {
   String err(dwRC);
   m_log.error("AA_ClusterBase:: CreateDiskResource: failed to set the DiskPath property, error %s", err);
   m_funcDeleteClusterResource( hResource );
   m_funcCloseClusterResource( hResource );
   hResource = NULL;
   throw -1;
  }
 }
 catch (...)
 {
 }

Is there a known limitation in the Cluster API that prevents adding disks mounted on mount points?

BTW this works fine:

C:\>cluster res "Disk W:\Mount" /priv DiskPath="W:\Mount"
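
For comparison, a sketch of the same operation through the FailoverClusters PowerShell module (resource name and path copied from the line above) would be:

# Sketch: set the DiskPath private property on an existing Physical Disk resource.
Get-ClusterResource -Name "Disk W:\Mount" |
    Set-ClusterParameter -Name DiskPath -Value "W:\Mount"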

Thanks,

Aditya

Guest VM simultaneous failover


Hi,

It is a requirement within our environment for certain guest VMs to always be located on the same cluster node as each other, so that if one is migrated off, the other moves with it. Essentially they need to be "paired".

Can someone please advise on how I can do this?

Regards

Leon

Cluster VMs sometime fail while doing an export-vm


I'm using a PowerShell script, run through Task Scheduler, to export some clustered (2012 R2 Hyper-V) VMs.

Every now and then a VM is restarted by the cluster during the Export-VM. The errors found in the event viewer are listed at the end of this message. I cannot see a clear cause for the failure; is there any way to debug this problem more deeply?

I would also like to know whether there is a switch I could set on the cluster resource while doing the Export-VM, to prevent the cluster from trying to restart the VM even if it is unresponsive for a while during the export.
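
For what it's worth, VM cluster resources expose restart-policy properties; a rough sketch of relaxing them around the export (the property name RestartAction and the value 0 for "do not restart" are assumptions to verify, and the VM/resource names are placeholders):

# Sketch only: temporarily tell the cluster not to restart the VM resource,
# run the export, then restore the original policy.
$res = Get-ClusterResource -Name "Virtual Machine VM1"
$oldAction = $res.RestartAction
$res.RestartAction = 0          # assumed to mean ClusterResourceDontRestart
try {
    Export-VM -Name VM1 -Path \\fileserver\HyperVexport
}
finally {
    $res.RestartAction = $oldAction   # put the restart policy back
}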

The powershell script used:

$VMS = Get-VM -Name VM1,VM2,VM3 -ErrorAction SilentlyContinue
foreach ($VM in $VMS.VMName) {
    # Remove the previous export of this VM before exporting again
    del \\fileserver\HyperVexport\$VM -Force -Recurse -ErrorAction SilentlyContinue
    Export-VM -Name $VM -Path \\fileserver\HyperVexport
    if (-not $?) {
        # Log the failure and send a notification mail
        $date = Get-Date -Format s
        "$date $VM Export failed" | Out-File -FilePath c:\hyper-v\scripts\ExportVMs.log -Append
        Send-MailMessage -From "xxx@xx.xx" -To "xxx@xx.xx" -Subject "Export of $VM in $env:COMPUTERNAME failed" -SmtpServer mailserver
    }
} #close foreach

Event logs from the Hyper-V host where the VM is running at the time of the failure:

Time | Log | Event ID | Description
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Online to state ProcessingFailure.
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state ProcessingFailure to state WaitingToTerminate. Cluster resource 'Virtual Machine VM1' is waiting on the following resources: .
22:56:07 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state WaitingToTerminate to state Terminating.
22:56:07 | Windows Logs/System | 1069 | Cluster resource 'Virtual Machine VM1' of type 'Virtual Machine' in clustered role 'VM1' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
22:57:07 | Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin | 21128 | 'Virtual Machine VM1' failed to shutdown the virtual machine during the resource termination. The virtual machine will be forcefully stopped.
22:57:07 | Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin | 21119 | 'Virtual Machine VM1' successfully started the virtual machine during the resource termination.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Terminating to state DelayRestartingResource.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state DelayRestartingResource to state OnlineCallIssued.
22:57:13 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlineCallIssued to state OnlinePending.
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 14070 | Virtual machine 'VM1' (ID=9510686F-BE3C-4CAA-99A5-EB756ED8DED1) has quit unexpectedly.
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 15190 | 'VM1' failed to take a checkpoint. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 15140 | 'VM1' failed to turn off. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13 | Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin | 18350 | Export failed for virtual machine 'VM1' (9510686F-BE3C-4CAA-99A5-EB756ED8DED1) with error 'The process terminated unexpectedly.' (0x8007042B).
22:57:17 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1637 | Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlinePending to state Online.
22:57:17 | Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational | 1201 | The Cluster service successfully brought the clustered role 'VM1' online.

In the VM1 event viewer I can only see "The previous system shutdown at ... was unexpected", so the VM was forcefully shut down, as can be seen from the logs above.
 

How can I change a clustered disk Volume Id


Hi,

Prior to clustering, I had been rapidly provisioning virtual machines using SAN disk cloning tools.

As all the disks are cloned from the same master disk, the volumes on the disks share the same VolumeID. This wasn't a problem, as Windows would automatically detect the duplicate and leave the disk offline; when the disk was brought online, the VolumeID would be changed automatically.

However, with a cluster, the clashing volume might be owned by another node, and Windows does not detect the duplicate VolumeID.

The duplicate is picked up, however, when I try to import the disk into the cluster, and the disk then remains in a "failed" state.

The only way I have found to fix it is to locate the node that owns the volume with the identical VolumeID and bring the disk online on that node prior to importing it into the cluster. This forces the VolumeID to change.

I have tried to change the Win32_Volume DeviceID property in PowerShell, but it is read-only.

How can I change the VolumeID using PowerShell? There is a tool by Mark Russinovich to change the VolumeID, but it requires the disk to have a drive letter, and I don't mount the volumes.

Thanks

Daniel

Hyper-V Failover Cluster - VMs migrated from Host A to Host B after Host A was shut down, but they were restarted and did not retain their running state


Hi Team,

We have a Hyper-V Failover Cluster with the below setup:

Host A

Host B

Host C

Host D

Scenarios:

1. During the shutdown of Host A, some VMs migrated to Host B successfully.

2. Some VMs migrated, but they were restarted and did not retain their running state.

Can you tell us what the possible cause of these migration issues might be? Thanks.

Node failed to join cluster - Cluster group could not be found


Hi,

I'm having problems joining a Hyper-V cluster node back to the cluster after a reboot. It gives me a critical error in the event log: Event ID 1070 - The node failed to join the failover cluster <clustername> due to error code 5013. Error code 5013 appears to mean 'cluster group could not be found'.

I have googled, but didn't find any useful answers. Perhaps some of you know how to troubleshoot this issue? Thanks...

Cluster network degradation


Hi all, I'm just wondering if anyone else has experienced this.

We have a 3-node 2012 R2 cluster, all nodes running Core. It's fine; the only validation test it seems to fail is the one about the cluster communication networks all being on the same subnet. If I set them to the same subnet the cluster test complains; if I set them to separate subnets the Dell HITKit complains, so it's lose-lose.

The cluster will tick along nicely and we can move things around, etc. Then, after some indeterminate amount of time, event log errors start popping up. These can include (sorry, the list is long): 1038, 1069, 1126, 1127, 1129, 1135, 1137, 1146, 1155, 1205, 1254, 5120, 5142.

We don't necessarily get all of these errors, and I'm not sure which ones crop up first, but it seems like the host networking gets... clogged up? That sounds daft, but if we reboot (drain) the hosts the problem is resolved, and the cluster carries on for however long.

Microsoft have, in the past, suggested settings to switch off on each host NIC (TCP Chimney offload and a bunch of other things), and the Dell HITKit is installed on anything directly accessing EqualLogic volumes. We patch the hosts and run the Dell SUU CD against them once in a while to keep drivers and firmware up to date.

I'd be grateful for any help. Like I said, however daft it sounds, it just seems like the networking gets clogged up with data after a while and the adapters freeze up.


High availability with Hyper-V


 Hi,

Our customer uses Hyper-V VMs (running Linux) on a primary site, and a multi-site cluster to support a disaster recovery plan.

They plan to install an IBM product (based on WebSphere Application Server).

My questions:

Is it necessary to install the product on both sites? How does the Hyper-V cluster work in this scenario? Does it synchronize the VM data automatically, or does the switchover have to be triggered manually?

Can the VM be cloned or replicated? In that case, I think the IP address would change, which could impact the application configuration.

Thanks a lot for your information about this subject.

MPIO with Windows Failover Cluster


Hi Team,

I am looking for some pointers on configuring MPIO for a Windows failover cluster with shared storage. Is it mandatory to install the MPIO feature before (or together with) the failover clustering feature at the Windows level, or is multipathing something that can be taken care of at the storage level? Any article that could help with configuring the Windows Server MPIO feature would be appreciated. Thanks.
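
For what it's worth, the Windows-side piece is usually just the MPIO feature plus a claim rule; a minimal sketch (assuming iSCSI-attached storage, run on each node; the array vendor may instead supply its own DSM) would be:

# Sketch: add the Windows MPIO feature and let the Microsoft DSM claim iSCSI devices.
Install-WindowsFeature -Name Multipath-IO
Enable-MSDSMAutomaticClaim -BusType iSCSI              # claim iSCSI-attached disks (a reboot may be needed)
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR     # example policy: round robin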

Regards, 

How to prep a Win2k12 R2 server for SQL Server 2014 clustering?

We are preparing to implement high availability (clustering) for SQL Server 2014 Enterprise Edition with SP2. Are there any guidelines on how to prep a Windows Server 2012 R2 machine to support it?
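
As a rough starting point (the node and cluster names below are placeholders), the Windows-side preparation generally comes down to adding the Failover Clustering feature and running validation before SQL setup:

# Sketch only; SQLNODE1, SQLNODE2 and SQLCLU01 are placeholder names and the address is an example.
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
Test-Cluster -Node SQLNODE1, SQLNODE2                                  # run cluster validation
New-Cluster -Name SQLCLU01 -Node SQLNODE1, SQLNODE2 -StaticAddress 10.0.0.50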

Validate SCSI-3 Persistent Reservation failed during Cluster validation


Hi Team,

The 'Validate SCSI-3 Persistent Reservation' check failed during cluster validation.

The cluster nodes are running on the VMware virtualization platform, and the shared storage is from an EMC array mapped as pass-through LUNs directly to the cluster node VMs. Does anyone have pointers on what the issue could be here? Is there anything that needs to be corrected or checked in Windows Server, or is this more of a VMware/storage-related issue? Any pointers will be appreciated. Thanks.
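
If it helps while investigating, the storage tests can be re-run on their own; a small sketch (node names are placeholders, and the -Include value should match the test category name shown in the validation report):

# Sketch: re-run only the storage validation tests against the cluster nodes.
Test-Cluster -Node NODE1, NODE2 -Include "Storage"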

Regards,



Monitoring Server (Opmanager) shows clear/online status for one of the MS SQL Server 2012 on Windows 2012 R2 virtual machines


Environment:

  • OpManager monitoring server, reachable across multiple WAN connections, installed on subnet 10.250.1.xx
  • 3 x MS SQL Server 2012 Enterprise Edition installed on two MS Windows 2012 R2 virtual machines in a clustered environment
  • 1 x MS SQL Server is on the 10.15.16.x subnet
  • 2 x MS SQL Server are on the 10.15.18.xx subnet

No issues:

  • No issue from the monitoring application to the SQL Server on the 10.15.16.xx subnet; the status shows "Online"
  • No issue from the monitoring application to any other server on the 10.15.18.xx subnet

Issue:

  • The problem is with one of the SQL servers on the 10.15.18.xx subnet
  • The monitoring application shows "Online" status for only one of the SQL servers. If I restart the SQL server whose status is "Critical", it updates to "Online" after the restart, and the other SQL server, previously "Online", changes to "Critical"
  • Basically, one of the SQL servers in the clustered environment on subnet 10.15.18.xx is always shown with "Critical" status by the monitoring server
  • I can ping in both directions to and from the server showing "Online"
  • I cannot ping in either direction to and from the server showing "Critical"
  • I can trace in both directions to and from the server showing "Online"
  • I cannot trace in either direction to and from the server showing "Critical"
  • There are no errors in the event logs

I have done my own troubleshooting and also posted on the OpManager forums, with no luck finding a resolution.

I believe it is more of a cluster issue when the service restarts.

Any ideas on a resolution, please?


Muhammad Mehdi

Cannot add a second node!


Hi 

I have a two-node Hyper-V cluster. One of my nodes failed (crashed), so I reinstalled the server and set it up as it was before the crash occurred. I tried to re-add it, but it keeps failing with the following error messages:

On the server I tried to add:

[QUORUM] An attempt to form cluster failed due to insufficient quorum votes. Try starting additional cluster node(s) with current vote or as a last resort use Force Quorum option to start the cluster. Look below for quorum information,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR   [QUORUM] To achieve quorum cluster needs at least 2 of quorum votes. There is only 1 quorum votes running
00000fcc.00000204::2016/09/07-16:20:48.958 ERR   [QUORUM] List of running node(s) attempting to form cluster: VM01,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR   [QUORUM] List of running node(s) with current vote: VM01,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR   [QUORUM] Attempt to start some or all of the following down node(s) that have current vote: EXITINGVM, EXISTINGVM0,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR   join/form timeout (status = 258)
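
For reference, the Force Quorum path the log mentions can be driven from PowerShell; a cautious sketch (VM01 is the surviving node named in the log, REBUILTNODE is a placeholder for the reinstalled server, and forcing quorum should be a last resort):

# Sketch only: start the cluster service on the surviving node with forced quorum,
# then try re-adding the rebuilt node.
Start-ClusterNode -Name VM01 -FixQuorum
Add-ClusterNode -Name REBUILTNODE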

Any help from you will be appreciated and thanks in advance.

Team up iSCSI networks in a Windows 2012 R2 failover cluster


Hi Team,

I have a failover cluster with three nodes. We are using iSCSI initiators to connect to SAN storage. Each node has two iSCSI network adapters.

I need your suggestion: is it feasible or advisable to team the two iSCSI networks? Currently they are not teamed.

Also, the cluster use of the iSCSI networks is set to 'None'; is that right?
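
For reference, the "cluster use" setting of a network corresponds to its Role property on the cluster network object; a small sketch (the network name "iSCSI1" is a placeholder):

# Sketch: list the cluster networks, then set an iSCSI network to "None" (Role = 0).
Get-ClusterNetwork | Format-Table Name, Role, Address
(Get-ClusterNetwork -Name "iSCSI1").Role = 0     # 0 = None, 1 = cluster only, 3 = cluster and client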

Regards,

KR

Assigning Permission to Cluster


Hi Team,

We have installed the Windows failover cluster with an account that has domain admin rights.

Once the failover cluster was created, we handed cluster access over to the SQL admin. We have a SQL installation user account named 'SQL_admin'. The SQL admin logs into the node using the SQL_admin account but is not able to connect to the cluster.

The SQL_admin account has been added to the local Administrators group on each node that is part of the cluster. I logged into one of the cluster nodes with a domain admin account and tried to add the 'SQL_admin' account under 'Cluster permissions', but I am not able to do so; it fails with an error (screenshot attached).

We are getting the attached error and need some pointers on providing full access to the SQL_admin account so that they can start installing SQL instances.
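
For reference, cluster-level permissions can also be granted from PowerShell; a minimal sketch (the domain name CONTOSO is a placeholder, and this may not address the specific error in the screenshot):

# Sketch: grant the SQL_admin account full cluster access, then verify.
Grant-ClusterAccess -User CONTOSO\SQL_admin -Full
Get-ClusterAccess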

Any help would be highly appreciated. Thanks

Regards,

MPIO and moving the tempdb to a different disk


We need to move the SQL Server tempdb database of our SQL Server cluster to a new partition on the same disk. I've found some instructions that discuss moving tempdb; however, I'm trying to find out whether there will be any additional issues as a result of the change.

1. The disk resides on SAN storage; are there any additional steps required regarding MPIO?

2. Are there any steps that need to be carried out on the secondary (failover) cluster node?

(current steps identified)

USE master
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = tempdev, FILENAME = 'd:\datatempdb.mdf')
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = templog, FILENAME = 'e:\datatemplog.ldf')
GO

stop the SQL server instance

move files to the new location

restart the instance


(Environment Details)

OS: Windows Server 2008 R2

Platform: SQL Server 2008 R2

Clustering: Windows Failover Cluster Manager
