Channel: VMware Communities : All Content - Site Recovery Manager
Viewing all 2572 articles

Real life disaster recovery

Hi,

 

I've worked with SRM a bit and performed a few DR runs with test/non-production VMs, but I wonder whether any of you have used SRM in a real-life DR situation.

 

I'm just curious if it worked as well as during testing and if there may be some lessons to learn.

 

 

Cheers!


Protect a VM with SRM by excluding some of the VMDKs?

Can someone help me with this?

 

Can we protect a VM with SRM while excluding some of its VMDKs (possibly RDMs)?  If yes, could I have the high-level steps, please?

Site Recovery Manager does not validate the XML response for the QuerySyncStatus command appropriately and crashes the Site Recovery Manager service with a core dump if a warning tag is sent

2014-07-03T15:59:35.369-07:00 [03372 verbose 'SraCommand' opID=71C8A340-0000007C] querySyncStatus responded with:

--> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

--> <Response xmlns="http://www.vmware.com/srm/sra/v2">

-->     <QuerySyncStatusResults>

-->         <SourceDevices>

-->             <SourceDevice id="14">

-->                 <DeviceSync id="sync-14" status="inProgress">

-->                     <Progress>10</Progress>

-->                     <RemainingTimeEstimate>71</RemainingTimeEstimate>

-->                 </DeviceSync>

-->                 <Warnings>

-->                     <Warning code="500"/>

-->                 </Warnings>

-->             </SourceDevice>

-->         </SourceDevices>

-->     </QuerySyncStatusResults>

--> </Response>

-->

2014-07-03T15:59:35.370-07:00 [03372 verbose 'PropertyProvider' opID=71C8A340-0000007C] RecordOp ASSIGN: info.progress, syncOnce26

2014-07-03T15:59:35.381-07:00 [03372 verbose 'Storage' opID=71C8A340-0000007C] XML validation succeeded

2014-07-03T15:59:35.382-07:00 [03372 verbose 'Storage' opID=71C8A340-0000007C] Device '14' is in 'another sync in-progress' state

2014-07-03T15:59:35.405-07:00 [03372 info 'Default' opID=71C8A340-0000007C] CoreDump: Writing minidump

2014-07-03T15:59:35.601-07:00 [03372 panic 'Default' opID=71C8A340-0000007C]

-->

--> Panic: Not Reached: @ d:/build/ob/bora-1315893/srm/src/storage/operations/executors/syncTracker.cpp:202

--> Backtrace:

--> backtrace[00] rip 000007fef046a04a

--> backtrace[01] rip 000007fef032f51f

--> backtrace[02] rip 000007fef03306be

--> backtrace[03] rip 000007fef048172c

--> backtrace[04] rip 000007fef048188c

--> backtrace[05] rip 000007fef0320a5f

--> backtrace[06] rip 0000000009c9d9cc

--> backtrace[07] rip 0000000009ca0551

--> backtrace[08] rip 0000000009c7313e

--> backtrace[09] rip 000000000a2983e4

--> backtrace[10] rip 000000000a282496

--> backtrace[11] rip 000000000a282d63

--> backtrace[12] rip 000000000a27d1fa

--> backtrace[13] rip 000000000a27f74c

--> backtrace[14] rip 000000000a27e23e

--> backtrace[15] rip 000000000a285225

--> backtrace[16] rip 000007fef040a9e2

--> backtrace[17] rip 000007fef0410641

--> backtrace[18] rip 000007fef0411387

--> backtrace[19] rip 000007fef0412478

--> backtrace[20] rip 000007fef040a07a

--> backtrace[21] rip 000007fef040cf89

--> backtrace[22] rip 000007fef040c383

--> backtrace[23] rip 000007fef040d8e3

--> backtrace[24] rip 000007fef040db63

--> backtrace[25] rip 000007fef047c9aa

--> backtrace[26] rip 0000000074282fdf

--> backtrace[27] rip 0000000074283080

--> backtrace[28] rip 0000000076aaf56d

--> backtrace[29] rip 0000000076ce3281
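
For anyone trying to reproduce this, here is a minimal Python sketch (not part of SRM or of any SRA; the name `summarize_sync_status` is made up for illustration). It parses the response exactly as logged above and lists the sync state and warning codes per device. It only shows that the payload is well-formed and that the `<Warnings>` block is the one unusual element; it says nothing about what syncTracker.cpp does internally.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the logged response.
SRA_NS = {"sra": "http://www.vmware.com/srm/sra/v2"}

# The querySyncStatus response from the log, without the "-->" prefixes.
RESPONSE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Response xmlns="http://www.vmware.com/srm/sra/v2">
    <QuerySyncStatusResults>
        <SourceDevices>
            <SourceDevice id="14">
                <DeviceSync id="sync-14" status="inProgress">
                    <Progress>10</Progress>
                    <RemainingTimeEstimate>71</RemainingTimeEstimate>
                </DeviceSync>
                <Warnings>
                    <Warning code="500"/>
                </Warnings>
            </SourceDevice>
        </SourceDevices>
    </QuerySyncStatusResults>
</Response>"""


def summarize_sync_status(xml_text):
    """Return one dict per SourceDevice with its sync state and warning codes."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    summary = []
    for device in root.iterfind(".//sra:SourceDevice", SRA_NS):
        sync = device.find("sra:DeviceSync", SRA_NS)
        summary.append({
            "device": device.get("id"),
            "status": sync.get("status") if sync is not None else None,
            "progress": (sync.findtext("sra:Progress", namespaces=SRA_NS)
                         if sync is not None else None),
            "warning_codes": [w.get("code") for w in
                              device.iterfind("sra:Warnings/sra:Warning", SRA_NS)],
        })
    return summary


if __name__ == "__main__":
    for entry in summarize_sync_status(RESPONSE):
        print(entry)
```

Removing the `<Warnings>` element from the SRA's response and retrying would be one way to confirm that the warning tag is what triggers the panic.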

Equallogic Synchronous Replications firmware 7.0.X with SRM 5.5.1

Hi,

 

How can I set up synchronous replication using a Dell EqualLogic PS Series group with SRM 5.5?

 

SRM needs two PS Series groups in order to configure the production site and the DR site in the SRM plug-in, but Dell EqualLogic only supports SyncRep when all PS members are in the same PS group.

 

How can I achieve this?

 

Regards,

Danilo.

Failed to validate certificate during Site Recovery Manager installation

Hi,

 

I'm having an issue while deploying VMware vCenter SRM with the certificate.

Our vCenters are both 5.5 appliances and have trusted certificates from our CAs.  The certificate replacement procedure was used for that, and it has been working fine.

 

Now we have deployed two Windows Server 2012 R2 servers on which to install vCenter Site Recovery Manager.

But we keep getting the following error:

 

Capture.JPG

 

I have tried creating the CSR with certreq and even with openssl, but the problem remains.

The openssl config was like this

 

[ req ]

default_bits = 2048

default_keyfile = rui.key

distinguished_name = req_distinguished_name

#Don't encrypt the key

encrypt_key = no

prompt = no

string_mask = nombstr

[ req_distinguished_name ]

countryName = BE

stateOrProvinceName = Vlaams-Brabant

localityName = Diegem

0.organizationName = DSI

organizationalUnitName = NOC

emailAddress = xxxx@xxxx

commonName = SRM

extendedKeyUsage = serverAuth, clientAuth

subjectAltName = DNS: srmservername01.fqdn
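
One thing that stands out in the config above: `extendedKeyUsage` and `subjectAltName` are X.509 v3 extensions, and OpenSSL normally expects them in a separate extensions section that `[ req ]` points to via `req_extensions`, not under `[ req_distinguished_name ]`. Whether that is what the SRM installer is rejecting is an assumption, not something the post confirms. A small Python sketch (the helper name `misplaced_extensions` is invented for illustration) that flags extension keys listed in the DN section:

```python
# Keys that are certificate extensions rather than distinguished-name attributes.
EXTENSION_KEYS = {"extendedKeyUsage", "subjectAltName", "keyUsage", "basicConstraints"}

# The config as posted above.
CONFIG = """
[ req ]
default_bits = 2048
default_keyfile = rui.key
distinguished_name = req_distinguished_name
#Don't encrypt the key
encrypt_key = no
prompt = no
string_mask = nombstr
[ req_distinguished_name ]
countryName = BE
stateOrProvinceName = Vlaams-Brabant
localityName = Diegem
0.organizationName = DSI
organizationalUnitName = NOC
emailAddress = xxxx@xxxx
commonName = SRM
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = DNS: srmservername01.fqdn
"""


def misplaced_extensions(config_text, dn_section="req_distinguished_name"):
    """Return extension keys that appear inside the distinguished-name section."""
    current_section = None
    found = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current_section = line[1:-1].strip()
            continue
        key = line.split("=", 1)[0].strip()
        if current_section == dn_section and key in EXTENSION_KEYS:
            found.append(key)
    return found


if __name__ == "__main__":
    print(misplaced_extensions(CONFIG))
```

If the check reports anything, moving those lines into their own section (for example one named `v3_req`, referenced from `[ req ]` with `req_extensions = v3_req`) and regenerating the CSR would be worth a try.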

 

 

Any idea? Been searching for a solution for days.

And I'm not really into moving back to Windows based vcenter servers.

 

Thanks

SRM limitations

I am looking at SRM with vSphere Replication as a possible technology to provide DRaaS for clients.  It is my understanding that SRM (using vSphere Replication) has the following limitations:

 

1. 500 VMs can be replicated by vSphere Replication.  You can have numerous replication appliances per vCenter, but the total number of protected VMs can never exceed 500? (from here: VMware KB: Operational Limits for SRM and vSphere Replication )

 

2. 10 sites per vCenter/SRM pairing, meaning that once you have 10 customers you would need to add capacity.  I am unclear on how exactly you would extend this beyond the 10 sites, however.  Would you have to add additional SRM/vCenter pairs?

 

3. Is the 500 VMs limitation spread across the 10 site limitation?  Ex: 50 VMs/site if you had 10 sites, or 100 VMs/site if you had 5 sites.


Are these limitations correct?  It appears that this software is not very well positioned for DRaaS offerings for service providers if these limitations are correct.  Maybe I am missing some potentially easy solutions?
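
To make the arithmetic in point 3 concrete, here is a tiny Python sketch. It assumes the 500 VM figure is a single shared pool split evenly across sites, which is the reading in question here and not a confirmed limit; `vms_per_site` is a made-up helper name.

```python
def vms_per_site(total_vm_limit, site_count):
    """Evenly split a shared VM limit across sites (assumed model, not a documented rule)."""
    if site_count <= 0:
        raise ValueError("site_count must be positive")
    return total_vm_limit // site_count


if __name__ == "__main__":
    for sites in (5, 10):
        print(sites, "sites:", vms_per_site(500, sites), "VMs per site")
```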

VR Licensing

Are there any licensing issues when using the vSphere Replication component of SRM?  I know that if you use the standalone version that ships with vSphere 5.1 and 5.5, you can replicate any number of VMs (up to the 1,000 VM limit) without additional licenses, provided you have a Standard, Enterprise or Enterprise Plus license.

SRM 5.5.1 Reprotect failed

I have my VMware infrastructure in place and working fine. I recently set up SRM 5.5.1 and have my vCenter servers linked. We have 2 EqualLogic arrays. All licensing is in place. Replication is working.

I have one LUN with one VM that I am testing with. I am using array-based replication and I have the SRA installed and configured. I can successfully run "Test" and "Recovery" on a recovery plan, but I cannot "Reprotect" it. I have tried changing the configuration of everything I can think of, and I have removed and reinstalled a number of components, but I keep getting the same failure at the same place (step 1). The error I get is this:

 

 

1. Configure Storage to Reverse Direction
   Error - Failed to reverse replication for device '/vol/LUN2_SRM2_vol/LUN1_SRM2'. SRA command 'reverseReplication' failed for device '/vol/LUN1_SRM2_vol/LUN1_SRM2'. SnapMirror resynchronization failed Ensure that the devices are configured correctly. Check the log files for more details.
   2014-06-26 09:49:26 (UTC 0), 2014-06-26 09:50:28 (UTC 0)
1.1. Protection Group Web-Protection
   Error - Failed to reverse replication for device '/vol/LUN2_SRM2_vol/LUN1_SRM2'. SRA command 'reverseReplication' failed for device '/vol/LUN1_SRM2_vol/LUN1_SRM2'. SnapMirror resynchronization failed Ensure that the devices are configured correctly. Check the log files for more details.
   2014-06-26 09:49:26 (UTC 0), 2014-06-26 09:50:28 (UTC 0)
   Device "/vol/LUN2_SRM2_vol/LUN1_SRM2":
   Error: "SRA command 'reverseReplication' failed for device '/vol/LUN1_SRM2_vol/LUN1_SRM2'. SnapMirror resynchronization failed Ensure that the devices are configured correctly. Check the log files for more details."
1.1.1. Configure Array-based Storage
   Error - Failed to reverse replication for device '/vol/LUN2_SRM2_vol/LUN1_SRM2'. SRA command 'reverseReplication' failed for device '/vol/LUN1_SRM2_vol/LUN1_SRM2'. SnapMirror resynchronization failed Ensure that the devices are configured correctly. Check the log files for more details.

 

The storage I use is NetApp Data ONTAP 8.1.4 7-Mode, and the error on the storage side is:

error1.PNG

 

But I have already allowed them from NetApp2:

error2.PNG

 

This is really driving me crazy. I would really appreciate any comments that could help me.

 

Thanks


Protection Group hung in "Reprotecting..." state

SRM 5.0 GA on vSphere 5.0 GA

After a Planned Migration of a Protection Group completed successfully, the follow up Reprotect operation failed.  Repeated Reprotect operations with the "Force Cleanup" option fail.  As a result, the Protection Group is left in a state of Reprotecting...  I can manually clean up on the VM and storage side, and I can also delete the associated Recovery Plan. However, the Protection Group remains in a state where I can do nothing with it. Can't delete. Can't unpair or remove SRAs at this point because of the Protection Group dependency.  Does anyone know how to clean up an SRM 5.0 Protection Group when it's stuck in this state without uninstalling and reinstalling the product?

 

Thank you,

Jas

Unable to login to VRM MOB site

Hi All,

 

I need to log in to the VRM MOB site at https://<primaryvrmserver>:8043/mob, but I'm having no luck logging in.

 

Has anyone come across this, and which account worked for you?

 

Thanks.

SRM and IBM DS3524 7.86

Hi all,

I'm getting desperate here... 5.5 Update 1, storage is 2x2 IBM DS3524 v7.86, 2 sites. Same SRA. Same free space on the DS3524s.

 

SAN 1 is OK except for one datastore, where the snapshot can't be written. I don't know why the message is in French; the system is in English. Translated from the French:

"Error - Failed to create snapshots of the replica devices. Failed to create the snapshot of replica device 60:08:0E:50:00:1B:E9:E2:00:00:03:F1:4F:06:. Expected device '60:08:0E:50:00:1B:E9:E2:00:00:03:F1:4F:06:' not found in the response of SRA command 'testFailoverStart'."

The sync is OK. The snapshot can't be written on datastore2, but it takes some time before the error appears.


SAN 2 can't write snapshots at all. Again, the message is in French although the system is in English. Translated from the French:

"Error - Failed to create snapshots of the replica devices. Internal error: std::exception 'class Dr::Xml::XmlValidateException' "Element 'TargetDevices' is not valid for content model: '(TargetDevice,)'"."

This happens for all datastores, and instantaneously.

Both SANs have the same config as far as I know. Maybe there's a way to compare them?

 

When I launch SRM, SAN 1 is 'OK', whereas SAN 2 doesn't even try: the "can't write snapshot" response comes back as soon as it is launched.

 

This message was edited by: DaFuq

How to remove cleanup message in tasks

Hi all,

 

Yesterday I was testing a failover and the subsequent cleanup of a VM in a RecoverPoint-integrated SRM setup. I can see that the cleanup froze at 70%. Today I failed over and cleaned up the same recovery plan, and it executed smoothly. Is there a way to remove the incomplete cleanup from the task window? It cannot be cancelled.

 

Please advise.

 

Regards,

 

The Outsider

Multiple arrays pairing with a single array at the DR location

In our SRM configuration, the LUNs for one of the clusters in vCenter Server are presented from a different array (same vendor) at the protected site, but they are replicated to the same array at the DR location that also receives replication for all the LUNs presented from the first array.

 

Currently, one array at the protected site is already paired with the array at the DR location. When we try to create a pair between the second array and the only array at the DR location, we get a message that the array has already been added to SRM.

 

So our question is: is it possible to pair one DR array with two different arrays of the same vendor at the protected site?

Provisioning of lun size with SRM

Hi guys, I have a question. (I know the latest vSphere 5.x versions resolve the limitation on presenting LUNs larger than 2 TB.)

My setup is a virtual machine with a 2 TB VMDK for drive D and a 1 TB VMDK for drive E.

I have extended the storage at the primary site to two 2 TB LUNs (4 TB in total), using storage replication and SRM for automation.

Will the DR site automatically pick up the LUN extension that I made at the primary site when I perform a failover?

New to SRM. How to prevent APDs?

I am new to SRM so forgive any ignorance. I have played with SRM and vSphere replication, but I will be using an SRA with Dell Compellent in production, which we have not built yet.

 

Suppose I do a planned failover of LUN01 with SRM using an SRA. It appears that the SRA will remove the mapping (SAN permissions) from the source cluster. When this happens, how do you prevent the ESX hosts from triggering an APD for the volume?


command to reboot a VM

I need to reboot a VM after it has started at the recovery site.

So for this VM I've inserted a post power-on step of type "Command on Recovered VM", with the following content:

shutdown /r /t 5 /f

 

The syntax of the command is correct, I've tested it! But when I run the test of the recovery plan I get:

Error - The command 'shutdown /r /t 5 /f ' returned a non-zero value: 21.

 

Any idea?

SRM and custom certificates

Hi

 

I have 2 vCenter 5.5 servers, and both of them have custom CA-based certificates (which I deployed with the SSL Certificate Automation Tool).

I also generated the certificate requests with the same tool, so one vCenter certificate now has the OU value "vCenterServer-hostname1" and the other vCenter certificate has the OU value "vCenterServer-hostname2".

 

Now I'm trying to setup SRM and I'm reading a document: VMware KB:     Requirements when using trusted certificates with VMware Site Recovery Manager 1.0.x/4.0.x/4.1.x/5.x

 

In this document are described requirements for SRM certificates and two of them are:

 

  • An Organizational Unit (OU) attribute, whose value must be the same as the value of this attribute in the supporting vCenter Server’s certificate.


Am I correct that I must create a certificate for the first SRM instance with the "OU = vCenterServer-hostname1" parameter and for the second instance with the "OU = vCenterServer-hostname2" parameter?


  • All OU values for vCenter and SRM certificates must match, to be copacetic with the OUs in the environment.


Now I don't understand. If my vCenter Server certificates have different OU values, is it possible to set up SRM certificates at all?

Or must I create new vCenter Server certificates with the same OU value?



Best Regards,


Margo Engel


Microsoft cluster with RDMs and SRM 5.1

Hello,

At the main site, I have 3 ESXi 5.1 Update 1a hosts with 16 RDMs attached, used by 3 MSCS clusters. I applied the fix from KB 1016106, and it worked.

When I perform a failover and failback with SRM 5.1, the HBA rescan takes a long time and hangs the ESXi hosts. Booting a host then takes 3 hours.

When I check the perennially-reserved parameter, it is reported as false.

Does SRM 5.1 have a mechanism that does this? Please help.
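
Until someone confirms whether SRM handles this, re-applying the flag after each failover or failback can be scripted. The Python sketch below only builds the `esxcli` command lines described in KB 1016106 (`setconfig ... --perennially-reserved=true`, then `list` to check the "Is Perennially Reserved" field); it does not run them. The device IDs and the helper name `perennial_reserve_commands` are placeholders for illustration, not values from this environment.

```python
def perennial_reserve_commands(device_ids):
    """Build esxcli commands that mark each RDM LUN as perennially reserved and verify it."""
    commands = []
    for device_id in device_ids:
        # Mark the device so the host skips it during rescans and boot.
        commands.append(
            f"esxcli storage core device setconfig -d {device_id} --perennially-reserved=true"
        )
        # Show the device details, including the "Is Perennially Reserved" field.
        commands.append(f"esxcli storage core device list -d {device_id}")
    return commands


if __name__ == "__main__":
    # Hypothetical identifiers; substitute the naa IDs of the MSCS RDM LUNs.
    for command in perennial_reserve_commands(["naa.EXAMPLE0001", "naa.EXAMPLE0002"]):
        print(command)
```

The commands have to be run on every host that sees the RDM LUNs, and the device IDs at the recovery site may differ from those at the main site.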

 

regards

SRM5.1 IntegrationTests::Recovery - Failing

Folks,

 

I have been running the SRM 5.1 FC certification testing for an SRA plug-in for a couple of days. So far, 83 out of 84 tests listed on the workbench have passed without any problems. However, one test, IntegrationTests::Recovery, keeps failing. This test takes about 5 hours to complete, and it fails with the errors listed below:


Failed to rescan HBAs on host 'MY_HOST_NAME'. Unable to communicate with the remote host, since it is disconnected

UTC --> [ERROR 0030 - CONFIG] SRM server task at the primary site vCenter server MY_IP_ADDRESS failed.

2014-07-22 00:07:23 UTC --> - Fault: (dr.storageProvider.fault.DatastoreRecoveryFailed) {

2014-07-22 00:07:23 UTC -->    dynamicType = <unset>,

2014-07-22 00:07:23 UTC -->    faultCause = (dr.storageProvider.fault.RecoveryVmfsVolumeNotFound) {

2014-07-22 00:07:23 UTC -->       dynamicType = <unset>,

2014-07-22 00:07:23 UTC -->       faultCause = (dr.storageProvider.fault.RecoveryDeviceNotFound) {

2014-07-22 00:07:23 UTC -->          dynamicType = <unset>,

2014-07-22 00:07:23 UTC -->          faultCause = (dr.storage.fault.HostRescanFailed) {

2014-07-22 00:07:23 UTC -->             dynamicType = <unset>,

2014-07-22 00:07:23 UTC -->             faultCause = (vmodl.fault.HostNotConnected) {

2014-07-22 00:07:23 UTC -->                dynamicType = <unset>,

2014-07-22 00:07:23 UTC -->                faultCause = (vmodl.MethodFault) null,

2014-07-22 00:07:23 UTC -->                msg = "Unable to communicate with the remote host, since it is disconnected.",

2014-07-22 00:07:23 UTC -->             },

2014-07-22 00:07:23 UTC -->             hostName = "XXXXXXX",

2014-07-22 00:07:23 UTC -->             hostSystem = 'vim.HostSystem:host-28',

2014-07-22 00:07:23 UTC -->            msg = "Failed to rescan HBAs on host 'XXXXXXX'. Unable to communicate with the remote host, since it is disconnected.",

2014-0
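
The nested `faultCause` entries in the log above read more easily as a flat chain. A minimal Python sketch that extracts them with a regular expression, using a few lines copied from the log as sample input (`fault_chain` is a made-up helper name, and the pattern is only known to fit this excerpt):

```python
import re

# Matches "Fault: (type) {" and "faultCause = (type) {", skipping the "null" terminator.
FAULT_RE = re.compile(r"(?:Fault:|faultCause =)\s*\(([\w.]+)\)\s*\{")

LOG = """
2014-07-22 00:07:23 UTC --> - Fault: (dr.storageProvider.fault.DatastoreRecoveryFailed) {
2014-07-22 00:07:23 UTC -->    faultCause = (dr.storageProvider.fault.RecoveryVmfsVolumeNotFound) {
2014-07-22 00:07:23 UTC -->       faultCause = (dr.storageProvider.fault.RecoveryDeviceNotFound) {
2014-07-22 00:07:23 UTC -->          faultCause = (dr.storage.fault.HostRescanFailed) {
2014-07-22 00:07:23 UTC -->             faultCause = (vmodl.fault.HostNotConnected) {
2014-07-22 00:07:23 UTC -->                faultCause = (vmodl.MethodFault) null,
"""


def fault_chain(log_text):
    """Return fault type names from outermost to innermost (the root cause comes last)."""
    return FAULT_RE.findall(log_text)


if __name__ == "__main__":
    print(" -> ".join(fault_chain(LOG)))
```

Read this way, the innermost fault is `vmodl.fault.HostNotConnected`, reported for the host named in `hostName`.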

 

 

It seems to me that this error is not related to the SRA plug-in but to the vCenter servers (primary or secondary), which for some reason are getting disconnected.

 

Basic SRM tunings (Advanced Setting for timeouts only) are in place and working fine.

I am working with the recommended ESXi, SRM and VC combination for the certification testing, which are 5.1 GA.


Your help is really appreciated.

 

Regards,

Andy

Using array-based replication along with vSphere Replication

Software:  SRM 5.1.2 on vCenter 5.1 U1, adding vSphere Replication 5.1.2 into the mix.

 

We are using array-based replication and want to 'cutover' our array-based Protection Groups to vSphere Replication-based PG's, and, ultimately, stop using array-based replication altogether.

 

I did not install the vSphere Replication components during the original SRM installation so I'm aware that I need to run the SRM installer in repair/maint mode and select for VR.  But in my reading, I came across the following statement on page 17 of the SRM 5.1 Installation and Configuration Guide:  NOTE: Do not attempt to configure vSphere Replication on a virtual machine that resides on a datastore that you replicate by using array-based replication.

 

I would like to know 'why'---what might happen?  Why would this be a bad thing?  Host-based replication should not be 'aware' of array-based replication and vice-versa given they are 2 totally separate replication mechanisms.

 

Ideally, I would get my VR appliances up and running, and, once VR components were installed on SRM, I could then create VR-based PG's for all of the same VM's that are currently array-based PG's, have respective RP's for them, and, once I'm satisfied with the new PG's and RP's on the VR side of things, then just delete all the array-based PG's and RP's, get rid of the SRA's, and array-level replicated devices, and call it a day.

 

We don't use SRM 'extensively' at all here, and given our WAN pipes and other factors, we have decided, after some satisfactory initial testing of VR (without SRM), to rid ourselves of the complexity/costs associated with array-based SRM.  I would just rather run them in parallel (array-based PG's/RP's and VR-based PG's/RP's) so we're not 'exposed' while I would otherwise have to Storage vMotion these same VM's off the array-replicated datastores that they now sit on and then set up new PG's based on VR.  Thanks for any feedback.
