Channel: VMware Communities : All Content - Site Recovery Manager

SRM failover - customize IP (srm5.0.2)


All,

 

I have been testing failovers before SRM goes into production.

 

I have been having issues with the customization of the IPs.

We have Linux VMs.

 

I noticed that the default value for customization is 600 seconds. If I am looking at the right setting, mine is also set to 600 seconds.

 

I am receiving this error:

 

Error Connecting to VM vm-135: Operation timed out after '20.000000' seconds

 

 

Any help would be great!

 

thanks


SRM 5.1.1 with Windows 2012


I am installing SRM 5.1.1 on Windows 2012.

UAC is off, I am a local Administrator and I still run the .exe as Administrator.

I get through the install wizard and right at the end, I get the Error Message 1 (attached).

I looked through the logs; nothing looks suspicious and there are no real error messages.

I took out the space in my destination install folder for "Program Files" and made it "ProgramFiles", and ended up with Error Message 2 (attached). This tells me that the install doesn't like spaces in the destination folder. I tried putting the destination folder in quotes, and that doesn't work either.

Is anyone else running into this? For now I have to get this installed, so I will take out all the spaces, but if anyone finds an actual fix, I would highly appreciate it being posted here.

 

Thanks!

New vCenter Server


Pretty sure the answer is going to be "recreate the protection groups and recovery plans", but I will ask anyway.

 

We are moving to a new install of vCenter and will be pointing the current SRM environment at the new vCenter. Is there any way of getting around recreating all the protection groups and recovery plans? The new vCenter will look exactly like the old one as far as resource pools and so on go. The VMs are not changing storage locations, but they will be vMotioned around in order to upgrade the hosts.

 

Thanks

MSCS - Microsoft NLB cluster


Hi,

We have a 4-node NLB cluster in the Production site and we are planning to protect it with SRM 5.1.1.

vSphere 5.1.x supports up to 5 Microsoft Cluster Server (MSCS) nodes, but SRM 5.1.x supports 2 MSCS nodes.

 

Is it possible to have a 4-node NLB cluster in production and protect only two of the nodes with SRM? We only need two nodes for DR.

SRM 5.1.1 strange behavior


Hi all,

 

I installed a DR infrastructure as follows:

 

protected site:

1 VM vCenter 5.1 with SQL Express 2008 R2 SP1

HP SRA on the vCenter server for EVA storage

1 VM SRM 5.1.1 with SQL Express 2008 R2 SP1

 

recovery site exactly the same:

1 VM vCenter 5.1 with SQL Express 2008 R2 SP1

HP SRA on the vCenter server for EVA storage

1 VM SRM 5.1.1 with SQL Express 2008 R2 SP1

 

SRA is configured.

I downloaded the plugin, configured SRM (mappings, recovery groups, recovery plan), tested the recovery plan, and everything worked...

 

BUT

 

The day after, when I connected to the vCenter, SRM was not available in Home ==> Inventory ==> Solutions and Applications. The plugin was no longer installed, but it was listed under the available plugins.

When I tried to reinstall the plugin, I got the error "SRM plugin is already installed"...

 

And here is the tricky part. I uninstalled the plugin on each vCenter, rebooted it, tried to install the plugin again, and got the same error: "SRM plugin is already installed"...

I uninstalled and reinstalled the 2 SRM servers. Same error...

 

I don't know what to do next... If someone has an idea, it would be very much appreciated.

Clarification SRM permissions


Hi

 

I am having a go at installing SRM. I have added the required SRM roles to the permissions tab under the Site Recovery plugin for both the protected and recovery sites.
There is also an administrator role specified, since it is required.
Do I also need to assign my SRM roles to the vCenter objects? From what I understand, SRM actually uses the account that was used to make the site connection, which to me says I would not need to specify them again.

SRM5.1 / Storwize V7000 6.4.1.4 Command not supported : [discoverArrays]


Hi All

 

Production Site:

IBM Storwize V7000 and FW level: 6.4.1.4

SRM ver: VMware-srm-5.1.1-1082082

SRA : IBMSVCSRA_v2.2.0.130422

IP: 192.168.70.50

vCenter 5.1 U1

 

DR Site:

IBM Storwize V7000 and FW level: 6.4.1.4

SRM ver: VMware-srm-5.1.1-1082082

SRA : IBMSVCSRA_v2.2.0.130422

IP: 192.168.70.60

vCenter 5.1 U1

 

When I try to add an array, this error occurs:

 

SRA command 'discoverArrays' failed. Invalid Array ID.

Refer to IBM SAN Volume Controller troubleshooting

 

And SRM log is attached.

 

This is the part of the log that I noticed.

 

Sorry for my bad English.

 

-->   <Name>discoverArrays</Name>

-->   <OutputFile>C:\Windows\TEMP\vmware-SYSTEM\sra-output-12-119</OutputFile>

--> <StatusFile>C:\Windows\TEMP\vmware-SYSTEM\sra-status-13-225</StatusFile>

-->   <LogLevel>verbose</LogLevel>

-->   <LogDirectory>C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs\SRAs\IBMSVC</LogDirectory>

-->   <Connections>

-->     <Connection id="primary">

-->       <Addresses>

--> <Address id="spA">192.168.70.50:5989</Address>

--> <Address id="spB">192.168.70.60:5989</Address>

-->       </Addresses>

--> <Username>***</Username>

--> <Password>***</Password>

-->     </Connection>

-->   </Connections>

--> </Command>

2013-06-20T14:34:33.395+03:00 [02020 verbose 'SysCommandWin32' opID=258F28AC-0000001A] Starting process: "C:\Program Files\VMware\VMware vCenter Site Recovery Manager\external\perl-5.14.2\bin\perl.exe" "C:/Program Files/VMware/VMware vCenter Site Recovery Manager/storage/sra/IBMSVC/command.pl"

2013-06-20T14:34:33.395+03:00 [02020 verbose 'SraCommand' opID=258F28AC-0000001A] Resetting SRA command timeout to '300' seconds in the future

2013-06-20T14:34:33.395+03:00 [02020 verbose 'SraCommand' opID=258F28AC-0000001A] Listening for updates to file 'C:\Windows\TEMP\vmware-SYSTEM\sra-status-13-225'

2013-06-20T14:34:33.395+03:00 [02020 verbose 'PropertyProvider' opID=258F28AC-0000001A] RecordOp ASSIGN: info.progress, dr.storage.StorageManager.createArrayManager1

2013-06-20T14:34:33.441+03:00 [02444 verbose 'DrTask' opID=258F28AC-0000001A] Created VC task 'com.vmware.vcDr.dr.storage.StorageManager.createArrayManager:task-3901'

2013-06-20T14:34:34.065+03:00 [01496 info 'SraCommand' opID=258F28AC-0000001A] discoverArrays's stdout:

--> discoverArrays

--> Per Haz 20 14:34:33.909:[Info]Executing Module[ Constant ]Inatall Path: [C:\Program Files\VMware\VMware vCenter Site Recovery Manager\storage\sra\IBMSVC\]

--> Per Haz 20 14:34:33.909:[Info]Executing Module[ Constant ]SRA Name: [IBM System Storage SAN Volume Controller Adapter for VMware vCenter Site Recovery Manager] uuid: [38295F72-F7D0-4A0F-B8D4-FF9821AB2675] version: [2.2.0.130422]

--> Per Haz 20 14:34:34.19:[info]Executing Module[ SRARunner ]commandName file: [discoverArrays]

--> Per Haz 20 14:34:34.19:[verbose]Executing Module[ SRARunner ]logLevel: [verbose]

--> Per Haz 20 14:34:34.34:[verbose]Executing Module[ SRARunner ]outout file: [C:\Windows\TEMP\vmware-SYSTEM\sra-output-12-119]

--> Per Haz 20 14:34:34.34:[fatal]Executing Module[ SRARunner ]Command not supported : [discoverArrays]

-->

2013-06-20T14:34:34.065+03:00 [01496 error 'SraCommand' opID=258F28AC-0000001A] discoverArrays's stderr:

--> com.ibm.cstl.hsg.sra.svc.SRAException: Command not supported : [discoverArrays]

-->          at com.ibm.cstl.hsg.sra.svc.SRARunner.execute(SRARunner.java:100)

-->          at com.ibm.cstl.hsg.sra.svc.SRARunner.main(SRARunner.java:173)

-->

2013-06-20T14:34:34.065+03:00 [01496 verbose 'SraCommand' opID=258F28AC-0000001A] Stopped listening for updates to file 'C:\Windows\TEMP\vmware-SYSTEM\sra-status-13-225'

2013-06-20T14:34:34.065+03:00 [01496 verbose 'SraCommand' opID=258F28AC-0000001A] Cancelling SRA command timeout

2013-06-20T14:34:34.065+03:00 [01496 info 'SraCommand' opID=258F28AC-0000001A] discoverArrays exited with exit code 0

2013-06-20T14:34:34.065+03:00 [01496 verbose 'SraCommand' opID=258F28AC-0000001A] discoverArrays responded with:

--> <?xml version="1.0" encoding="UTF-8"?>

--> <Response xmlns="http://www.vmware.com/srm/sra/v2">

-->     <Error code="4"/>

--> </Response>
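
A note on reading the excerpt above: the log reports "exited with exit code 0" even though the SRA response carries an `<Error code="4"/>` element, so the process exit code alone does not tell you whether the command succeeded. As a minimal sketch (not part of SRM or the IBM SRA; the function name `sra_error_codes` is made up for illustration), the response body can be checked like this once the `--> ` log prefixes are stripped:

```python
import xml.etree.ElementTree as ET

# Response text copied from the log excerpt above, with the "--> " prefixes removed.
SRA_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<Response xmlns="http://www.vmware.com/srm/sra/v2">
    <Error code="4"/>
</Response>
"""

# Namespace taken from the xmlns attribute in the logged response.
SRA_NS = {"sra": "http://www.vmware.com/srm/sra/v2"}


def sra_error_codes(response_xml):
    """Return the code attribute of every <Error> directly under <Response>."""
    root = ET.fromstring(response_xml.encode("utf-8"))
    return [err.get("code") for err in root.findall("sra:Error", SRA_NS)]


codes = sra_error_codes(SRA_RESPONSE)
print(codes)
```

The sketch only extracts the code; what code 4 means is up to the SRA, and here the stdout and stderr lines in the same log ("Command not supported : [discoverArrays]") are the more informative part.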

SRM 5.1-941848/NetApp RDMs - Always getting errors


This is a bit frustrating. Since day 1 of deployment, SRM has NEVER been consistent in behavior. The engineer who originally deployed the install stated that, in his experience, it's just not something that always runs from beginning to end without some sort of "issue". Most of the time we can just push through a failure, but at other times we have to spend days, if not weeks, troubleshooting, being bounced around between the storage vendor and VMware with each side claiming the issue is not theirs.

 

Within 5 months I've reinstalled SRM about 6 times now. I can do it in my sleep at this point.

 

Calling VMware, waiting 1-2 days for a response, and then waiting another week or so for the next one after logs are collected is not ideal anymore, so I'm hoping I can figure out this issue here within the community.

 

Sorry for the rant. I just feel that at this point we should have completed our production failover test, and instead I still find myself battling what seem like the same old errors.

 

So, on with the error. We are using the latest SRM build (5.1.0-941848) and the latest SRA from NetApp. When attempting a test recovery of a VMFS datastore with guests that have RDMs, I get the following error:

 

Error - Failed to sync data on replica device '/vol/volumename/qt1/lun1'. Device found is neither of the SAN type nor of the NAS type. Ensure that the device exists on the storage array and is of type NAS or SAN.

 

I found a few articles that walk you through disabling fastpath on the filers and also changing the storage value from the default of 5 connections down to 2. The second option fixed this issue the last time I ran into it. When it happened again, VMware support said the volume in question had no room left for snapshots. I was advised to clear some up and then retry the test. This time I have done all of the above and I still see the issue.

 

We have a production failover test scheduled for the weekend after next. At this point I hate to push this back as it wouldn't be the first time we did.

 

 

 

And FYI, for those of you who believe a TEST is a substantial "test": it's not. Doing a full-blown RECOVERY compared to just a TEST is night and day. In our experience, we have run many successful tests only to have them blow up when doing a RECOVERY.


concurrent recoveries

VMware SRM Reprotect Fails with Peer array ID provided in the SRM input is incorrect


I have the following setup

 

Site-A

VMware vCenter 5.1 U1

VMware SRM 5.1.1

NetApp SRA 2.0.1P2

FAS3140C - Data ONTAP 8.1.2P4

 

Site-B

VMware vCenter 5.1 U1

VMware SRM 5.1.1

NetApp SRA 2.0.1P2

FAS3140C - Data ONTAP 8.1.2P4

 

I have created a basic protection group at Site-A containing a single VM with a single vmdk hard disk on an NFS volume.

The NFS volume is SnapMirrored to Site-B.

I can perform a planned migration to Site-B, reprotect, and then perform another planned migration back to Site-A. But when I attempt to reprotect again, so that I am ready for another recovery to Site-B, the reprotect fails on the first step, Configure Storage to Reverse Direction, with "Error - Failed to reverse replication for failed over devices. SRA command 'prepareReverseReplication' failed. Peer array ID provided in the SRM input is incorrect Check the SRM logs to verify correct peer array ID."

 

I cannot see anything in the filer logs or in the SRM logs to indicate what the issue is.

 

This happens for every Protection Group I create so it is not isolated to just this one volume.  I have also tried with iSCSI VMFS volumes and get exactly the same results.

 

If I create a protection group at Site-B, I can recover it to Site-A but then cannot reprotect it to fail it back to Site-B.

 

Initially I thought the issue was that I could do a failback but couldn't perform a second reprotect because the SnapMirrors were left in the wrong state, but now I can see that the issue is that I cannot perform a reprotect from Site-A to Site-B.

 

I have completely un-installed SRM at both locations, removed the SRM database at both locations and started again but still get the same issue.

 

I've actually got IBM N series N6040 controllers and am using the IBM-branded Data ONTAP and SRA. I have a call open with VMware and IBM but am not getting very far.

 

Has anyone seen this issue before and found a solution?

SRM and NLB


Can anyone tell me whether SRM 5.1 supports IP injection for Microsoft NLB? I am having an issue while injecting multiple IPs on a single NIC.

 

Primary Site

vCenter 5.0

SRM 5.1.1

ESXi 4.1

 

DR Site

vCenter 5.0

SRM 5.1.1

ESXi 5.0

SRM 5.1 configuration


Hi all, I am currently trying to upgrade my SRM 4.1 environment to SRM 5.1. My question is the following.

 

Can SRM 5.1 work in a mixed environment? I was able to work with a mixed environment of ESX 3.5 and ESX 4.0 and VC Server 2.5 and vCenter 4.0 with no issues, but I've never done a 4.1 and 5.1 mix.

 

I currently have two environments with ESX 4.1 hosts and VC Server 4.1. What I want to do is upgrade the recovery site VC Server first and run it with VC Server 5.1 (SRM 5.1) and ESX 4.1 hosts. I will leave the protected site as is, at VC Server 4.1 and ESX 4.1 hosts, until I'm able to do the upgrade. Does the SSO feature or any other feature in VC Server 5.1 interfere with any of the components of VC Server 4.1?

 

Please let me know if you need any more information. Any help will be greatly appreciated. I will continue to research, but a push in the right direction would be very welcome.

 

Thanks

SRM 5.1/V7000 with SRA 2.1.0.x fails during 'testFailover'.


Current scenario:

SRM and vCenter versions 5.1 (in both sites)

ESXi versions 5.1 (all hosts)

Storage: IBM Storwize V7000 with 7.1.0.1 code level (both sites)

SRA Configuration: 'non pre-configured' mode

Test MDisk: 0

 

Problem description:

 

When I run the recovery test, it synchronizes both storage systems successfully and creates a writable snapshot (FlashCopy) on the backup storage, but then it stops with an error:

 

Error - Failed to recover datastore 'LUN01_RAID5_VMs'. VMFS volume residing on recovered devices '"0"' cannot be found. Recovered device '0' not found after HBA rescan.

 

I noticed that the SRA creates the FlashCopy on the backup storage but doesn't map the snapshot to the backup site hosts, which causes the problem.

 

Any suggestions or ideas? Thank you!

Unable to add array manager.....Timed out (300 seconds) while waiting for SRA to complete "discoveryArrays" command.


Hi,

 

For my product qualification with vSRM, I have installed:

1. VMware-srm-5.1.1-1082082.exe on two servers

2. vCenter Server from VMware-VIMSetup-all-5.1.0-880471.iso (on two servers)

3. Two ESXi 5.1.0.799733 servers

 

On the SRM server, I have installed:

1. EMC_VNX_SRA_v5.0.1_64bit.exe

2. EMC_Mirrorview_Enabler_for_VNX_SRA_v5.0.22_64bit.exe

3. EMC_VNX_Replicator_Enabler_for_VNX_SRA_v5.0.15_64bit.exe

 

All of these are running on Windows 2008 R2 SP1 Enterprise Edition.

CX4-240 and CX4-480 are the arrays being used.

 

Now, when I try to add an array through the array manager in SRM, the following error message appears:
"Unable to add array manager.....Timed out (300 seconds) while waiting for SRA to complete "discoveryArrays" command."

Even though the timeout period was increased to 1500, the same error is observed.

 

This error is stopping me from proceeding with the qualification test. Kindly assist me in addressing this issue at the earliest.

 

Regards,

Srini

Extn: 785-5862

Phone: +91-80-67375862

Email: srinidhi.venkobarao@emc.com

DR Solutions



Hi,

What DR and backup solutions are available from VMware for critical servers, and how do they work?

 

 

 

Regards,

Ragesh


Best location for vcenter VM and placeholder datastores


Hello,

 

I have an SRM installation with 2 sites. The sites are separated by a metro Ethernet layer 2 trunk.

 

Site A

1 EqualLogic PS6100

2 ESX servers configured as a cluster

There is a single datastore volume on the SAN holding all VMs in this cluster

There is one instance of vCenter at this site that resides on the same datastore as the other VMs

There are also a few other volumes on the SAN used for RDM disks within the VMs

I have also created a small 1GB volume as a VMFS datastore and have it mapped within SRM to use as a placeholder datastore

     What are your thoughts on this?  Is this the best way?  Could I use local storage on the ESX servers?  Or can my main VM datastore be used for placeholders?

SRM is installed on the same VM as vCenter

SRM is using SQL Express on the local vCenter VM

vCenter is also using SQL Express

 

Site B

1 EqualLogic PS4100

1 ESX server, so no clustering

There is a single datastore volume on the SAN holding all VMs running on the ESX server

Per SRM requirements, there is a separate vCenter instance running as a VM, and it is not in linked mode.  Should it be? / Does it need to be?

The number of volumes on the SAN and their purposes are the same as in Site A; the only difference is that the volume sizes are smaller because fewer VMs run at this site.

Replication has been set up and verified to work bidirectionally.

     We are using array-based replication

 

OK, so here is the problem. When I do a planned recovery, the process shuts down all VMs at that site, which includes the vCenter/SRM server. I assume this happens because it is on the same datastore as my other VMs?

 

If I run recovery in disaster mode, then vCenter is moved to the other site and powered on. Because the sites are layer 2 and on the same VLAN, SRM shows the sites as connected, since the 2 SRM instances can see each other even though they are now running at the same site. This does not seem correct to me.

 

Can anyone suggest a better configuration?

 

Thanks,

Brian

Recovery of oracle servers at same time


I have two Oracle servers that are used by our in-house application.

We have SRM 5.1.1 with vSphere Replication enabled for the VMs.

 

In order for the application to work, we need to recover both Oracle servers at the same time during a test recovery. If I run the recovery plan at 5:30 with a 15 min RPO, we want both Oracle database servers to come up as they were at 5:15. If one server comes up as it was at 5:10 and the other comes up as it was at 5:15, then the application will not work. How do we configure this in SRM 5.1.1?
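
To make the concern concrete: an RPO limits how old each VM's replica may be, and as far as I understand it each VM is replicated on its own schedule, so two VMs can be recovered to different points in time. The sketch below is plain arithmetic on the times from the post, not an SRM or vSphere Replication API; the VM names, the date, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_point_skew(last_sync_by_vm):
    """Gap between the newest and the oldest recovery point across the VMs."""
    points = list(last_sync_by_vm.values())
    return max(points) - min(points)


def data_age(last_sync_by_vm, failover_time):
    """How old each VM's replicated data is at the moment of failover."""
    return {vm: failover_time - t for vm, t in last_sync_by_vm.items()}


# Hypothetical last-replicated times matching the example in the post.
last_sync = {
    "oracle-db-1": datetime(2013, 6, 20, 5, 10),
    "oracle-db-2": datetime(2013, 6, 20, 5, 15),
}
failover_time = datetime(2013, 6, 20, 5, 30)

skew = recovery_point_skew(last_sync)
ages = data_age(last_sync, failover_time)
print("skew between servers:", skew)
for vm, age in ages.items():
    print(vm, "data age at failover:", age)
```

In this example the two servers are 5 minutes apart, which is exactly the situation the application cannot tolerate; a skew of zero is what the question is asking SRM to guarantee.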

vSphere Replication Protection Groups query


Does this apply to vSphere Replication Protection Groups?
-- "When you create a protection group for array-based replication, you specify array information and then SRM computes the set of virtual machines into a datastore group. Datastore groups contain all the files of the protected virtual machines."--

 

Two recovery plans for the same datastore?
In the recovery site I have a datastore named "VMFS1" with 5 VMs. All VMs are replicated via vSphere Replication 5.1. Is it possible to have one recovery plan with VMs "VM1 and VM2" and another recovery plan with VMs "VM3, VM4 and VM5", and to run both recovery plans in test mode at the same time? I am not sure how SRM will handle this, because recovery plan A would already have created a snapshot of datastore VMFS1, and recovery plan B would not be able to create a snapshot of datastore VMFS1 at the same time. Any ideas?

 

Does this apply to vSphere Replication?

ONE DS GROUP = ONE PROTECTION GROUP

Once a datastore group is mapped to a protection group it has been benched, meaning you can't use that DS within any other PG. It's cool when you don't have only one datastore group that in essence encompasses all of your replicated datastores. And this is what can happen if you have multiple VMs spread across multiple replicated datastores. ONE DS GROUP = ONE PROTECTION GROUP... Failover is now all or none.
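
The quoted rule can be written down as a simple check. This is only a sketch of the rule as stated for array-based replication; whether it also applies to vSphere Replication protection groups is the open question of this post. The data layout and the function name `find_shared_datastore_groups` are hypothetical and do not come from SRM.

```python
from collections import defaultdict


def find_shared_datastore_groups(protection_groups):
    """Return each datastore group claimed by more than one protection group."""
    owners = defaultdict(list)
    for pg_name, ds_group in protection_groups.items():
        owners[ds_group].append(pg_name)
    return {ds: pgs for ds, pgs in owners.items() if len(pgs) > 1}


# Hypothetical layout from the question: two groups of VMs on datastore VMFS1.
proposed = {
    "PG-A (VM1, VM2)": "VMFS1",
    "PG-B (VM3, VM4, VM5)": "VMFS1",
}

conflicts = find_shared_datastore_groups(proposed)
print(conflicts)
```

Under the "ONE DS GROUP = ONE PROTECTION GROUP" rule, the proposed split is flagged because both protection groups claim VMFS1.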

Site Recovery Manager getting disconnected quite often and taking long time to get connected?


Namaste to all,

 

Whenever I connect to SRM through the vSphere Client, it takes a long time to connect and I get an error message (please see the attached image); after 4 or 5 attempts I am able to connect. Later it gets disconnected automatically, and once again it takes at least 10 minutes and multiple attempts to connect to SRM. I am using version 5.1. Kindly let me know why this is happening. Thanks in advance.

What is happening behind the scenes, when you run a Test Recovery Plan, Planned Migration, Unplanned Failover?


It has always bugged me not to know what really happens when you run a test recovery plan, a planned migration, and an unplanned failover.

 

I would appreciate it if someone could shed some light on what really happens when you start the above-mentioned processes.

 

I presume all works the same...

 

Thanks,

Rushdi


