Friday, December 10, 2010

Thin-on-thin

Should you use a thin VMDK disk on top of thin provisioning at the storage array level?
VMware says "You can..."

Here is another post:
So… What’s right - thin provisioning at the VMware layer or the storage layer? The general answer is that it is BOTH.

If your array supports thin provisioning, you’ll generally get more efficiency using the array-level thin provisioning in most operational models.

1. If you thick provision at the LUN or filesystem level, there will always be large amounts of unused space until you start to get it highly utilized - unless you start small and keep extending the datastore - which operationally is heavyweight, and generally a PITA.
2. When you use thin provisioning techniques at the array level using NFS or VMFS and block storage you always benefit. In vSphere all the default virtual disk types - both Thin and Thick (with the exception of eagerzeroedthick) - are “storage thin provisioning friendly” (since they don’t “pre-zero” the files). Deploying from templates and cloning VMs also use Thin and Thick (but not eagerzeroedthick, as was the case in prior versions).
3. Thin provisioning also tends to be more efficient the larger the scale of the “thin pool” (i.e. the more oversubscribed objects) - and on an array, this construct (every vendor calls it something slightly different) tends to be broader than a single datastore - and therefore the efficiency factor tends to be higher.

Obviously if your array (or storage team) doesn’t support thin provisioning at the array level – go to town and use Thin at the VMware layer as much as possible.

What if your array DOES support Thin, and you are using it that way - is there a downside to “Thin on Thin”? Not really, and technically it can be the most efficient configuration – but only if you monitor usage. The only risk with “thin on thin” is that you can have an accelerated “out of space condition”.

Source:
First paragraph on page 11 of the VMware document
Performance Study of VMware vStorage Thin Provisioning
www.vmware.com/pdf/vsp_4_thinprov_perf.pdf

Thin on Thin? Where should you do Thin Provisioning – vSphere 4.0 or Array-Level?
http://virtualgeek.typepad.com/virtual_geek/2009/04/thin-on-thin-where-should-you-do-thin-provisioning-vsphere-40-or-array-level.html
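
A quick way to picture the monitoring requirement mentioned above is an oversubscription check on the thin pool. A minimal Python sketch, assuming hypothetical pool numbers and thresholds (this is not a vSphere or array API, just the arithmetic):

# Minimal sketch: estimate oversubscription of a thin pool backing thin VMDKs.
# All names and numbers are hypothetical illustrations, not an actual API.

def oversubscription_report(pool_capacity_gb, pool_used_gb, provisioned_vmdks_gb,
                            warn_ratio=1.5, warn_used_pct=80):
    """Flag an approaching 'out of space' condition on a thin pool."""
    provisioned_total = sum(provisioned_vmdks_gb)
    ratio = provisioned_total / pool_capacity_gb          # how far we are oversubscribed
    used_pct = 100.0 * pool_used_gb / pool_capacity_gb    # how full the pool actually is
    alerts = []
    if ratio > warn_ratio:
        alerts.append(f"oversubscribed {ratio:.1f}x (provisioned {provisioned_total} GB)")
    if used_pct > warn_used_pct:
        alerts.append(f"pool {used_pct:.0f}% full - risk of out-of-space")
    return ratio, used_pct, alerts

# Example: 2 TB pool, 1.4 TB actually written, 3.5 TB of thin VMDKs provisioned.
print(oversubscription_report(2048, 1434, [512, 1024, 2048]))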

A thin disk shows no performance degradation once datastore space has been allocated

An interesting observation: once its blocks have been allocated on the underlying VMFS datastore, a thin-format VMDK virtual disk shows no performance degradation compared to the Thick Zeroed or Thick Eager Zeroed formats.

Another interesting point is the following:
In VMware Infrastructure 3.5, the CLI tools (service console or RCLI) could be used to configure the virtual disk format to any type, but when created via the GUI, certain configurations were the default (with no GUI option to change the type)

* On VMFS datastores, new virtual disks defaulted to Thick (zeroedthick)
* On NFS datastores, new virtual disks defaulted to Thin
* Deploying a VM from a template defaulted to eagerzeroedthick format
* Cloning a VM defaulted to an eagerzeroedthick format

This is why the creation of a new virtual disk has always been very fast, but in VMware Infrastructure 3.x cloning a VM or deploying a VM from a template (even with virtual disks that are nearly empty) took much longer.

Source:
Page 3, section Thin Disks in the VMware document Performance Study of VMware vStorage Thin Provisioning
http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf

Thin on Thin? Where should you do Thin Provisioning – vSphere 4.0 or Array-Level?
http://virtualgeek.typepad.com/virtual_geek/2009/04/thin-on-thin-where-should-you-do-thin-provisioning-vsphere-40-or-array-level.html

Thursday, November 4, 2010

VAAI - vStorage APIs for Array Integration, or a confirmation of the quality of NetApp solutions

The VAAI plug-in for VMware vSphere 4.1 exposes SCSI primitives that allow the storage array to offload certain VMware operations at the metadata level for better performance. In essence, three new SCSI commands are added, which the array must be able to handle.
They are:
Full Copy – an XCOPY-like function to offload copy work to the array
Write Same – speeds up zeroing out of blocks or writing repeated content
Atomic Test and Set – an alternative to locking the entire LUN

The ATS primitive reduces the number of commands required to successfully acquire an on-disk lock.
VAAI calls for an ATS primitive to atomically modify a sector on disk without the use of SCSI reservations and the need to lock out other hosts from concurrent LUN access.

http://blogs.vmware.com/kb/2010/11/how-vstorage-apis-for-array-integration-change-the-way-storage-is-handled.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+VmwareKnowledgebaseBlog+%28Support+Insider%29
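
To illustrate why ATS helps, here is a purely conceptual Python sketch contrasting a LUN-wide reservation with an atomic test-and-set on a single lock sector; the classes and names are hypothetical and model only the locking idea, not the actual SCSI command flow:

import threading

class Lun:
    """Toy model of a VMFS LUN: many metadata lock sectors plus a LUN-wide lock."""
    def __init__(self, sectors):
        self.lun_lock = threading.Lock()     # reservation-style: blocks everything on the LUN
        self.lock_sectors = [0] * sectors    # 0 = free, otherwise owner id
        self._atomic = threading.Lock()      # stands in for the array's atomicity guarantee

    def reserve_whole_lun(self, owner, sector):
        """Old style: reserve the whole LUN while acquiring one on-disk lock."""
        with self.lun_lock:
            if self.lock_sectors[sector] == 0:
                self.lock_sectors[sector] = owner
                return True
            return False

    def atomic_test_and_set(self, owner, sector):
        """ATS style: compare-and-swap one sector; the rest of the LUN stays usable."""
        with self._atomic:
            if self.lock_sectors[sector] == 0:
                self.lock_sectors[sector] = owner
                return True
            return False

lun = Lun(sectors=8)
print(lun.atomic_test_and_set(owner=1, sector=3))  # True  - lock acquired
print(lun.atomic_test_and_set(owner=2, sector=3))  # False - already held, retry later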


There is a storage solution that integrated these functionalities back in 2007!
It is NetApp:
http://blogs.netapp.com/virtualstorageguy/2010/07/vmware-vsphere-vaai-demo-with-netapp.html

More info:
vStorage APIs for Array Integration FAQ
http://kb.vmware.com/kb/1021976
Search for the string VAAI in the Full Storage Compatibility Guide document to find the arrays that can use the VAAI primitives.
http://partnerweb.vmware.com/comp_guide/pdf/vi_san_guide.pdf

List of storage arrays supporting VAAI:
http://v-reality.info/2010/10/list-of-vaai-capable-storage-arrays/

Wednesday, October 27, 2010

Vizioncore and the Virtual Appliance concept

Vizioncore is coming with a new ESXi backup platform that will be built on a Virtual Appliance.

I recommend reading a nice document explaining why Vizioncore is a good backup solution:

Backup 2.0: How Virtualization Has Launched Next Generation Strategies for Data Protection Technology.
http://vizioncore.com/media/backup20/documents/WhitePaper.pdf

Link to the document discussing the new platform built on a Virtual Appliance:
http://vizioncore.com/sites/default/files/8-13-TBV-VA-Final-US-EH20100813.pdf

The Virtual Appliance concept will be available in Q1/11.

Wednesday, October 20, 2010

Caution when configuring VMware View 4.5: do not use deduplication with Local Mode!

Beware of using deduplication with Local Mode in VMware View 4.5. This setting corrupts the virtualized desktop during the transition between Remote and Local Mode.

Source: http://kb.vmware.com/kb/1028195

Friday, September 17, 2010

Should you use vMA as a log server?

Source:
Using vMA As Your ESXi Syslog Server
http://www.simonlong.co.uk/blog/2010/05/28/using-vma-as-your-esxi-syslog-server/

Using vMA as a syslog server to collect ESX and ESXi logs
http://kb.vmware.com/kb/1024122

It is advisable to use a dedicated log server, for example the syslog-ng solution: http://www.balabit.com/network-security/syslog-ng/

Elastic Sky X

ESX = Elastic Sky X

http://tim-mann.org/gallery/2006sep

What is the recommendation for Transparent Page Sharing with virtualized Terminal Services?

vSphere’s ability to overcommit VM memory and de-duplicate memory through transparent page sharing (TPS) is highly useful for the consolidation of many VMs on a single server, especially within Server Hosted Virtual Desktop scenarios. Nevertheless, one of the older Terminal Server best practices floating around the internet communities was to disable TPS. Project VRC phase 1 showed that disabling TPS improved performance by 5%. This is understandable, since TPS works through a background process that scans memory, and this consumes a modest amount of CPU. However, the performance impact of TPS was only visible under full CPU loads. TPS has no performance impact under normal conditions.
Before the update of this whitepaper, Project VRC concluded: when the primary objective is to maximize the number of users with TS workloads and there is enough physical memory available, it is recommended to disable TPS. However, this VRC recommendation should not be understood as an overall recommendation to disable TPS. For instance, when maximizing the number of VMs is the main goal (this is quite common, e.g. VDI and rather typical server consolidation efforts), TPS can be very helpful. It is important to note that VMware does not recommend disabling TPS; their publications have shown TPS does not impact performance.

Source:
Virtual Reality Check
http://www.projectvrc.nl/

Thursday, September 16, 2010

VLAN security

Try not to use VLANs as a mechanism for enforcing security policy. They are great for segmenting networks, reducing broadcasts and collisions and so forth, but not as a security tool.

If you MUST use them in a security context, ensure that the trunking ports have a unique native VLAN number.

Source:
Intrusion Detection FAQ: Are there Vulnerabilities in VLAN Implementations? VLAN Security Test Report
http://www.sans.org/security-resources/idfaq/vlan.php

The security of VLAN technology has proven to be far more reliable than its detractors had hoped for and only user misconfiguration or improper use of features have been pointed out as ways to undermine its robustness.

The most serious mistake that a user can make is to underestimate the importance of the Data Link layer, and of VLANs in particular, in the sophisticated architecture of switched networks. It should not be forgotten that the OSI stack is only as robust as its weakest link, and that therefore an equal amount of attention should be paid to any of its layers so as to make sure that its entire structure is sound.

Source:
VLAN Security White Paper
http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a008013159f.shtml

Monday, September 13, 2010

How to get the vSphere Command-Line Interface (vSphere CLI) working

Help with the error message that appears on first use of the vSphere Command-Line Interface (vSphere CLI) from a Microsoft Windows environment.

* Open your CLI command prompt as Administrator. Type ppm and hit enter (Perl Package Manager).
* Now look for a module called Crypt-SSLeay. You’ll see that CLI’s bundled ActivePerl distribution includes version 0.53, but there is a newer version 0.57 available (ActiveState Perl PPM).
* Remove this as shown, then go to File -> Run Marked Actions
* Click on the grey box icon on the left of the toolbar. These are available packages which are not currently installed. Search for Crypt-SSLeay once again, install, and Run Marked Actions. Exit.


You can then try a first example command:

vicfg-nics.pl --server serveresxbratislava --username "fero" --password "alfanumerickyretazec" --list

The command lists the physical network adapters.

Source:
vSphere Command-Line Interface Documentation
http://www.vmware.com/support/developer/vcli/

vSphere CLI libeay32.dll error on Windows
http://pcloadletter.co.uk/2010/07/27/vsphere-cli-libeay32-dll-error/

Follow the VMware User Group, the so-called VMUG: Slovak User Group

Greetings everyone,
I just want to point out the existence of the VMware User Group, the so-called VMUG: Slovak User Group.
I contributed the following proposal to this community:
http://communities.vmware.com/thread/284315

Link to the VMware User Group (VMUG): Slovak User Group:
http://communities.vmware.com/community/vmug/forums/emea/slovak

Tuesday, August 24, 2010

How does VMware Fault Tolerance work at the moment the primary server fails?

Question:
What exactly happens at the application level inside a virtualized VM when the vLockstep interval latency is large and the primary suddenly crashes? How is this execution lag replayed on the secondary when the primary suddenly goes down? Is it the case that, even with a large vLockstep execution lag, the secondary host always has the log entries locally, prepared for execution, and is just waiting to actually execute them on its own CPU?

Answer:
"If the primary VM fails, the backup VM should similarly go live, but the process is a bit more complex. Because of its lag in execution, the backup VM will likely have a number of log entries that it has received and acknowledged, but that have not yet been consumed because the backup VM hasn’t reached the appropriate point in its execution yet. The backup VM must continue replaying its execution from the log entries until it has consumed the last log entry. At that point, the backup VM will stop replaying and start executing as a normal VM."

"Since it is no longer a backup VM, the new primary VM will now produce output to the external world when the guest OS does output operations. During the transition to normal mode, there may be some device-specific operations needed to allow this output to occur properly. In particular, for the purposes of networking, VMware FT automatically advertises the MAC address of the new primary VM on the network, so that physical network switches will know on what server the new primary VM is located. In addition, the newly promoted primary VM may need to reissue some disk IOs."

I am also attaching an explanation from Krishna Raj Raja, Senior MTS, Performance Group,
VMware Inc.:

The vLockstep protocol ensures that the secondary always "at the least" has all the information to continue from the point where the primary made its last externally visible I/O (i.e. the last network transmit or disk write). This is accomplished by delaying (i.e. holding) network transmit and disk write operations at the primary until the primary gets an acknowledgement from the secondary that it has received the events preceding the I/O. (Note this doesn't mean that the secondary's execution has to be current; the secondary could still be lagging in execution and could be running from the log buffer.)

So if the primary dies at any point, the secondary will continue to run in replay mode until the log buffer becomes empty and then it will go live. From that point onwards the secondary will have its own non-deterministic execution. All of this is transparent to the guest OS and applications running on top of it. From the application and operating system perspective nothing is changed; it is always running and it wouldn't know that the execution has changed from deterministic mode to non-deterministic mode. From the client (the machine that is communicating with the VM) perspective, there would be a slight delay when the failover happens while the secondary catches up to the go-live point (because the secondary doesn't do any external I/O when it is not live). This delay is what we term the vLockstep interval. It would appear to the client that the VM is taking more time to respond. To avoid excessive delays we always make sure that the vLockstep interval never exceeds 1 sec. If it exceeds one sec, we throttle the primary to slow down so that the secondary can catch up.
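
As a conceptual summary of the rule described above (externally visible output is held until the secondary acknowledges the preceding log entries; on failover the secondary drains its log buffer and only then goes live), here is a toy Python sketch; all names are hypothetical and this models only the idea, not VMware's implementation:

from collections import deque

class BackupVM:
    """Toy model of the FT secondary described above (illustrative only)."""
    def __init__(self):
        self.log_buffer = deque()   # acknowledged but not yet replayed log entries
        self.live = False

    def receive_log(self, entry):
        self.log_buffer.append(entry)
        return "ACK"                # the primary may release the matching output after this

    def primary_failed(self):
        # Replay everything already acknowledged, then go live as a normal VM.
        while self.log_buffer:
            self.replay(self.log_buffer.popleft())
        self.live = True            # from here on, execution is non-deterministic

    def replay(self, entry):
        pass                        # deterministic replay of one logged event

backup = BackupVM()
for e in ["net-rx#1", "timer#2", "disk-irq#3"]:
    backup.receive_log(e)
backup.primary_failed()
print(backup.live)                  # True - the backup has caught up and gone live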

Friday, August 6, 2010

How to calculate the number of disk I/O operations?

From time to time I spend time looking for information on how to calculate IOPS (I/O operations per second) for hard drives.

In the following table I outlined how to calculate the IOPS a disk can handle. For this calculation you need at least TWO values. Most of the time you will be given the RPM (rotations per minute) and the average seek time. You can also work with rotational latency or I/O time – you just need to rearrange the formulas.



Based on this approach you can easily calculate the data transfer rate (MBPS):



As you can easily see, larger I/O requests yield higher throughput in MBPS. Due to this fact Oracle recommends the SAME principle (SAME = stripe and mirror everything) with a stripe size of 1 MB.
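
The calculation referred to above, as a small Python sketch: rotational latency is half a revolution, IOPS is roughly 1 / (average seek time + average rotational latency), and throughput is IOPS times the I/O size (media transfer time is ignored). The drive numbers are illustrative only:

def disk_iops(rpm, avg_seek_ms):
    """Approximate IOPS of a single spindle from RPM and average seek time."""
    avg_rotational_latency_ms = 0.5 * (60000.0 / rpm)   # half a revolution, in ms
    io_time_ms = avg_seek_ms + avg_rotational_latency_ms
    return 1000.0 / io_time_ms

def throughput_mbps(iops, io_size_kb):
    """Data transfer rate in MB/s for a given I/O request size."""
    return iops * io_size_kb / 1024.0

iops = disk_iops(rpm=15000, avg_seek_ms=3.5)          # a typical 15k drive
print(round(iops))                                    # ~180 IOPS
print(round(throughput_mbps(iops, io_size_kb=8), 1))  # small I/Os -> low MB/s
print(round(throughput_mbps(iops, io_size_kb=1024)))  # 1 MB I/Os -> much higher MB/s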

Source:
Calculate IOPS (or IO/s) for hard disks
http://ronnyegner.wordpress.com/2009/09/23/calculate-iops-or-ios-for-hard-disks/

Wednesday, August 4, 2010

What is substantially new in VMware vSphere 4.1?

Source:
Huge amount of VMware updates
http://www.ivobeerens.nl/?p=403


http://searchvirtualdatacentre.techtarget.co.uk/news/column/0,294698,sid203_gci1516576,00.html

More sources:
Changes to vMotion in vSphere 4.1
http://kb.vmware.com/kb/1022851

vStorage APIs for Array Integration

vStorage APIs for Array Integration as presented by NetApp.

http://www.youtube.com/watch?v=fryVOg7ohAk


More sources:
A good article for understanding VAAI, i.e. the vStorage APIs for Array Integration
http://www.virtuallyghetto.com/2010/07/script-automate-vaai-configurations-in.html

vSphere APIs for storage

How does VDDK compare with vStorage APIs for Data Protection?

The vStorage APIs for Data Protection (VADP) is a backup framework that enables off-host, efficient, centralized backup and restore of vSphere virtual machines. The VDDK is focused on efficient access and transfer of data on virtual disk storage. It is one of the two key components of vStorage APIs for Data Protection. The VDDK can be used in conjunction with other key VADP component (vSphere SDK) as a framework to enable efficient backup and restore of vSphere virtual machines.

Source:
http://www.vmware.com/support/developer/vddk/VDDK-1.2-FAQ.html

An explanation of the cryptic acronyms in the descriptions of licensed vSphere 4.1 functionality

Source: http://www.virtuallyghetto.com/2010/08/vmware-api-related-acronyms.html

Thursday, July 8, 2010

Licensing and management of the VMware ESXi 3.5 free edition and standalone VMware ESXi 4.x

If you need to manage standalone VMware ESXi 4.x through VMware vCenter, you must purchase vSphere licenses for the given number of sockets of the ESXi servers.

Information from the VMware Knowledge Base:

Adding an ESXi host to vCenter Server 4.0 fails with the error: Host cannot be added to the VCenter as there are not enough Virtual Center Agent Licenses

If an ESXi host is licensed with the free version of the license key, you cannot add it to vCenter Server. This license does not contain the VirtualCenter Agent, which is necessary to manage a host with vCenter Server. This feature remains locked as long as the host is licensed with the free version of the license key.

To add ESXi hosts to vCenter Server, you must license the ESXi hosts with:

* vSphere Essentials

* vSphere Standard, Advanced, Enterprise, or Enterprise Plus


Source:
Adding an ESXi host to vCenter Server 4.0 fails with the error: Host cannot be added to the VCenter as there are not enough Virtual Center Agent Licenses
http://kb.vmware.com/kb/1018275

VMware KB links for licensing standalone VMware ESXi 4.x and the ESXi 3.5 free edition:

Licensing ESX 4.0, ESXi 4.0, and vCenter Server 4.0
http://kb.vmware.com/kb/1010839

Licensing the free edition of ESXi 3.5
http://kb.vmware.com/kb/1006481

Managing an ESXi host with the ESXi Management kit
Note that the ESXi Management Kit has been replaced by a new generation, the so-called vSphere Essentials.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1011567

Wednesday, July 7, 2010

Interesting products from VMware

Interesting new products for VMware vCenter:

VMware vCenter Configuration Manager

(formerly EMC Ionix Server Configuration Manager) Its policy-driven automation detects deep system changes and identifies whether that change is within policy - an expected and acceptable behavior based on industry, regulatory, or your own self-defined best practices - or whether that change has created a compliance violation or security vulnerability.

http://www.vmware.com/products/configuration-manager/

Automate configuration management across virtual and physical servers, workstations, and desktops with VMware vCenter Configuration Manager. Increase efficiency by eliminating manual, error-prone and time-consuming work.

• Avoid configuration drift by automatically detecting and comparing changes to policies
• Maintain continuous compliance with out-of-the box templates and toolkits
• Automate and optimize server provisioning and application stack deployment in the datacenter

Also take a look at:

VMware vCenter Application Discovery Manager

Quickly and accurately map your application dependencies to accelerate datacenter moves, precisely plan infrastructure consolidations and confidently virtualize your business critical applications.

VMware Service Manager

VMware Service Manager is a 100% web-architected solution that automates IT Service Management processes in enterprise organizations. VMware Service Manager is independently verified to the highest level of ITIL compatibility for Incident Management, Problem Management, Change Management, Release Management, Configuration Management, Service Level Management and Availability Management.

Source: http://www.ntpro.nl/blog/archives/1530-New-VMware-vCenter-Configuration-Service-and-Application-Manager.html

Testing ESX server memory

Source: http://www.ntpro.nl/blog/archives/1529-Memory-Performance-Tester.html

Tuesday, July 6, 2010

Data compression, deduplication and Single Instance Storage

Single Instance Storage

'Single Instance Storage' is sometimes referred to by some as 'File Level Deduplication'. 'Single Instance Storage' refers to the ability of a file system (or data storage container) to identify two or more identical files and to retain the multiple external references to the file while storing a single copy on disk.

Many of us have used or accessed technologies which leverage 'Single Instance Storage' as it has been the primary storage savings technology with Microsoft Exchange Server 5.5, 2000, 2003, & 2007. If you are familiar with Exchange Server you probably recall that the ability of 'Single Instance Storage' to reduce file redundancy is limited to the content within an Exchange database (or mailstore). In other words, multiple copies of a file may exist, but each individual database will only maintain a single copy.

Are you aware that Exchange Server 2010 has discontinued support for 'Single Instance Storage'? Seems Microsoft has left it to the storage vendors to provide capacity savings.

Data Deduplication

'Data Deduplication' is best described as block, or sub-file level deduplication, which is the ability to reduce the redundancy in two or more files which are not identical. Historically the storage and backup industries have used the term 'Data Deduplication' specifically to mean the reduction of data at the sub-file level. I'm sure many of you use technologies which include 'Data Deduplication' such as systems from NetApp, Data Domain, or Sun Microsystems.

With 'Data Deduplication' data is stored in the same format as if it was not deduplicated with the exception that multiple files share storage blocks between them. This design allows the storage system to serve data without any additional processing prior to transferring the data to the requesting host.

In summary, 'Data Deduplication' is an advanced form of 'Single Instance Storage'. It exceeds the storage savings provided by 'Single Instance Storage' by deduplicating both identical and dissimilar data sets.

Data Compression

Probably the most mature technology of the bunch is 'Data Compression'. I'm sure we are all familiar with this technology as we use it every day in transferring files (ala WinZip) or maybe even dabbled with NTFS compression with some of your Windows systems.

In the example below we have two virtual machines, each running the same Guest Operating System, yet unique objects in their security realm, and storing dissimilar data sets. This example represents common deployments of VMware, KVM, Hyper-V, etc... With 'Data Compression' the data comprising the VMs is rewritten into a dense format on the array. There is no requirement for the data to be common between any objects.

As data which has been compressed is not stored in a format which can be directly accessible by the requesting host, it falls onto the storage controller to decompress the data prior to serving it to a host. This process will add latency to the storage operations.

Many of you may be surprised to know that NetApp arrays provide both 'Data Deduplication' and 'Data Compression'. I'll share more on the latter in my next post; however, relative to this discussion I can share with you that while we see performance increases with 'Data Deduplication', 'Data Compression' does add an additional performance tax to the storage system.

Note, these technologies are mutually inclusive, so compressed data sets gain the advantage of TSCS to help offset the performance tax.

In summary, 'Data Compression' is a stalwart of storage savings technologies which can provide savings unavailable with 'Single Instance Storage' or 'Data Deduplication'. Because of the performance tax of 'Data Compression' one should restrict its usage to data archives and NAS file services.

Wrapping Up This Post

Storage savings technologies are all the rage of the storage and backup industries. While every vendor has their own set of capabilities, it is in the best interest for any architect, administrator, or manager of data center operations to have a clear understanding of which technology will provide benefits to which data sets before enabling these technologies. Saving storage while impeding the performance of a production environment is a sure-fire means to updating one's resume.

Suffice to say these technologies are here, and they are reshaping our data centers. I hope this post will help you to better understand what your storage vendor means when he or she states that they offer 'deduplication'.


Source: http://blogs.netapp.com/virtualstorageguy/2010/06/data-compression-deduplication-single-instance-storage.html

What is the ESXi concept about?

Source: http://www.virtualinsanity.com/index.php/2010/07/02/understanding-esxesxi-equivalency-are-we-there-yet/

Why are there no recommended privileges when creating roles in vCenter Server 4.0 Update 1?

Source: http://kb.vmware.com/kb/1018261

Tuesday, June 29, 2010

NUMA architecture

VMware ESX is NUMA aware and will schedule all of a VM's vCPUs on a ‘home’ NUMA node. However, if the VM container size (vCPU and RAM) is larger than the size of a NUMA node on the physical host, NUMA crosstalk will occur. It is recommended, but not required, to configure your maximum Zimbra VM container size to fit on a single NUMA node.

For Example

* ESX host with 4 sockets, 4 cores per socket, and 64 GB of RAM
* NUMA nodes are 4 cores with 16 GB of RAM (1 socket and local memory)
* Recommended maximum VM container is 4 vCPU with 16GB of RAM
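
A small Python sketch of the sizing rule from the example above (a VM container should fit within a single NUMA node); the assumption of one NUMA node per socket with an even share of RAM matches the example but is not universal:

def numa_node_size(sockets, cores_per_socket, total_ram_gb):
    """Assume one NUMA node per socket with an even share of the host RAM."""
    return cores_per_socket, total_ram_gb / sockets

def fits_numa_node(vm_vcpus, vm_ram_gb, sockets, cores_per_socket, total_ram_gb):
    """True if the VM container fits inside one NUMA node (no remote memory)."""
    node_cores, node_ram_gb = numa_node_size(sockets, cores_per_socket, total_ram_gb)
    return vm_vcpus <= node_cores and vm_ram_gb <= node_ram_gb

# The host from the example: 4 sockets x 4 cores, 64 GB RAM -> node = 4 cores / 16 GB.
print(fits_numa_node(4, 16, sockets=4, cores_per_socket=4, total_ram_gb=64))   # True
print(fits_numa_node(8, 32, sockets=4, cores_per_socket=4, total_ram_gb=64))   # False -> NUMA crosstalk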

Source:
http://wiki.zimbra.com/wiki/Performance_Recommendations_for_Virtualizing_Zimbra_with_VMware_vSphere_4

"After all, access to remote memory may take three to five times as long as access to local memory even in a well designed NUMA system."

Source:
NUMA: Theory and Practice
http://practical-tech.com/infrastructure/numa-theory-and-practice/

Turning on NUMA merely makes the kernel aware of the topology and lets it optimize its own memory placement. If applications want to have any useful performance then they either need to use the NUMA library for requesting memory, or the administrator needs to give them an explicit NUMA policy using the numactl tool.

If you make use of numactl in RHEL-4 you'll get very good benefits from NUMA. If you don't use numactl, then you probably are best off using interleaving, because that will give predictable/consistent performance.

NUMA wins: 1) if your application can run within only one NUMA node's worth of memory, use the default scheduler or use numactl to bind; 2) for multi-process applications with small memory footprints; 3) if your application is aware of the Linux NUMA APIs (DB2 and a few others).

Interleave is better: 1) if memory usage greatly exceeds one NUMA node's worth of memory (i.e. large database caches etc.); 2) if the Linux pagecache is not NUMAized and there is heavy file I/O (not direct I/O) with large files; 3) if you analyze numastat output in RHEL4 and your numa_miss is greater than numa_hit by a factor of 10.

Source:
RHEL 4 and Intel vs. Opteron performance
http://markmail.org/message/vfwjd7rysqt2jahb

Both IBM and AMD NUMA systems are supported in ESX 3.0. AMD Opteron-based systems, such as the HP ProLiant DL585 Server, can be configured as NUMA architecture. The BIOS setting for node interleaving determines whether the system behaves more like a NUMA system, or more like a Uniform Memory Architecture (UMA) system. For more information, see the HP ProLiant DL585 Server Technology manual.

If node interleaving is disabled ESX Server detects the system as NUMA and applies NUMA optimizations. If you enable node interleaving (also known as interleaved memory), ESX Server does not detect the system as NUMA. The intelligent, adaptive NUMA scheduling and memory placement policies in VMware ESX Server 3 can manage all virtual machines transparently, so that administrators do not need to deal with the complexity of balancing virtual machines between nodes by hand. However, manual override controls are also available and advanced administrators may prefer to control the memory placement (through the Memory Affinity option) and processor utilization (through the Only Use Processors option) by hand. This may be useful, for example, if a virtual machine runs a memory-intensive workload, such as an in-memory database or a scientific computing application with a large data set. Such an application may see performance improvements if 100% of its memory is allocated locally, while virtual machines managed by the automatic NUMA optimizations often have a small percentage (5-15%) of their memory located remotely. An administrator may also wish to optimize NUMA placements manually if the system workload is known to be simple and unchanging; for example, an eight processor system running eight virtual machines with similar workloads would be easy to optimize by hand. Keep in mind if you manually set the processor or memory affinity for a virtual machine, the NUMA scheduler may not be able to automatically manage this virtual machine.

Please be aware that ESX NUMA scheduling and related optimizations are enabled only on dual-core systems with a total of at least four cores. On such ESX NUMA-scheduling-enabled systems, for ideal performance ensure that for each virtual machine # of VCPUs + 1 <= # of cores per node. Virtual machines that are not managed automatically by the NUMA scheduler (single core and/or less than a total of four cores) still run fine; they simply don't benefit from ESX Server's NUMA optimizations.

Source:
Performance Tuning Best Practices for ESX Server 3
http://www.vmware.com/pdf/vi_performance_tuning.pdf

More sources:

Optimizing Software Applications for NUMA - Intel® Software Network
http://software.intel.com/en-us/articles/optimizing-software-applications-for-numa/

IMPACT OF NUMA EFFECTS ON HIGH-SPEED NETWORKING WITH MULTI-OPTERON MACHINES
http://hal.archives-ouvertes.fr/docs/00/17/57/47/PDF/PDCS07.pdf

Calling It: NUMA Will Be The Shizzle In Two Years - Network Computing
http://www.networkcomputing.com/virtualization/calling-it-numa-will-be-the-shizzle-in-two-years.php

Local and Remote Memory: Memory in a Linux/ NUMA System
http://kernel.org/pub/linux/kernel/people/christoph/pmig/numamemory.pdf

VMware vSphere NUMA Imbalance Error when Upgrading from ESX 3.5 to vSphere 4
Source: http://blog.lewan.com/2010/05/25/vmware-vsphere-numa-imbalance-error-when-upgrading-from-esx-3-5-to-vsphere-4/

Saturday, June 19, 2010

Check swap

If an ESX server performance problem occurs, the first thing to check in esxtop is the parameter
SWR/s (J) = If larger than zero the ESX host is actively reading from swap(vswp).

Monday, May 31, 2010

F5 offers Long Distance VMotion

A nice video about the Long Distance VMotion solution from F5 for VMware environments.



Introducing: Long Distance VMotion with VMWare
http://devcentral.f5.com/weblogs/nojan/archive/2010/02/02/introducing-long-distance-vmotion-with-vmware.aspx

Gartner says: VMware is the clear virtualization leader

The latest Gartner Magic Quadrant for virtualization says:
“VMware stands alone as a leader in this Magic Quadrant”


“VMware is clearly ahead in”:
Understanding the market
Product strategy
Business model
Technology innovation, Product capabilities
Sales execution

“VMware Strengths”:
Far-reaching virtualization strategy enabling cloud computing, new application architectures and broader management
Technology leadership and innovation
High customer satisfaction
Large installed base (especially Global 2000), and rapid growth of service providers planning to use VMware (vCloud)

How snapshots work and how to troubleshoot problems when using them

A great online guide to troubleshooting problems related to the use of snapshots.

Source: http://geosub.es/vmutils/Troubleshooting.Virtual.Machine.snapshot.problems/Troubleshooting.Virtual.Machine.snapshot.problems.html

Cisco UCS - an interactive view of the platform

An interesting link to an interactive Cisco page where you can see what the Cisco UCS platform physically looks like.

Source: http://www.cisco.com/en/US/prod/ps10265/ps10279/ucs_kaon_model_preso.html

Thursday, May 27, 2010

Paravirtualized SCSI adapter

"VMware’s new paravirtualized SCSI adapter (pvSCSI) offered 12% improvement in throughput at 18% less CPU cost compared to LSI virtual adapter"

Source: http://blogs.vmware.com/performance/2009/05/350000-io-operations-per-second-one-vsphere-host-with-30-efds.html

VMFS resignaturing

A VMware document about signature changes for snapshots and replication of LUN volumes.

Source:
VMware VMFS Volume Management: http://www.vmware.com/files/pdf/vmfs_resig.pdf

Fibre Channel Zoning

Excellent links about Fibre Channel zoning

Sources:

Single initiator zoning http://www.yellow-bricks.com/2008/10/28/single-initiator-zoning/

Tech Target Fibre zoning http://searchstorage.techtarget.com/tip/1,289483,sid5_gci881375,00.html

Storage Networking 101: Understanding Fibre Channel Zones http://www.enterprisenetworkingplanet.com/netsp/article.php/3695836

How to rename a virtual machine's folder and files?

To rename the folder and files to the name you have defined in vCenter, perform a Storage VMotion. At the destination, the folder and file names will be changed to match the VM name from vCenter.

A storage consolidation test in a virtualized environment using the vscsiStats tool

DVDStore version 2.0 is an online e-commerce test application with a backend database component, and a client program to generate workload. We used the largest dataset option for DVDStore (100 GB), which includes 200 million customers, 10 million orders/month and 1 million products. The server ran in a RHEL4-U4 64 bit VM with 4 CPUs, 32 GB of memory and a storage backend of 5 disk RAID 5 configuration.



I also recommend looking at the NetApp Virtualization Data Collection Tool.

Source:
Storage Workload Characterization and Consolidation in Virtualized Environments http://communities.vmware.com/docs/DOC-10104

Storage performance analysis using the VMware vscsiStats utility

esxtop is a great tool for performance analysis of all types. However, with only latency and throughput statistics, esxtop will not provide the full picture of the storage profile. Furthermore, esxtop only provides latency numbers for Fibre Channel and iSCSI storage. Latency analysis of NFS traffic is not possible with esxtop.

Since ESX 3.5, VMware has provided a tool specifically for profiling storage: vscsiStats. vscsiStats collects and reports counters on storage activity. Its data is collected at the virtual SCSI device level in the kernel. This means that results are reported per VMDK (or RDM) irrespective of the underlying storage protocol. The following data are reported in histogram form:

* IO size
* Seek distance
* Outstanding IOs
* Latency (in microseconds)
* More!



Source:
Using vscsiStats for Storage Performance Analysis http://communities.vmware.com/docs/DOC-10095

What to watch out for when designing storage

Things that affect scalability

Throughput

* Fibre Channel link speed
* Number of outstanding I/O requests
* Number of disk spindles
* RAID type
* SCSI reservations
* Caching or prefetching algorithms

Latency

* Queue depth or capacity at various levels
* I/O request size
* Disk properties such as rotational, seek, and access delays
* SCSI reservations
* Caching or prefetching algorithms.

Factors affecting scalability of ESX storage

Number of active commands

* SCSI device drivers have a configurable parameter called LUN queue depth which determines how many commands can be active on a given LUN at any one time.
* QLogic Fibre Channel HBAs support up to 256 outstanding commands, Emulex up to 128
* The default value in ESX is set to 32 for both
* Any excess commands are queued in the VMkernel, which increases latency
* When VMs share a LUN, the total number of outstanding commands permitted from all VMs to that LUN is governed by Disk.SchedNumReqOutstanding. If this is exceeded, commands will be queued in the VMkernel. The maximum recommended figure is 64. For LUNs with a single VM, this figure is inapplicable, and the HBA queue depth is used.
* Disk.SchedNumReqOutstanding should be the same value as the LUN queue depth.
* n = Maximum outstanding I/O recommended for the array per LUN (this figure should be obtained with help from the storage vendor)
* a = Average active SCSI commands per VM to the shared VMFS
* d = LUN queue depth on each ESX host
* Max number of VMs per ESX host on a shared VMFS = d/a
* Max number of VMs on a shared VMFS = n/a
* To establish these numbers, look at QSTATS in esxtop, and add active commands to queued commands to get the total number of outstanding commands (see the sketch below).
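
The d/a and n/a formulas above, as a runnable sketch; the input numbers are examples only, and n must come from the storage vendor:

def vms_per_host_on_shared_vmfs(lun_queue_depth, avg_active_cmds_per_vm):
    """d / a: VMs one ESX host can drive on the LUN without VMkernel queuing."""
    return lun_queue_depth // avg_active_cmds_per_vm

def vms_total_on_shared_vmfs(array_max_outstanding_per_lun, avg_active_cmds_per_vm):
    """n / a: total VMs on the shared VMFS the array LUN can sustain."""
    return array_max_outstanding_per_lun // avg_active_cmds_per_vm

d, a, n = 32, 4, 256        # example: default LUN queue depth, 4 active cmds/VM, vendor figure
print(vms_per_host_on_shared_vmfs(d, a))   # 8 VMs per host
print(vms_total_on_shared_vmfs(n, a))      # 64 VMs on the datastore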

SCSI Reservations

* Reservations are created by creating/deleting virtual disks, extending a VMFS volume, and creating/deleting snapshots. All of these result in metadata updates to the file system using locks.
* The recommendation is to minimise these activities during the working day.
* Perform these tasks on the same ESX host that hosts the I/O-intensive VMs: since that host issues the SCSI reservations, it sees no reservation conflicts, because it is the one generating them. I/O-intensive VMs on other hosts will be affected for the duration of the task.
* Limit the use of snapshots. It is not recommended to run many virtual machines from multiple servers that are using virtual disk snapshots on the same VMFS. Snapshot files grow in 16MB chunks, so for VMDKs with lots of changes this file will grow quickly, and for every 16MB chunk that the file grows by, you will get a SCSI reservation (see the sketch below).
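
A back-of-the-envelope sketch of the last point (roughly one reservation per 16 MB of snapshot growth); the numbers are illustrative:

def snapshot_reservations(delta_written_gb, chunk_mb=16):
    """Rough count of SCSI reservations caused by snapshot file growth."""
    return int(delta_written_gb * 1024 // chunk_mb)

# A VM writing 20 GB of changes while a snapshot is open:
print(snapshot_reservations(20))   # ~1280 lock/reservation events on the VMFS volume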

Source:
Andy Troup, employed by VMware as a Senior Consultant and the EMEA Strategy & Operations Practice Lead: http://virtuallyandy.blogspot.com/2009/03/storage-best-practice.html
Scalable Storage Performance http://www.vmware.com/files/pdf/scalable_storage_performance.pdf

Friday, May 21, 2010

Memory management in VMware® ESX™ Server

In order to quickly monitor virtual machine memory usage, the VMware vSphere™ Client exposes two memory statistics in the resource summary: Consumed Host Memory and Active Guest Memory.



Consumed Host Memory usage is defined as the amount of host memory that is allocated to the virtual machine, while Active Guest Memory is defined as the amount of guest memory that is currently being used by the guest operating system and its applications.
These two statistics are quite useful for analyzing the memory status of the virtual machine and providing hints to address potential performance issues.

This article helps answer these questions:
• Why is the Consumed Host Memory so high?
• Why is the Consumed Host Memory usage sometimes much larger than the Active Guest Memory?
• Why is the Active Guest Memory different from what is seen inside the guest operating system?

Terminology

The following terminology is used throughout this paper.
• Host physical memory refers to the memory that is visible to the hypervisor as available on the system.
• Guest physical memory refers to the memory that is visible to the guest operating system running in the virtual machine.
• Guest virtual memory refers to a continuous virtual address space presented by the guest operating system to applications. It is the memory that is visible to the applications running inside the virtual machine.
• Guest physical memory is backed by host physical memory, which means the hypervisor provides a mapping from the guest to the host memory.
• The memory transfer between the guest physical memory and the guest swap device is referred to as guest level paging and is driven by the guest operating system. The memory transfer between guest physical memory and the host swap device is referred
to as hypervisor swapping, which is driven by the hypervisor.

Memory Virtualization Basics

Virtual memory is a well-known technique used in most general-purpose operating systems, and almost all modern processors have hardware to support it. Virtual memory creates a uniform virtual address space for applications and allows the operating system and hardware to handle the address translation between the virtual address space and the physical address space. This technique not only
simplifies the programmer’s work, but also adapts the execution environment to support large address spaces, process protection, file mapping, and swapping in modern computer systems.
When running a virtual machine, the hypervisor creates a contiguous addressable memory space for the virtual machine. This memory space has the same properties as the virtual address space presented to the applications by the guest operating system. This allows the hypervisor to run multiple virtual machines simultaneously while protecting the memory of each virtual machine from being accessed by others. Therefore, from the view of the application running inside the virtual machine, the hypervisor adds an extra level of address translation that maps the guest physical address to the host physical address. As a result, there are three virtual
memory layers in ESX: guest virtual memory, guest physical memory, and host physical memory. Their relationships are illustrated in Figure 2 (a).



As shown in Figure 2 (b), in ESX, the address translation between guest physical memory and host physical memory is maintained by the hypervisor using a physical memory mapping data structure, or pmap, for each virtual machine. The hypervisor intercepts all virtual machine instructions that manipulate the hardware translation lookaside buffer (TLB) contents or guest operating system page tables, which contain the virtual to physical address mapping. The actual hardware TLB state is updated based on the separate shadow page tables, which contain the guest virtual to host physical address mapping. The shadow page tables maintain consistency with the guest virtual to guest physical address mapping in the guest page tables and the guest physical to host physical address mapping in the pmap data structure. This approach removes the virtualization overhead for the virtual machine’s normal memory accesses because the hardware TLB will cache the direct guest virtual to host physical memory address translations read from the shadow page tables. Note that the extra level of guest physical to host physical memory indirection is extremely powerful in the virtualization environment. For example, ESX can easily remap a virtual machine’s host physical memory to files or other devices in a manner that is completely transparent to the virtual machine.

Recently, some new generation CPUs, such as third generation AMD Opteron and Intel Xeon 5500 series processors, have provided hardware support for memory virtualization by using two layers of page tables in hardware. One layer stores the guest virtual to guest physical memory address translation, and the other layer stores the guest physical to host physical memory address translation. These two page tables are synchronized using processor hardware. Hardware support memory virtualization eliminates the overhead required to keep shadow page tables in synchronization with guest page tables in software memory virtualization.
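
A toy Python sketch of the two translation layers just described: the guest page tables map guest virtual to guest physical pages, the hypervisor's pmap maps guest physical to host physical pages, and the shadow page tables (or hardware EPT/NPT) cache the composed mapping. The page numbers are made up for illustration:

# Toy model of the two address translation layers described above (conceptual only).
guest_page_table = {0x10: 0x2A, 0x11: 0x2B}    # guest virtual page  -> guest physical page
pmap             = {0x2A: 0x90, 0x2B: 0x91}    # guest physical page -> host physical page

def translate(guest_virtual_page):
    guest_physical = guest_page_table[guest_virtual_page]   # maintained by the guest OS
    host_physical = pmap[guest_physical]                     # maintained by the hypervisor
    return host_physical

# A shadow page table (or hardware EPT/NPT) caches the composed mapping,
# so normal memory accesses skip the two-step walk:
shadow = {gv: pmap[gp] for gv, gp in guest_page_table.items()}

print(hex(translate(0x10)))   # 0x90 via the two-level walk
print(hex(shadow[0x10]))      # 0x90 directly from the composed (shadow) mapping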

Although the hypervisor cannot reclaim host memory when the operating system frees guest physical memory, this does not mean
that the host memory, no matter how large it is, will be used up by a virtual machine when the virtual machine repeatedly allocates and frees memory. This is because the hypervisor does not allocate host physical memory on every virtual machine’s memory allocation.
It only allocates host physical memory when the virtual machine touches the physical memory that it has never touched before. If a virtual machine frequently allocates and frees memory, presumably the same guest physical memory is being allocated and freed again and again. Therefore, the hypervisor just allocates host physical memory for the first memory allocation and then the guest reuses the same host physical memory for the rest of allocations. That is, if a virtual machine’s entire guest physical memory (configured memory) has been backed by the host physical memory, the hypervisor does not need to allocate any host physical memory for this virtual machine any more.

Memory Reclamation in ESX

ESX supports memory overcommitment from the very first version, due to two important benefits it provides:

- with memory overcommitment, ESX ensures that host memory is consumed by active guest memory as much as possible

- With memory overcommitment, each virtual machine has a smaller footprint in host memory usage, making it possible to fit more virtual machines on the host while still achieving good performance

Wednesday, May 19, 2010

Types of virtual network adapters

vNIC Types on ESX

Four basic vNICs:

Two emulated types:
Vlance – emulation of a very old physical AMD network device (PCnet32 Linux drivers) – emulation of a real physical device

E1000 – emulation of a real physical Intel card

Reason – the OS drivers are on the OS install CD


The others are VMware devices – designed for virtualization rather than emulating a physical device
Vmxnet2 / enhanced vmxnet2 (ESX 3.5)
Vmxnet3 (vSphere)

“Flexible” vNIC: morphable in Windows and Linux VMs
A combination of two devices in one:
virtual HW version 4 (ESX 3.x): vlance + vmxnet2
virtual HW version 7 (ESX 4.0): vlance + enhanced vmxnet2
Operates as vlance initially, but “morphs” into vmxnet2/enhanced vmxnet2 if VMware Tools is installed.

vNIC Features



Vlance – no features, very old!
Vmxnet3 – all features – highest performance and flexibility; RSS multiplexes traffic to multiple vCPUs in Windows 2k8
In the middle of the feature range are vmxnet2 and enhanced vmxnet2 – TSO support and Jumbo Frames are the differences between them.

Notes:
vHW version 4: ESX 3.x, ESX 4.0
vHW version 7: ESX 4.0
* ESX 3.5 and later only


vNIC Selection on ESX

vmxnet3 gives the best overall performance today! Guest drivers for Linux, Windows, Solaris.

Avoid vlance if possible.
Install VMware Tools to morph it into vmxnet2/enhanced vmxnet2!

e1000 vNIC
A good compromise between performance and driver support on the Guest OS installation CD, and across Guest OS types; better usability during the install than vmxnet2/3. Performance is reasonable, but not as good as vmxnet3 or enhanced vmxnet2.

Why isn't vmxnet3 the default? E1000 is the default for most OSs because the vmxnet driver is currently not on the OS install CDs. That is why VMware recommends E1000 as the default.

For non-TCP traffic, if a larger Rx ring is needed:
the E1000 vNIC or vmxnet3 vNIC have larger default ring sizes, most of the time adjustable from the OS.
Larger default Rx ring sizes; size adjustable in most cases.

Conclusion

TCP vNIC traffic does very well.
Very high aggregate throughput and packet rates are achievable.
If your application predominantly uses TCP, you should not worry about the impact of vNIC networking unless you need many Gbps of throughput per vNIC.
Few workloads even come close to needing > 2 Gbps or > 200k pkts/s!

At higher data rates, UDP traffic may need a larger vNIC Rx ring.
A larger receive socket size may also be needed.
Depending on the packet rate, burst rate and tolerable loss rate, you may need to watch CPU and memory over-commitment levels.

Low jitter and very low latency requirements:
this is where work is ongoing.
Early recommendation: use EPT or NPT support on processors (AMD calls it RVI).

A lot of the time people don't have a real application with such load demands! Nothing to worry about! :)

MTU size – when changing to Jumbo Frames, make sure that JF are also set on the switches! Otherwise it can cause problems – for example, you can ping but no transfers go through.
Only vmxnet3 and enhanced vmxnet2 support Jumbo Frames; not E1000 – the physical E1000 supports Jumbo Frames, but this is not implemented in the virtual E1000 yet.

Source: Virtual Network Performance
http://www.vmworld.com/docs/DOC-3875

10Gb virtual networking performance for virtualized Windows and Linux machines

vNIC Networking

Existing performance improvement techniques:

Minimize copying during Tx
Make use of TSO (TCP Segmentation Offload)
Moderate virtual interrupt rate, heuristic
Generally reduce the number of “VMExits”
NetQueue for scaling with multiple vNICs
Limited use of LRO (for Guest OS that supports it)

VMDirectPath technology

Direct VM access to device hardware
FPT – Fixed Passthrough in ESX 4.0, not VMotionable

Ways of Measuring Virtual Networking Performance

Metrics

Bandwidth
Packet rate, Particularly when packet sizes are small

Scaling within VM
Increase number of connections
Increase number of vCPUs

Scaling across VMs
Increase number of VM

Test Platform Systems:

ESX
2-socket, Quad-core Intel Xeon X5560 @ 2.80 GHz (Nehalem) system
Each core has L1 and 256KB L2 caches
Each socket has shared 8MB L3 cache
6 GB RAM (DDR 3 -1066 MHz)
pNIC: Intel 82598EB (Oplin) 10GigE, 8x PCIe
ESX 4.0

Other machine
2-socket Intel Xeon X5335 @ 2.66 GHz (Clovertown)
RHEL 5.1
Intel Oplin 10GigE NIC (ixgbe; version 1.3.16.1-lro): 8 RxQs, 1 TxQ
16GB RAM

Microbenchmark:
Netperf, 5 TCP connections

Single vNIC TCP Performance: Linux VM



Results with RHEL5 VM:

Test configs:
Spectrum of socket and message sizes

Tx and Rx both reach ~9 Gbps (~wire speed) with 64 kB or auto-tuned socket sizes
Rx bandwidth of 9+ Gbps => over 800k Rx pkts/s (std MTU size pkts)
Very small 8k socket size:

Latency bound
reaches ~2Gbps throughput

The number of vCPUs makes little difference in the micro-benchmark

Slight drop in Rx throughput going from 2 to 4 vCPUs due to cache effects
vSMP: additional CPU cycles for applications


Single vNIC TCP Performance: Windows VM


Results with Windows 2008 VM: (Enterprise Ed; SP1)

Very similar to Linux VM performance; key differences:

Windows Tx does not use auto-tuning
Rx throughput reaches peak of ~9Gbps with 2 vCPUs
Rx throughput higher than Linux at smaller socket sizes for vSMPs


TCP Throughput Scaling with # Connections


Results with Win2k8, 2-vCPU VM:

Large socket size runs:
Reach 9+ Gbps with very few connections (just a bit over 4)

Small socket size, moderate messages size:
- throughput continues to scale as number of sockets increase to 20
- Latency bound

Small socket, very small message size:
throughput flattens out, at close to 3Gbps for Rx, and close to 2Gbps for Tx


Multi-VM Scaling: RHEL5 UP VMs


VMs are UP, RHEL5 VMs

For large socket, 9+Gbps (wirespeed) sets the limit

Slight throughput increase going to 2 VMs
No throughput drop as more VMs are added

For small socket size, throughput scales as more VMs are added

For Tx, all the way through 8 VMs
For Rx, scaling flattens out after 4 to 5 VMs


In all cases, aggregate throughput exceeds 5 Gbps

No scalability limit because of virtualization – only the physical limit! :)


Multi-VM Scaling: Win2k8 UP VMs


VMs are UP, Win2k8 VMs

For large socket, 9+Gbps (wirespeed) sets the limit
Very similar to Linux VM case; differences:
Large socket size: slightly lower Rx throughput at single VM
Small socket, moderate message size (512): Rx scales extremely well, reaching 9+Gbps
Small socket sizes: Tx throughput somewhat lower than achieved with the RHEL5 VM

In all cases, aggregate throughput exceeds 4 Gbps

Key difference at the medium socket size (Rx 8K-512): Windows achieves higher throughput than Linux – Windows generates more acknowledgments


TSO’s Role in Tx Throughput


TSO plays a significant role in Netperf Tx microbenchmarking
Large TSOs (>25 kB avg size) with Linux and auto-tuning of the socket size
Beneficial even for small message and socket sizes on Linux when transmitting fast enough for aggregation

TSO is very beneficial in virtual networking
Zero-copy Tx + large TSO packets: amortizes network virtualization overhead across a lot of data

Motivates looking at packet rate as an additional performance metric

TSO kicks in when the socket is bigger. Windows shows the reverse effect due to its lack of auto-tuning; it needs bigger socket sizes than Linux.


Network Utilization of Sample Workloads



Very significant workloads generate only a modest amount of network traffic!

Exchange Server – LoadGen – a tool for Exchange benchmarking
TPC-C-like benchmark – similar to TPC-C – huge CPU load, many transactions

The point is the contrast with the high throughput of the previous tests: real application network throughput for Exchange is far lower!


Network Utilization of Sample Workloads (2)

SPECweb2005


SPECweb2005 3 modules:
Banking - SSL type of connection
E-Commerce- SSL and non SSL types of communication
Support – downloading patches, download manuals, etc.

People downloading: 2300, 3200, 2200

For support workload (highest network bandwidth workload)
Bandwidth usage highly skewed toward Tx bandwidth:
> 40 to 1 Tx to Rx bandwidth ratio
Tx traffic takes modest advantage of TSO (avg~3 x std MTU size)
Rx traffic has small pkts(avg~500 bytes) – mostly requests

Workload studies references:
Microsoft Exchange Server 2007 Performance on VMware vSphere™ 4, http://www.vmware.com/resources/techresources/10021
SPECweb2005 Performance on ESX Server 3.5, http://www.vmware.com/resources/techresources/1031

Source: Virtual Network Performance
http://www.vmworld.com/docs/DOC-3875

Tuesday, May 18, 2010

How to achieve 10+ Gbps file transfer throughput for SSH using virtualization - a case study

The test dealt with ftp, scp & rsync for data replication/distribution.
The question was: is it possible to get 10G out of the existing servers?
An initial 1G file transfer test showed 40 MB/s (320 Mb/s).
The 10G file transfer came in at 70 MB/s (560 Mb/s), which is not a 10x improvement.

It is generally understood that a 10G network does not guarantee 10G application throughput. Nevertheless, the question was whether the transfer could be done better.
The goal was to maximize native and, above all, virtualized throughput over 10GbE.
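
Translating the numbers above into link utilization, a quick sanity check in Python:

def mbytes_to_mbits(mb_per_s):
    return mb_per_s * 8                      # 1 byte = 8 bits

for link_gbps, file_mb_s in [(1, 40), (10, 70)]:
    mbit = mbytes_to_mbits(file_mb_s)
    utilization = mbit / (link_gbps * 1000) * 100
    print(f"{link_gbps} GbE: {file_mb_s} MB/s = {mbit} Mb/s (~{utilization:.0f}% of the link)")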


Test on physical hardware

The first part ran on physical hardware:
Xeon CPU X5560 @ 2.8 GHz (8 cores, 16 threads); SMT, NUMA, VT-x, VT-d, EIST, Turbo Enabled (default in BIOS); 24GB Memory; Intel 10GbE CX4 Server Adapter with VMDq

Applications used:
Netperf (common network micro-benchmark)
rsync (standard Linux file transfer utility)
OpenSSH
HPN-SSH (optimized version of OpenSSH) – developed at the Pittsburgh Supercomputing Center (http://www.psc.edu/)
bbcp (“bit-torrent-like” file transfer utility)
– Stanford SLAC National Accelerator Laboratory point-to-point network file copy application, similar to torrent technology (http://www.slac.stanford.edu/~abh/bbcp/)

Test OS: RHEL 5.3 64-bit

A ramdisk was used, not disk drives, to focus on network I/O rather than disk I/O.

What was transferred: a directory structure, part of a Linux repository: ~8G total, ~5000 files, variable file size, average file size ~1.6MB
The data collection tool was the Linux utility “sar”: gathering information about received/transmitted network traffic and CPU utilization.

Netperf test results:


Netperf revealed that the 10GbE card was plugged into the wrong slot. For optimal 10GbE usage, a PCIe Gen1 x8 slot must be used.
Netperf is an excellent utility for finding out whether your PCIe slot really runs at x8, so talk to your vendor about how the PCIe slots are wired!
Netperf showed full 10GbE throughput, but that is only a theoretical, synthetic test.
The customer expected throughput up to 600 MB/s, since they had already tested FTP, which delivered that performance.



Threads
SCP and RSYNC over SSH are applications developed years ago. Today it is possible to use multiple threads; SCP cannot use multiple threads.
SCP over SSH uses one active thread, RSYNC has two active threads.
On 10GbE, 16 threads are possible today.
SCP over the HPN-SSH protocol (Pittsburgh Supercomputing Center) was developed for long-distance high-performance links by enlarging the buffers; it uses 4 threads for crypto traffic.
Only a few Linux distributions ship HPN-SSH.
BBCP (BitTorrent-like, Stanford University) can split large transfers into parallel streams. The bulk transfer is not encrypted, only the handshake. With the other tools, the only option was to turn crypto off to increase performance.


Pri teste sa realizovalo zatazenie cez 8 streams.

Netperf iba chceckuje, SCP, RSYNC, BBCP robili realny transfer. Cielom bolo v testoch dostat sa na uroven vysledku NETPERF umeleho testu.

Test ukazal, ze kryptovanie vyrazne zvysuje CPU utilizaciu az na 90%!!! Bez crypto ide CPU na 50%. BBCP na 30%. Intel nabada vyvojarov aplikacnych nastrojovna pouzivanie viac threads. Podpora pre enkrypcne instrukcie je implementovana v novom CPU Intel Westmere napomoze k zlepseniu vykonu sietovej enkryptovanej prevadzky.

BIOS settings worth enabling: NUMA, SMT, Turbo - they do no harm and can help a little.
MultiQueue (VMDq queue virtualization) provides 16-32 queues for parallel tasking.
On RHEL 5.3 it can be enabled for RX; RHEL 6.0 will have it out of the box.
TX is currently limited to one queue in RHEL; SLES 11 RC supports multiqueue TX (see the sketch below for checking active queues on a Linux host).
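A minimal Python sketch for that check, assuming a Linux host and a driver that names its per-queue MSI-X vectors "<interface>-..." (e.g. "eth0-rx-0"); the interface name is a placeholder and naming conventions differ between drivers.

# Count the per-queue interrupt vectors a NIC has registered in
# /proc/interrupts, as a rough indication of how many queues are active.
IFACE = "eth0"   # placeholder interface name

def count_queue_vectors(iface, interrupts_path="/proc/interrupts"):
    count = 0
    with open(interrupts_path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[-1].startswith(iface + "-"):
                count += 1
    return count

print(f"{IFACE}: {count_queue_vectors(IFACE)} per-queue interrupt vectors")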

SCP over SSH: 1 active thread
RSYNC: only 2 active threads!
HPN-SSH: 4 crypto threads, but the MAC layer is a limitation, so in practice 2 crypto threads - roughly 3 of the 16 available threads are used

Multiple parallel streams are necessary to overcome the limits of the applications and tools and to reach maximum throughput (a minimal sketch of this approach follows below).
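A minimal Python sketch of this idea - several independent scp processes, each pushing one part of the directory tree as its own stream. The destination, source directories and stream count are placeholders, not the scripts used in the case study.

# Work around scp's single-thread limit by running several copies in parallel.
import subprocess
from concurrent.futures import ThreadPoolExecutor

REMOTE = "user@10.0.0.2:/data/"                       # placeholder destination
SUBDIRS = [f"/data/repo/part{i}" for i in range(8)]   # placeholder sources, 8 streams

def push(path):
    # Each scp invocation is one independent stream (-r = recursive copy).
    return subprocess.call(["scp", "-r", path, REMOTE])

with ThreadPoolExecutor(max_workers=len(SUBDIRS)) as pool:
    exit_codes = list(pool.map(push, SUBDIRS))

print("streams finished, exit codes:", exit_codes)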

A major limiting factor is raw cryptographic performance.


Virtualized Case


The change compared to the physical setup was that a single virtual machine (VM) can have at most 8 vCPUs.
With 8 vCPUs, throughput for SCP and RSYNC without encryption, as well as for BBCP, dropped from 9000 Mbit to 5800 Mbit. With cryptography enabled - classic SCP and RSYNC over HPN-SSH, or over SSH - the difference between physical and virtual is smaller. With crypto enabled for SCP over HPN-SSH, the 16-thread physical machine beat the 8-vCPU VM by about a third; a comparison of 8 physical CPUs against 8 vCPUs would have been much closer.



It is much better to run 8 VMs with 1 vCPU each; the loss compared to the physical machine is then smaller. With SCP over HPN-SSH the loss is only about a quarter, and that is against the 16-thread physical machine. In other words, several VMs achieve better throughput than one large VM. In some cases 2-vCPU machines do better than 1-vCPU machines.


VMDq - thread-based queuing - offloads the distribution of the traffic load from the hypervisor to the physical NIC.

The NetQueue and VMDq techniques really help only when multiple VMs are communicating.

VMDirectPath with multiple VMs attached to a single piece of I/O hardware - support will come with new network cards.

Jumbo frames were not used in the test; the focus was on the default network setup. Jumbo frames are aimed mainly at storage networks (iSCSI, NFS), whereas this was an SCP/RSYNC test. The team wanted to test the default, out-of-the-box configuration and is aware that tuning and tweaking requires dedicated people. There are many tuning options, but they wanted out-of-the-box technologies.
The assumption is that jumbo frames would help increase throughput by somewhere in the range of 0-20%.

Netperf proved to be a useful diagnostic tool, but it is not representative of real workloads. It is not enough to just run a benchmark; the real application has to be tested - for example SCP and RSYNC over SSH.
Tools should make use of multiple threads. Commonly used applications today show high idle values, i.e. they do not work efficiently, which stems from their historical design.
For low latency, Twinax (SFP) or optical cabling is recommended.

Summary:
The most important element when testing network technologies is the application itself, or rather the operating-system tool being used. The performance loss of the virtualized environment compared to the physical one is small. Hardware-level technologies help minimize the performance differences between physical and virtual deployments. In conclusion, the customer unequivocally recommends virtualizing.

Source: Achieving 10+ Gbps File Transfer Throughput Using Virtualization - End-User Case Study
http://www.vmworld.com/docs/DOC-3820

Use Hyper-Threading or not?

This question is about to come up again so I can see multiple posts coming in the future on it. Intel’s Nehalem Processor is adding HyperThreading back into the chips so you can expect more posts on this topic in the near future. I have not reviewed HT on Nehalem so I don’t know all of the changes that have been made to HT (if any). This is the position I have responded with in the past:

There are pros and cons to using HT in ESX.

Pros

* Better co-scheduling of SMP VM’s
o Hyperthreading provides more CPU contexts and because of this, SMP VM’s can be scheduled to run in scenarios which would not have enough CPU contexts without Hyperthreading.
* Typical applications see performance improvement in the 0-20% range (the same as non-virtualized workloads).

Cons

* Processor resources are shared with Hyperthreading enabled
o Processor resources are shared such as the L2 and L3 caches. This means that the two threads running on the same processor compete for the same resources if they both have high demand for them. This can, in turn, degrade performance.

All things considered, it is difficult to generalize the performance impact of Hyperthreading. It is highly dependent on the workload of the VM.

One additional point is that you can always utilize the CPU min and max values on a per-VM or Resource Pool basis to reserve certain amounts of CPU for your most critical workloads.

As with the majority of performance items I encounter, test, test, test. Try out the workloads and see what works the best on the hardware you have available.

Source: http://vmguy.com/wordpress/index.php/archives/362
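The per-VM CPU min and max values mentioned in the quote above correspond to the reservation and limit settings in vSphere. A minimal sketch of setting them with pyVmomi (VMware's Python SDK), used here purely as an illustration; the vCenter address, credentials, VM name and MHz values are placeholders.

# Set a CPU reservation and limit (in MHz) on a single VM via pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only - skips certificate checks
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "critical-vm01")

    # Reserve 2000 MHz for the VM and cap it at 4000 MHz.
    alloc = vim.ResourceAllocationInfo(reservation=2000, limit=4000)
    vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(cpuAllocation=alloc))
finally:
    Disconnect(si)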

10Gb Ethernet in a virtual machine

An article discussing the performance of the VMware paravirtualized network adapter vmxnet3 says:

Line Rate 10GigE

Howie Xu, Director of R&D for VMkernel IO remarked recently that after talking with a few customers, many are still unaware we can achieve line rate 10GigE performance on ESX 3.5. Read “10Gbps Networking Performance on ESX 3.5u1” posted on VMware’s network technology resources page.

The story only gets better with vSphere 4 and ESX 4 with the new Intel Nehalem processors. Initial tests from engineering show a staggering 30Gbps throughput.

Source: http://www.vadapt.com/2009/05/vmxnet3/

On the other hand, keep in mind that vmxnet3 is not supported with VMware Fault Tolerance:

http://kb.vmware.com/kb/1013757

VMware FT cannot be enabled on a virtual machine using either the VMXNET3 or PVSCSI devices; vCenter Server will simply report an error that the network interface or disk controller isn’t supported for VMware FT.

Source: http://blog.scottlowe.org/2009/07/05/another-reason-not-to-use-pvscsi-or-vmxnet3/
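As an illustration, a minimal pyVmomi sketch that flags the devices the KB quote above mentions (VMXNET3 NICs and PVSCSI controllers) before attempting to enable FT on a VM. The vm object is assumed to be a vim.VirtualMachine obtained the same way as in the pyVmomi sketch earlier.

# List devices on a VM that block VMware FT (VMXNET3 NICs, PVSCSI controllers).
from pyVmomi import vim

def ft_blocking_devices(vm):
    blockers = []
    for dev in vm.config.hardware.device:
        if isinstance(dev, (vim.vm.device.VirtualVmxnet3,
                            vim.vm.device.ParaVirtualSCSIController)):
            blockers.append(dev.deviceInfo.label)   # e.g. "Network adapter 1"
    return blockers

# Usage: print(ft_blocking_devices(vm)) - an empty list means no known blockers.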

A concise introduction to the VMware View VDI solution

A well-structured article about the VMware View VDI solution on the excellent portal www.brianmadden.com:

Source: http://www.brianmadden.com/blogs/guestbloggers/archive/2009/01/15/an-introduction-to-vmware-view-3-features-and-best-practices-part-1-of-3.aspx

Wednesday, May 5, 2010

Configuring VMDirectPath I/O

A beautiful and useful vSphere feature - VMDirectPath I/O.
It makes it possible to assign a hardware I/O device exclusively to a virtual machine.
Note that the prerequisite is hardware support for Intel Virtualization Technology for Directed I/O (VT-d) or AMD I/O Virtualization Technology (IOMMU).
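A minimal pyVmomi sketch that lists the PCI devices an ESX host reports as passthrough-capable; the host object is assumed to be a vim.HostSystem obtained the same way as in the earlier pyVmomi sketch.

# List PCI devices that the host reports as capable of VMDirectPath passthrough.
from pyVmomi import vim

def passthrough_capable_devices(host):
    names = {d.id: f"{d.vendorName} {d.deviceName}" for d in host.hardware.pciDevice}
    capable = []
    for info in host.config.pciPassthruInfo or []:
        if info.passthruCapable:
            capable.append((info.id, names.get(info.id, "unknown device"),
                            info.passthruEnabled))
    return capable

# Usage: for dev_id, name, enabled in passthrough_capable_devices(host):
#            print(dev_id, name, "enabled" if enabled else "capable but disabled")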

Source: http://www.youtube.com/watch?v=jmQ5Ej8r-aA

Configuration Examples and Troubleshooting for VMDirectPath
http://www.vmware.com/pdf/vsp_4_vmdirectpath_host.pdf

VMware VMDirectPath I/O
http://communities.vmware.com/docs/DOC-11089.pdf;jsessionid=8666C2B4AEFBCA0CE8ED9BF81C0FB70B

Wednesday, April 28, 2010

RSA Data Loss Prevention in combination with VMware and Cisco IronPort

Interesting links about RSA DLP:

Securing Sensitive Information – How MSIT uses ADRMS + RSA DLP
http://edge.technet.com/Media/Securing-Sensitive-Information--How-MSIT-uses-ADRMS--RSA-DLP/

VMware VMworld 2009: EMC RSA DLP integration with VMsafe, vShield Zones and Nexus 1000v Demo
http://www.youtube.com/watch?v=mL9e49MDeOk

VMware VMworld RSA DLP Demo
http://www.youtube.com/watch?v=Iz-m382NYiY

RSA DLP and Cisco IronPort Email Security Appliance
http://www.youtube.com/watch?v=9b3OzBw0jZo

RSA DLP and Cisco IronPort Web Security Appliance
http://www.youtube.com/watch?v=Qw_Fc66Y0AI

Security: Data Loss Prevention
http://www.youtube.com/watch?v=TwLC6aCNA2U