GigaSpaces at Supercomputing 2008, Nov 17-20, Austin, TX

Posted 10 November 2008 @ 5:45 am by Amnon Raviv

Want better ROI on your HPC investment? Want to move transactional, business-critical applications to your grid and reap the benefits of the “new HPC”?

Come visit GigaSpaces at the Sun booth in SC08, the international conference for high performance computing. Learn how our joint offerings help enterprises of all sizes run more complex simulations, forecasts and models on lower cost hardware, and deliver greater precision and higher quality decisions in near real time.

We will be demoing GigaSpaces eXtreme Application Platform (XAP) integrated with the latest version of Sun’s open source cluster management application, Sun Grid Engine 6.2. See how together we can help you efficiently share expensive compute resources, deliver dynamic provisioning of application workloads, and provide continuous high availability. 

So, make sure to stop by the Sun booth and look for the GigaSpaces folks. We will be at the booth on Monday (Nov 17) from 7pm to 9pm, and on Tue & Wed from 10am to 2pm. Or come and see our presentation at the Sun Theater on Wed at 4pm.

See you in Austin!



Scalable, Low Latency Web Tier on Amazon EC2

Posted 7 November 2008 @ 4:24 pm by Geva Perry

Shay Hassidim, deputy CTO at GigaSpaces, posted an impressive write-up of a benchmark the team ran on Amazon EC2. What's nice about it is that they took a standard web app, in this case the Spring PetClinic, and dropped it into the GigaSpaces container, achieving instant low-latency and scalability, with out-of-the-box load-balancing and fail-over. Extremely cool.

The other components in the app include standard and open source components: Jetty, MySQL, Apache load-balancer, JMeter and Ant.

Also, Shay posts a screen shot (I think it's the first-ever public one) of the new GigaSpaces cloud framework. Check it out:

See the full benchmark numbers on Shay's post. And you can sign up for a GigaSpaces pay-per-use EC2 license here.



Scaling the Web Layer – The Web Container Benchmark

Posted 7 November 2008 @ 1:07 am by Shay Hassidim

We have been working for the past few days on building a benchmark for our new web container and its built-in integration with the Apache load balancer.

The benchmark ran on the Amazon EC2/EBS cloud and was deployed using our new web-based cloud tools.
The goal of the benchmark is to measure the scalability of the web layer using an industry-standard web-based application: the Pet Clinic.

In this benchmark the data is stored in the GigaSpaces In-Memory Data Grid (IMDG) and persisted to a MySQL database asynchronously. We used GigaSpaces SLA containers (the "Service Grid") to host the relevant services when running in a cloud environment.
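To make the asynchronous persistence idea concrete, here is a minimal write-behind sketch in Python. It is not GigaSpaces code: the `WriteBehindStore` class and the plain dictionary standing in for MySQL are hypothetical, chosen only to show the pattern of answering from memory while a background thread persists the writes.

```python
import queue
import threading


class WriteBehindStore:
    """In-memory store that persists writes on a background thread."""

    def __init__(self, persist):
        self._data = {}                # stands in for the in-memory data grid
        self._pending = queue.Queue()  # writes waiting to reach the database
        self._persist = persist        # callable(key, value); stands in for a MySQL write
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, key, value):
        # The caller only waits for the in-memory update.
        self._data[key] = value
        self._pending.put((key, value))

    def read(self, key):
        return self._data.get(key)

    def flush(self):
        # Block until every queued write has been handed to the database.
        self._pending.join()

    def _drain(self):
        while True:
            key, value = self._pending.get()
            try:
                self._persist(key, value)
            finally:
                self._pending.task_done()


if __name__ == "__main__":
    database = {}  # hypothetical slow store
    store = WriteBehindStore(database.__setitem__)
    store.write("pet:1", {"name": "Rex"})
    print(store.read("pet:1"))  # served from memory right away
    store.flush()
    print(database["pet:1"])    # persisted by the background thread
```

The point of the pattern is that web requests never wait on the database; the trade-off is that a write is only durable once the background thread has caught up.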

The benchmark uses the following components:
- The standard Spring-based Pet Clinic application, ported to use a GigaSpaces DAO instead of the Hibernate DAO when accessing the data. See the GigaSpaces ported version here.
- JMeter - the popular benchmark and measurement tool.
- Ant - Ant has nice support for JMeter. It allows you, for example, to inject different numbers of users into a JMeter test plan and run it headless.
- GigaSpaces XAP 6.6
- Amazon EC2
- Apache 2.2 as a load-balancer running on the cloud
- GigaSpaces Apache 2.2 load-balancer agent running on the cloud
- Jetty web server running as a web container within the GigaSpaces Service Grid.
- GigaSpaces web based cloud tool.
- Sun MySQL database - running on the cloud using Amazon Elastic Block Store (EBS). The database is configured and started automatically once the AMI instance is started.
- GigaSpaces Amazon EC2 large AMI that includes JMeter, Ant, XAP 6.6, Apache, MySQL, Ganglia (and more), pre-configured, integrated and fully tuned.
- Result analyzer application that parses the results created by JMeter and produces a table of average response time vs. number of concurrent users.
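The result analyzer itself is not shown here, but the idea can be sketched in a few lines of Python. This sketch assumes the JMeter results were saved as CSV with a header row that includes the `elapsed` (response time in ms) and `allThreads` (active users) columns; the function names and the sample rows are made up for illustration and are not benchmark data.

```python
import csv
import io
from collections import defaultdict


def average_by_users(csv_file):
    """Return {concurrent_users: average_elapsed_ms} from JMeter-style CSV rows."""
    totals = defaultdict(lambda: [0, 0])  # users -> [sum of elapsed ms, sample count]
    for row in csv.DictReader(csv_file):
        bucket = totals[int(row["allThreads"])]
        bucket[0] += int(row["elapsed"])
        bucket[1] += 1
    return {users: total / count for users, (total, count) in sorted(totals.items())}


def format_table(averages):
    """Render the averages as a plain-text table, one row per user count."""
    lines = ["users  avg_ms"]
    for users, avg in averages.items():
        lines.append(f"{users:>5}  {avg:>6.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = io.StringIO(
        "timeStamp,elapsed,label,success,allThreads\n"
        "1,4,home,true,1\n"
        "2,6,home,true,1\n"
        "3,20,home,true,10\n"
        "4,30,home,true,10\n"
    )
    print(format_table(average_by_users(sample)))
```

Grouping by the active thread count is one simple way to get "response time vs. concurrent users"; a run saved in JMeter's XML format would need a different parser.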

As the list above shows, this is a unique mix of commodity and standard infrastructure running on the cloud, and it is used by the majority of our customers on EC2. We use JMeter to simulate heavy web traffic generated by a large number of end users. The traffic is generated in the same network as the web servers, the Apache load-balancer and the IMDG.

As you can see from the graph below, response time decreases as more web servers are added:

[Figure: Web Layer Scales]

Users browsing the Internet usually tolerate a latency of 500-1000 ms between mouse click and page download. From the graph above, if the response time for a single user over the LAN is about 5 ms, you could serve ~100 real web users without annoying them with the delay.
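The arithmetic behind the ~100 figure can be written out as a quick Python sketch. It rests on a simplifying assumption that is not measured here: each additional concurrent user adds roughly the single-user response time to the delay, so latency grows linearly with the number of users. The function name is made up for the example.

```python
def max_users(tolerable_ms, per_user_ms):
    """Users that fit in a latency budget if each one adds per_user_ms of delay."""
    return int(tolerable_ms // per_user_ms)


if __name__ == "__main__":
    # Lower and upper ends of the 500-1000 ms tolerance range, 5 ms per user.
    print(max_users(500, 5))
    print(max_users(1000, 5))
```

With the conservative 500 ms budget this gives 100 users; the 1000 ms end of the range would double that, before counting network latency and load-balancer overhead.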

Extrapolating from these numbers gives us the following graph:

[Figure: web layer scale2]

This means that with 3 web servers sharing the same IMDG you could serve 400 concurrent users with less than 1 sec latency. This assumes a network latency of 200 ms and some additional overhead when a large number of users hit the Apache load-balancer, web container and IMDG concurrently.

We also measured the maximum IMDG throughput with 1, 2 and 3 web servers. See the results below:

[Figure: IMDG TP]

As we can see, the load on the IMDG increases almost linearly as more web servers are added. This means the bottleneck is the web layer: scaling the web layer improves latency. We could have used a local cache or local view to eliminate the access to the IMDG, but the test focused on measuring web layer scalability, not on optimizing the data access layer or caching data within the web server runtime.

Now, a few words about the user experience and ease of use when faced with so many different components running in this new cloud environment: it was surprisingly simple, thanks to the new GigaSpaces web-based cloud deployment and management UI.

To run the benchmark, there is no need to have the application files on your local machine or to be familiar with the EC2 cloud API or management tools. All the application-related files have already been placed on the cloud (in an Amazon S3 repository). The user only needs to specify a few simple configuration choices in an XML document. The GigaSpaces web cloud deployment and management UI is then responsible for instantiating the AMIs, starting the relevant GigaSpaces containers and deploying the application, while controlling the entire life cycle of all the different application components.

See below a screenshot of the new GigaSpaces web UI cloud tools interface:

[Figure: cloud web ui]

Conclusion:

The above shows that scaling a web application composed of multiple components on EC2 is no longer a "far-fetched dream"; it is here today, at your fingertips!

If you would like to run this benchmark yourself, drop us a line at cloud@gigaspaces.com.

Shay



HelloWorld — a bonus readme

Posted 4 November 2008 @ 8:29 pm by Owen Taylor

When you download the GigaSpacesXAP product (currently at version 6.6.1) you get a few really nice examples.

With those examples come some very detailed instructions and explanations as to how to use GigaSpaces. These instructions are found in the "docs" subfolder under the example root and are written in rich HTML with beautiful screenshots and graphics to help guide the reader.

Example:

GS_HOME/examples/helloworld/docs

Despite the existence of these terrific resources, for some people who are new to the GigaSpaces experience, it might be helpful to have a set of instructions in the traditional readme.txt file. (I can be so old fashioned sometimes...)

I created the following in the hopes that it will prove useful and speed up the adoption of this wonderful application platform known as GigaSpacesXAP.

Here goes... (Imagine this readme is located in the root of the helloworld example like this:

GS_HOME/examples/helloworld/readme )

The Owen Taylor supplementary HelloWorld README for the GigaSpacesXAP6.6.1 helloworld example:


Hello and welcome to Space-based Computing!

This example shows the basic interaction between a processing unit (where objects are processed) and a simple client that feeds the processing unit with objects to be processed.

There are two ways to run this example: (provided as scripts in this directory)

The first way is simplest and shows the processing unit running in one process and the Feeder running in another. Neither of the processes is managed and no clustering technology is employed to provide fail-over or scaling.

To try out the example in this simple way read and follow the first set of instructions below.

[To try out the more complex topology, read and follow the second set of instructions below.]

_______________________________________
FIRST SET OF INSTRUCTIONS::
_______________________________________

This example does the following:
1) starts the processing unit containing the processor and a space
[this is where the work of processing information goes on]
2) Starts the client-side Feeder
[this populates the system with 1000 objects to be processed]

Instructions:
1) Ensure you have GigaSpacesXAP6.6 or higher
2) Ensure you have JDK1.5
3) Navigate to the directory containing this example readme file

Execute the following
> build.sh dist
It will:

a) Build the application and create the jar file versions of the deployment units

Next Execute:

> ../../bin/puInstance.sh ./processor/pu/hello-processor.jar

This will:

b) Start the space-side system which includes a processor and a space

Edit the build.xml file so that this line:

<target name="run-feeder" depends="dist">

Is changed to equal this:

<target name="run-feeder" depends="">

Next Execute:

> build.sh run-feeder

This will:

c) Start the client-side Feeder which populates the system with 1000 objects


If you want to use a GUI to examine what is running, execute:
> ../../bin/gs-ui.sh

If you want to feed more information into the system,
you can execute the

> build.sh run-feeder

more than once.

Note: If you wish to open the three Eclipse projects that make up the source for
this example in Eclipse, be aware that they use a variable GS_HOME that must be
configured in your Eclipse workspace. It should point to the install
directory/folder of GigaSpacesXAP6.6



_______________________________________
SECOND SET OF INSTRUCTIONS::
_______________________________________

This example does the following:

1) Starts the GigaSpacesXAP runtime environment (Service Grid)
2) Deploys the space-side system as a "cluster" which is split into 4 processing units:

2 partitions
(which divide the total work up between them)

2 dedicated backups
(one for each active partition)

3) Starts the client-side Feeder
[this populates the entire system with 1000 objects to be processed]

Instructions:
1) Ensure you have GigaSpacesXAP6.6 or higher
2) Ensure you have JDK1.5
3) Navigate to the directory containing this example readme file

Execute the following:
> build.sh dist
It will:

a) Build the application and create the jar file versions of the deployment units

Next Execute: (add piping to a log file if you like)

> ../../bin/gsc.sh &
> ../../bin/gsc.sh &
> ../../bin/gsm.sh &


This will:

b) Start the service grid, which is the GigaSpaces application server runtime environment

Next Execute:

> ../../bin/gs-ui.sh &

This will:
c) Start the gs-ui so you can see when the service grid is started

(Switch to the middle tab on the left-hand side of the GUI called “Deployments,Details” and you should see two boxes in the bottom left of the screen. Those are your empty GSC runtime containers.)


Edit the build.xml file so that this section:

<macrodef name="deploy">
    <attribute name="name"/>
    <sequential>
        <java classname="org.openspaces.pu.container.servicegrid.deploy.Deploy" fork="false">
            <classpath refid="all-libs"/>
            <arg value="-groups"/>
            <arg value="${groups}"/>
            <arg value="-timeout"/>
            <arg value="15000"/>
            <arg value="@{name}"/>
        </java>
    </sequential>
</macrodef>

Is changed to equal this:

<macrodef name="deploy">
    <attribute name="name"/>
    <sequential>
        <java classname="org.openspaces.pu.container.servicegrid.deploy.Deploy" fork="false">
            <classpath refid="all-libs"/>
            <arg value="-groups"/>
            <arg value="${groups}"/>
            <arg value="-timeout"/>
            <arg value="90000"/>
            <arg value="-locators"/>
            <arg value="localhost"/> <!-- assumes GSM is on same host as you-->
            <arg value="-cluster"/>
            <arg value="schema=partitioned-sync2backup"/>
            <arg value="total_members=2,1"/>
            <arg value="@{name}"/>
        </java>
    </sequential>
</macrodef>

This edit changes the topology of the application being deployed so that there will be 2 primary active instances and 2 backup instances deployed into the service grid.
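For readers who want to see how "2,1" turns into four instances, here is a small Python sketch. It reads `total_members=2,1` as "2 partitions, 1 backup each", which matches the description above. The function names are made up for the example, and the hash-modulo routing is a generic scheme shown only to illustrate how work can be divided between partitions; it is not taken from the GigaSpaces code.

```python
import zlib


def parse_total_members(value):
    """Split a 'partitions,backups-per-partition' string such as '2,1'."""
    partitions, backups = (int(part) for part in value.split(","))
    return partitions, backups


def instances(partitions, backups):
    """List every instance: one primary plus its backups for each partition."""
    names = []
    for p in range(1, partitions + 1):
        names.append(f"partition {p} primary")
        names.extend(f"partition {p} backup {b}" for b in range(1, backups + 1))
    return names


def partition_for(routing_key, partitions):
    """Map a key to a 1-based partition number with a stable hash (illustrative only)."""
    return zlib.crc32(routing_key.encode("utf-8")) % partitions + 1


if __name__ == "__main__":
    partitions, backups = parse_total_members("2,1")
    for name in instances(partitions, backups):
        print(name)
    for key in ("message-1", "message-2", "message-3"):
        print(key, "-> partition", partition_for(key, partitions))
```

Each object lands in exactly one partition, and that partition's backup holds a copy, which is why the deployment shows two 'p' nodes and two 'b' nodes.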

Next Execute:

> build.sh deploy-processor

This will deploy the newly re-defined 2 primary spaces and accompanying workers and their backup service instances into the service grid (you will see them appear in the gs-ui GUI)

Wait for all the nodes to appear. (There should be 2 nodes with a ‘p’ for primary and 2 nodes with a ‘b’ for backup)
This can take a while… (up to 2 minutes the first time depending on the machine and network)

Next Execute:

> build.sh run-feeder

This will:

d) Start the client-side Feeder which populates the system with 1000 objects


If you want to feed more information into the system,
you can execute the

> build.sh run-feeder

more than once.

Note: If you wish to open the three Eclipse projects that make up the source for
this example in Eclipse, be aware that they use a variable GS_HOME that must be
configured in your Eclipse workspace. It should point to the install
directory/folder of GigaSpacesXAP6.6
HTH Owen.



Cloud Computing. Literally.

Posted 3 November 2008 @ 3:19 pm by Geva Perry

Last week we made a very exciting announcement about Miwok Airways selecting GigaSpaces as the application server for running their reservation and pricing engine which will run on EC2. This is a great case study for cloud computing.

For one thing, you have to love the fact that it is cloud computing used for a business that literally runs in the clouds (the actual meteorological kind). Second, it is an on-demand compute infrastructure for a business that has an on-demand business model in the real world. A perfect fit.

There is a great piece in the LA Times that describes Miwok, but let me give you a brief description from the software application angle. 

The idea is that for so-called ultra-short flights (typically, less than 250 miles), as a traveler you have a terrible dilemma: use commercial airlines or drive your car. I don't need to tell you the hassle and costs involved in both options these days.

Miwok overcomes the hassles of these options by providing you with an on-demand "air taxi" service. You book your flight when you need it. So, say, you want to fly from Santa Monica to Orange County or Palm Springs. You go to the Miwok web site and say when and where, you get pricing and you can book the flight on the spot. The flight you are booking is for a private Cirrus SR22. You can park 100 feet from the airplane itself (at a local airport, not just the major ones) and you don't need to go through security (imagine that!). All of this at the same cost as a commercial flight.

[Image: Cirrus SR22]

But here's the part I really like: You can connect to other people via Miwok's own social network, or through a Facebook app (and others to come). As the Cirrus can seat 3 passengers, you can split the costs with other passengers who need to make the same trip. So the flight could end up significantly cheaper than a commercial airline.

Think about it: This is the exact opposite of the big airlines' pricing model, where the more people there are on a flight, the higher the price goes. From a marketing point of view, this has tremendous viral potential.

One of the biggest technology (technology as in software, not aviation) challenges Miwok was facing was developing an extremely sophisticated real-time pricing engine. It needs to take many parameters into account to offer you a price on the spot, including location, path, season, date, time of booking, number of passengers and several other criteria. It needs to be able to grow and shrink on-demand, especially because of the social networking and viral effect.

The architecture Miwok selected uses MySQL and Hibernate for the persistence layer, but the database is not used as the system of record for calculation and reservations. Instead they use GigaSpaces' in-memory data grid, which gives you in-memory speeds and can also grow and shrink dynamically in the EC2 environment. The benefit for Miwok is that, with very little advance knowledge of the traffic they will get and expecting extreme peaks and troughs in activity, they don't need to pre-plan and invest upfront in the infrastructure. They use GigaSpaces and EC2 and will only pay for hardware and software on a per-use basis -- when and if they actually need it.

They also use GigaSpaces XAP (which includes the in-memory data grid) as the container for the business logic, written in Java, and as a bus for integrating the various underlying services involved in generating pricing and booking reservations.

In short, on-demand application scalability for an on-demand air travel service.

Check out Miwok's web site.

Sign up for the GigaSpaces pay-per-use license for Amazon EC2.


