GigaSpaces’ upcoming cloud framework
Posted 5 September 2008 @ 10:43 pm by Dekel Tankel
Recently I had the pleasure of giving a demo to one of our partners using the new “cloud framework”.
This framework introduces three key concepts that we think are essential to successfully developing, testing and deploying any scalable transactional application on the cloud:
“Desktop on the cloud”
View the cloud environment as if it were your own desktop. You can develop and unit-test using any familiar IDE and then deploy your application to the cloud for testing with zero code changes. In addition, the distributed environment in which you deploy and run your code does not affect your business logic. At GigaSpaces, we call it “write once, scale anywhere”.
“One-Click cloud”
Once your application is unit-tested and running on your desktop, you can seamlessly transition and scale it on a large number of Amazon Machine Images (AMIs) without any configuration, administration or application changes. This is a key concept behind the idea of “pay as you go” in the cloud. As I wrote in a previous post, it’s great that Amazon charges $0.40 an hour for a large machine, but if it takes 4 weeks to deploy my application to this environment, it is no longer that attractive. This framework really makes the above a practical reality.
Scale on demand
I have discussed the missing link between on-demand hardware (like EC2) and on-demand scalability in a recent post. In this demo I deploy a market-data application with one click; then, with another click from my laptop, I add more AMIs and the application simply scales (by increasing the number of stock symbols being processed)!
I have to take my hat off (I bought one especially for that…) to our amazing development team. With our new cloud framework, you can take your “mini application” (we call it a “processing unit”) and literally deploy it to any number of AMIs in one click. Behind the scenes, the framework launches the AMIs, starts the GigaSpaces containers (GSCs) and deploys your code to the GigaSpaces-EC2 cluster.
It even starts the GigaSpaces Graphical Management Center in the cloud and automatically connects your desktop to it from any internet browser.
All of this magic requires only one simple “unzip” of the cloud package (a few megabytes) and a set of AWS user credentials. The entire GigaSpaces runtime is pre-configured on our AMIs.
The above is the slide I’m using for the demo, which I hope helps clarify what I wrote... a picture is worth a thousand words, right?
This cool stuff is already available as a private beta, and before our formal release we plan to add other exciting features such as support for the recently announced Elastic Block Store (EBS) using our Persistency as a Service model, support for MySQL, monitoring of AMI status and more...
If you want access to the private beta, please contact us at ec2@gigaspaces.com
Cheers
Dekel
GigaSpaces & Maven Setup: a mini-recipe
Posted 28 August 2008 @ 10:28 am by Owen Taylor
When trying out the bundled Maven tool that now ships with GigaSpaces, I found it friendly to my sensibilities to create a couple of new scripts:
I placed both of these in the GS_HOME/tools/maven directory:
- The first one sets up the necessary paths before calling the installmavenrep.bat that comes with GigaSpaces:
mavensetup.cmd
set M2_HOME=%~dp0apache-maven-2.0.9
set PATH=%PATH%;%M2_HOME%\bin
call installmavenrep.bat
- The second one starts up a new shell with the paths set so that maven commands will function as expected by the GigaSpaces documentation:
mavenshell.cmd
set M2_HOME=%~dp0apache-maven-2.0.9
set GS_HOME=%~dp0..\..
set PATH=%PATH%;%M2_HOME%\bin;%GS_HOME%\bin
start "MAVEN SHELL"
Once in my “MAVEN SHELL”, I can create a new application that uses the mirror service by executing:
C:\gigaspaces-xap-6.6.0-m3\tools\maven>mvn os:create -DgroupId=com.test.mirror -DartifactId=MyFirstMirrorApp -Dtemplate=basic-async-persistency
With the project created (and quickly too!), I now add another script to the mix:
- This script lives in the root directory of the newly generated project.
In this example that would be the MyFirstMirrorApp directory found on my machine here:
C:\gigaspaces-xap-6.6.0-m3\tools\maven\MyFirstMirrorApp>
- The purpose of this script is to start up each of the aspects of the project in the proper order and demonstrate that it all works:
(Note: I execute this from within my “MAVEN SHELL” after CD’ing into the MyFirstMirrorApp directory)
runmyproject.cmd
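rem note: you must set up mvn (see mavenshell.cmd) before trying to run this project in this way...
rem start the mirror service first
start mvn compile os:run -Dmodule=mirror
echo sleep
rem wait about 60 seconds (61 loopback pings, roughly one per second)
ping 127.0.0.1 -n 61 > NUL
rem then start the processor cluster
start mvn compile os:run -Dcluster="total_members=2,1" -Dmodule=processor
echo sleep
ping 127.0.0.1 -n 61 > NUL
rem finally start the feeder
start mvn compile os:run -Dmodule=feeder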
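For readers who are not on Windows, the same start-up order can be sketched in Python. This is only an illustration under assumptions, not part of the GigaSpaces tooling: the names STARTUP_PLAN, build_command and start_all are hypothetical, while the module names, the cluster argument and the 60-second pause are taken from runmyproject.cmd above. On Windows, where mvn is a batch file, the launcher would probably need to be adapted (for example by pointing at mvn.bat).

```python
import subprocess
import time

# Start-up order and arguments copied from runmyproject.cmd.
# STARTUP_PLAN, build_command and start_all are hypothetical names.
STARTUP_PLAN = [
    ("mirror", []),
    ("processor", ["-Dcluster=total_members=2,1"]),
    ("feeder", []),
]


def build_command(module, extra_args):
    """Build the mvn argument list for one module of the project."""
    return ["mvn", "compile", "os:run", *extra_args, f"-Dmodule={module}"]


def start_all(plan=STARTUP_PLAN, pause_seconds=60,
              launch=subprocess.Popen, sleep=time.sleep):
    """Launch each module in order, pausing between consecutive launches."""
    processes = []
    for index, (module, extra_args) in enumerate(plan):
        if index > 0:
            sleep(pause_seconds)  # give the previous module time to come up
        processes.append(launch(build_command(module, extra_args)))
    return processes
```

Calling start_all() from the MyFirstMirrorApp directory would launch the three modules in the same order as the batch script. The launch and sleep parameters exist only so the ordering can be checked without actually starting Maven.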
Once everything is running, I can start the GigaSpaces management User Interface by executing the following from within my “MAVEN SHELL”:
>start gs-ui
Now comes the time for the final script of the day:
- I store this script in the tools/maven directory.
- This is once again executed from within the same “MAVEN SHELL” that I have been using up until now.
- This script starts up the HSQL Viewer and connects to the database being used by the mirror service to store data:
lookatmydb.cmd
mvn os:hsql-ui -Ddriver=org.hsqldb.jdbcDriver -Durl=jdbc:hsqldb:hsql://localhost/testDB
Executing the following query in that viewer shows the data stored there:
select * from PUBLIC.DATA
Edge High Performance Computing
Posted 27 August 2008 @ 3:57 pm by Owen Taylor
I ran across this excellent article on Edge HPC and by golly, it sounds like the need for both of the following is growing at a huge rate:
- Increasing the efficiency of the software solutions made to solve business problems
- Providing adaptability of software systems to changes such as increased load and failures
They call this new landscape “Edge High Performance Computing”, as opposed to “High Performance Computing”.
To my mind the major difference here between these classes of problems is the need to maintain state and the capacity to leverage less-exotic hardware and more “modern” and likely Object-Oriented software languages such as .NET, C++ and Java.
I very much like the quality of the Tabor Research work and look forward to future updates from them on these topics.
Owen.
Amazon found every 100ms of latency cost them 1% in sales.
Posted 27 August 2008 @ 8:41 am by Jim Liddle
The GigaSpaces CTO, Nati Shalom, dropped me a line recently pointing out a very interesting article on High Scalability. An excerpt from the article is below:
Latency matters. Amazon found every 100ms of latency cost them 1% in sales. Google found an extra .5 seconds in search page generation time dropped traffic by 20%. A broker could lose $4 million in revenues per millisecond if their electronic trading platform is 5 milliseconds behind the competition.
The Amazon results were reported by Greg Linden in his presentation Make Data Useful. In one of Greg’s slides Google VP Marissa Mayer, in reference to the Google results, is quoted as saying “Users really respond to speed.” And everyone wants responsive users. Ka-ching! People hate waiting and they’re repulsed by seemingly small delays.
The less interactive a site becomes the more likely users are to click away and do something else. Latency is the mother of interactivity. Though it’s possible through various UI techniques to make pages subjectively feel faster, slow sites generally lead to higher customer defection rates, which lead to lower conversion rates, which result in lower sales. Yet for some reason latency isn’t a topic talked about a lot for web apps. We talk a lot about building high-capacity sites, but very little about how to build low-latency sites. We apparently do so at the expense of our immortal bottom line.
I wondered if latency went to zero if sales would be infinite? But alas, as Dan Pritchett says, Latency Exists, Cope!. So we can’t hide the “latency problem” by appointing a Latency Czar to conduct a nice little war on latency. Instead, we need to learn how to minimize and manage latency. It turns out a lot of problems are better solved that way.
How do we recover that which is most meaningful–sales–and build low-latency systems?
Well, the answer to that question is “you choose products that are built to handle your latency requirements whilst still allowing you to support scale”. Again, it’s clear that tier-based architectures, with their mish-mash of separate cluster implementations, are not only state-bound at each tier but are also complex to manage and maintain. GigaSpaces has been talking about these things for a long time now, and it’s good to see more general debate and hard evidence of the effects of not building your systems this way.
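To get a feel for the Amazon figure quoted in the excerpt, here is a minimal sketch that extrapolates it linearly. The linear model, the function name estimated_sales_loss and the sales figure used in the example are all assumptions made for illustration; the article reports only the single “100ms costs 1%” data point.

```python
def estimated_sales_loss(added_latency_ms, annual_sales, loss_per_100ms=0.01):
    """Estimate lost sales, assuming each 100 ms of added latency costs a
    fixed fraction of sales (1% by default, per the Amazon figure).

    The linear extrapolation is an assumption; the result is capped at
    total sales so the estimate never exceeds what could be lost.
    """
    if added_latency_ms < 0:
        raise ValueError("added_latency_ms must be non-negative")
    loss = annual_sales * loss_per_100ms * (added_latency_ms / 100.0)
    return min(loss, annual_sales)


# Hypothetical example: a site with 50 million in annual sales
# that becomes 250 ms slower.
print(estimated_sales_loss(250, 50_000_000))
```

Under these assumptions a quarter of a second of extra latency would cost about 2.5% of sales, which is why the choice of architecture matters well before a site feels “slow”.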
Why pure caching or compute grids are not enough
Posted 24 August 2008 @ 6:22 am by Nati Shalom
I came across an interesting comment in our forum from one of our users:
I'm doing a PoC for use GigaSpaces in our applications, to have one complete solution, instead of using other distributed cache & computing. Also I'm hoping to use it to replace our relational DB (which mostly host tables that converted to Objects).
The reason why this comment caught my attention is because this fellow clearly understands the difference between having to integrate three different products and having an end-to-end solution.
This understanding is aligned with studies we conducted recently, in which we added a caching layer to a JBoss application server and measured the end-to-end latency and throughput improvements. What we found was that reducing the database access time with a cache didn't significantly improve the end-to-end throughput, because there was another bottleneck at the JMS layer.
This behavior is not related to any particular caching implementation. In fact we witnessed similar behavior with our own caching implementation. It was only when we integrated our messaging and caching that we started to see a meaningful impact on overall throughput and latency (see a more detailed analysis here and here). The same applies to parallel processing. What's the point of parallelizing your execution if at the end of the day all those parallel processes are going to hit a centralized database?
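The reasoning above can be illustrated with a small sketch. It assumes a simplified model in which a request passes through every tier in series, so end-to-end throughput is capped by the slowest tier, and it uses Amdahl's law for the parallel-processing point. All rates below are hypothetical numbers chosen for illustration, not measurements from our study, and the function names are made up for this example.

```python
def pipeline_throughput(stage_rates):
    """End-to-end throughput of a serial pipeline is capped by its slowest stage."""
    return min(stage_rates.values())


def parallel_speedup(workers, serial_fraction):
    """Amdahl's law: speedup when serial_fraction of the work cannot be
    parallelized (for example, time spent waiting on one central database)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Hypothetical rates in requests per second.
baseline = {"app_server": 2000, "jms": 800, "database": 700}
with_cache = dict(baseline, database=5000)   # the cache speeds up data access only
integrated = dict(with_cache, jms=4000)      # messaging and caching both improved

print(pipeline_throughput(baseline))     # limited by the database
print(pipeline_throughput(with_cache))   # now limited by JMS
print(pipeline_throughput(integrated))   # now limited by the app server

# 16 parallel workers, with 25% of each task serialized on a shared database.
print(round(parallel_speedup(16, 0.25), 2))
```

With these made-up numbers the cache alone lifts throughput only from 700 to 800 requests per second, because the JMS tier becomes the new limit. Similarly, 16 workers give a speedup of only about 3.4 when a quarter of the work is serialized on a central database.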
It's true that if you invest enough effort there are some options (for example, integrating caching with messaging or with compute grids) and compromises (mostly around transaction integrity and end-to-end reliability) that will enable you to tune different solutions to provide reasonable behavior and response times. However, the question I would ask is: why would you go through all that effort yourself?