Archive for the 'Java' Category
SyncRemoting cookbook
June 4th, 2008
I wrote up a cheat sheet for myself when building scatter-gather or map-reduce apps with GigaSpacesXAP.
It goes like this:
Create business interface (as part of shared classes)
Space-side stuff:
  Create the Space-side implementation (logic) [this may use a GigaSpace reference]
  Configure the implementation on the Space side:
    Declare the business implementation as a [bean]
    Declare a filter as part of the space
      The filter references a ServiceExporter
    Declare the os-remoting:service-exporter
      The exporter references the business implementation [the bean]
    If impl uses space: Declare the giga-space-context
Client-side stuff:
  Declare os-remoting:sync-proxy with its attributes:
    The business interface it implements
    The giga-space it uses as a transport layer
    Whether broadcast is true or false
  If necessary, declare the reducer for the proxy
  Inject the client with the service proxy (i.e. public void setMyService(MyService ref){} )
Let’s say we want to do a map-reduce exercise where each node in the cluster processes its share in parallel and the results are combined into one total for my digestion…
Let’s say I want to know the average price for all my widgets in all my inventory, but my inventory is scattered across a dozen servers because I do so much business…
I have my widgets:
package com.test.common;

import com.gigaspaces.annotation.pojo.SpaceRouting;

public class Widget {
    private Double price;
    private String description;
    private Integer routingValue;

    public Double getPrice() {
        return price;
    }

    public void setPrice(Double price) {
        this.price = price;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    @SpaceRouting
    public Integer getRoutingValue() {
        return routingValue;
    }

    public void setRoutingValue(Integer routingValue) {
        this.routingValue = routingValue;
    }
}
My Service needs an interface such as:
package com.test.common;

public interface PricingService {
    public Double getAveragePriceOfAllWidgets();
}
Next, I need to implement the service:
package com.test;

import org.openspaces.core.GigaSpace;
import org.openspaces.core.context.GigaSpaceContext;
import com.test.common.PricingService;
import com.test.common.Widget;

public class PricingServiceImpl implements PricingService {
    @GigaSpaceContext
    GigaSpace space;

    public Double getAveragePriceOfAllWidgets() {
        // an empty Widget is used as the template for the read
        Object[] allWidgets = space.readMultiple(new Widget(), Integer.MAX_VALUE);
        if (allWidgets.length == 0) {
            // nothing to average here; returning null avoids a 0/0 (NaN) result
            return null;
        }
        double priceTotal = 0.0d;
        for (int x = 0; x < allWidgets.length; x++) {
            priceTotal = priceTotal + ((Widget) allWidgets[x]).getPrice().doubleValue();
        }
        double averagePrice = priceTotal / allWidgets.length;
        return Double.valueOf(averagePrice);
    }
}
Now I need a client that uses this service:
package com.test;

import java.util.TimerTask;
import com.test.common.PricingService;

public class WidgetSurfingClient extends TimerTask {
    private PricingService service;

    public void setPricingService(PricingService ref) {
        this.service = ref;
    }

    public void run() {
        System.out.println(
            "The average price of all widgets is now: $" +
            service.getAveragePriceOfAllWidgets());
    }
}
Now I need something to do the “reduce” side of the map-reduce (it also performs the final average calculation, because I am looking for the average price):
package com.test;

import org.openspaces.remoting.RemoteResultReducer;
import org.openspaces.remoting.SpaceRemotingInvocation;
import org.openspaces.remoting.SpaceRemotingResult;

public class AveragingDoubleResultReducer implements RemoteResultReducer<Double, Double> {
    public Double reduce(SpaceRemotingResult<Double>[] results,
            SpaceRemotingInvocation remotingInvocation) throws Exception {
        double totalDouble = 0.0d;
        int counted = 0;
        // only count partitions that actually returned a value,
        // so empty results do not drag the average down
        for (int x = 0; x < results.length; x++) {
            if (results[x] != null && results[x].getResult() != null) {
                totalDouble += (Double) results[x].getResult();
                counted++;
            }
        }
        // average of the per-partition averages; 0.0 if nothing was returned
        double averageDouble = (counted == 0) ? 0.0d : totalDouble / counted;
        return Double.valueOf(averageDouble);
    }
}
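One caveat worth checking: a reducer that averages per-partition averages only matches the true overall average when every partition holds the same number of widgets. The sketch below is plain Java with no GigaSpaces dependency. PartialAverage and WeightedAverageSketch are hypothetical names, not part of the example above or of the OpenSpaces API, and the sketch assumes each partition could report a price total and a widget count instead of a finished average.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-partition result: a price total plus the number of widgets behind it. */
final class PartialAverage {
    final double total;
    final long count;

    PartialAverage(double total, long count) {
        this.total = total;
        this.count = count;
    }
}

final class WeightedAverageSketch {
    /** Mean of the per-partition means (empty partitions are skipped). */
    static double averageOfAverages(List<PartialAverage> parts) {
        double sum = 0.0;
        int used = 0;
        for (PartialAverage p : parts) {
            if (p.count > 0) {
                sum += p.total / p.count;
                used++;
            }
        }
        return used == 0 ? 0.0 : sum / used;
    }

    /** Overall mean: add up totals and counts first, divide once at the end. */
    static double weightedAverage(List<PartialAverage> parts) {
        double total = 0.0;
        long count = 0;
        for (PartialAverage p : parts) {
            total += p.total;
            count += p.count;
        }
        return count == 0 ? 0.0 : total / count;
    }

    public static void main(String[] args) {
        List<PartialAverage> parts = Arrays.asList(
            new PartialAverage(10.0, 1),  // one widget at $10
            new PartialAverage(6.0, 3));  // three widgets at $2 each
        System.out.println(averageOfAverages(parts)); // (10 + 2) / 2 = 6.0
        System.out.println(weightedAverage(parts));   // 16 / 4 = 4.0
    }
}
```

With evenly filled partitions the two numbers agree, so the simpler reducer is fine for a balanced cluster. If the partitions can be lopsided, carrying a total and a count through the reduce step is one way to keep the result exact.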
That takes care of the Java code. Now we have to configure this system:
First: the Space-side configuration:
<beans … XML namespace declaration stuff not shown…>
    <!-- bootstrap the construction of the serviceCluster: -->
    <os-core:space id="widgetTransport" url="/./widgetsystem">
        <!-- set up the PricingServiceExporter to handle calls -->
        <os-core:filter-provider ref="PricingServiceExporter"/>
    </os-core:space>
    <!-- wrap the serviceCluster in Spring-ready OpenSpaces API -->
    <os-core:giga-space id="gigaspace" space="widgetTransport"
        tx-manager="transactionManager" />
    <!-- declare the transaction manager -->
    <os-core:local-tx-manager id="transactionManager"
        space="widgetTransport"/>
    <bean id="PricingService" class="com.test.PricingServiceImpl"/>
    <!-- provide a gigaspace context to the pricing service -->
    <os-core:giga-space-context/>
    <!--
        Provide the binding to the Space-side
        transport for the PricingService :
    -->
    <os-remoting:service-exporter id="PricingServiceExporter">
        <os-remoting:service ref="PricingService"/>
    </os-remoting:service-exporter>
    <!-- configure a big or small cluster
        (here we have 20 partitions with 1 backup for each) -->
    <os-sla:sla cluster-schema="partitioned-sync2backup"
        number-of-instances="20" number-of-backups="1"
        max-instances-per-vm="1">
    </os-sla:sla>
</beans>
Next: the client-side configuration:
<beans … XML namespace declaration stuff not shown…>
    <!-- bootstrap the basic proxy for the serviceCluster: -->
    <os-core:space id="widgetTransport" url="jini://*/*/widgetsystem" />
    <!-- wrap the proxy in Spring-ready OpenSpaces API -->
    <os-core:giga-space id="gigaspace" space="widgetTransport" />
    <!-- WidgetSurfer Service -->
    <bean id="WidgetSurfer" class="com.test.WidgetSurfingClient">
        <property name="pricingService" ref="PricingService"/>
    </bean>
    <bean id="AveragingDoubleResultReducer" class="com.test.AveragingDoubleResultReducer"/>
    <os-remoting:sync-proxy id="PricingService"
        giga-space="gigaspace"
        broadcast="true"
        interface="com.test.common.PricingService">
        <os-remoting:result-reducer ref="AveragingDoubleResultReducer"/>
    </os-remoting:sync-proxy>
    <!-- this is regular Spring TimerTask configuration stuff: -->
    <bean id="widgetClientTask" class="org.springframework.scheduling.timer.ScheduledTimerTask" >
        <!-- wait 1 second before starting repeated execution -->
        <property name="delay" value="1000" />
        <!-- run every 5 seconds -->
        <property name="period" value="5000" />
        <property name="timerTask" ref="WidgetSurfer" />
    </bean>
    <bean id="widgetClientTimer" class="org.springframework.scheduling.timer.TimerFactoryBean">
        <property name="scheduledTimerTasks">
            <list>
                <!-- This wires the factory to the scheduledTask bean above -->
                <ref bean="widgetClientTask" />
            </list>
        </property>
    </bean>
</beans>
Cheers,
Owen.
Integrating GigaSpaces persistency service into an existing tier based system
April 23rd, 2008
A common issue I’ve been facing recently is how to integrate existing tier-based applications with the GigaSpaces persistency service, a.k.a. persistency as a service (PaaS) or mirror. The motivation is often the realization that a standard tier-based application fails to scale once it hits the database throughput limit.
Software Caching technologies (overlooking their [...]
Cloud Aware Classpath
March 12th, 2008
One of the most challenging aspects of deploying large distributed systems is making sure the right jar files are located where they are needed. For a cluster that contains many nodes (tens, hundreds, or even thousands), the ability to deploy the application code easily is very important.
This problem is not new, and there have been several solutions for it in the past:
The first is a shared file system. All hosts have access to a shared drive, a classpath variable points to it, and whenever the code changes, this shared drive is the only place new jar files need to go. This simple solution is OK for static deployments where we know all hosts are on the same physical LAN. Its drawbacks are that it is a single point of failure (SPOF) in the system, and that its singularity creates a contention point that in many cases becomes a bottleneck. It is also imperative to make the file system at least as available as our SLA requires, which may become very costly.
The second approach is a codebase server. This is the approach used by RMI and Jini. As the Java class loader is a URLClassLoader, all we need to make sure of is that our jars sit behind a reliable HTTP server; when the JVM starts, it looks for classes at the codebase server as well. The codebase server is a first step toward virtualization: the URL is abstracted by a host name controlled by DNS and can be changed without impacting the application. There are a few implementation problems with this approach. The first is that jars are downloaded in full, and in real-life systems those jars may be very large, so the downloads can cause a network storm that results in system hiccups. The second is that URLClassLoader behavior is static: once a class has been loaded into the class loader it cannot be changed, and to replace it the class loader needs to be destroyed, which in many cases practically means the process needs to be restarted.
What we’re introducing in 6.5, and is already available in M7, is what we call the "Cloud Aware ClassLoader". When a VM in the cloud needs to load a class whose definition it does not have, it calls the "cloud" classloader to retrieve the class definition. This mechanism is fully integrated with the JVM class loader, and in many cases the sequence of loading a class is transitive. For example, if class A implements interface I, the classloader starts loading class A and, as it identifies that other class dependencies such as interface I are missing, it looks in the cloud for the interface’s class definition.
The full power of this feature unlocks dynamic capabilities for patterns such as MapReduce and Master/Worker. In such cases the implementation of the Work object (M/W) can be changed at runtime, while the system is running, without the need to redeploy the code on all nodes.
Consider the following scenario. A Task interface is defined with one execute method. Worker implementations residing on cloud nodes know how to consume a Task object and call its execute method. The first client connects to the cloud and submits an implementation of Task called TaskA. All he does is connect to the cloud and submit his task; no deployment of the task on cloud nodes is required. A second client connects to the cloud as well and submits his own implementation of Task, named TaskB, and he does not deploy anything either. The beauty of it is that if client A would like to use TaskB, the class information of TaskB will be shipped automatically to his JVM on demand, without any configuration.
The funny part is that we didn’t plan in advance to solve this problem in this release; we had other issues to attend to. However, after several painful experiences deploying huge clouds in our lab and on EC2, the Runtime Platform team decided that they were too spoiled to work with non-elegant solutions and, in a few days’ work, put this great feature into the product.
You’re all encouraged to give it a spin and share your thoughts on this with us.
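To make the scenario concrete, here is a minimal plain-Java sketch of the contract it describes. The names Task, TaskA, TaskB, and Worker follow the text, but the signatures (for instance, execute returning Object) and the Serializable bound are assumptions for illustration. Nothing here exercises the cloud class loading itself: everything runs in one JVM with all classes already on the classpath.

```java
import java.io.Serializable;

/** Assumed shape of the Task contract: a single execute method. */
interface Task extends Serializable {
    Object execute();
}

/** First client's implementation; the worker never refers to this class by name. */
class TaskA implements Task {
    public Object execute() {
        return "TaskA done";
    }
}

/** Second client's implementation, written independently of TaskA. */
class TaskB implements Task {
    public Object execute() {
        return Integer.valueOf(6 * 7);
    }
}

/** A worker depends only on the Task interface, so it can run any implementation. */
class Worker {
    Object consume(Task task) {
        return task.execute();
    }
}

class TaskDemo {
    public static void main(String[] args) {
        Worker worker = new Worker();
        System.out.println(worker.consume(new TaskA())); // prints "TaskA done"
        System.out.println(worker.consume(new TaskB())); // prints "42"
    }
}
```

Because the worker is compiled against the interface only, the concrete task classes are the one thing that has to reach its JVM at runtime, which is the gap the feature described here is meant to close.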
Cheers,
Guy
.Net Customer Announcement: Susquehanna (SIG)
March 11th, 2008
In the past few months we've made several exciting announcements, such as our partnership with SpringSource, the expansion of our executive team, the launch of our community site OpenSpaces.org, the OpenSpaces Developer Challenge and the Start-Up Program. But there is nothing like a customer announcement, as in today's press release [...]
Why not use maven?
January 16th, 2008
I received an (anonymous) comment regarding the command-line project creator.
(anonymous) asks:
“why dont you use Maven archetypes for this? See http://maven.apache.org/guides/introduction/introduction-to-archetypes.html”
I answer:
First of all, if you like Maven I have good news!
The official GigaSpacesXAP product will support Maven and supply some archetypes very soon. (I believe the next early-access build will contain some support for this.)
Now, you may, like Charles Miller, not like Maven.
The command-line project-creator is designed to be extremely light-weight and simple for anyone to use.
To use it, you need the jar file (~80 KB) and probably two or three batch or shell scripts.
[Once I write some decent documentation, it will be more obvious what is required to execute the functionality.]
While it does create Eclipse files, it also creates a build.xml suitable for execution from any IDE or even Emacs.
That said, once the Maven stuff is supported and available, I plan to contribute my service templates to the available archetypes where they can add value.
I do not really see these as competing, but rather co-existing, like items on a menu: some people prefer fish, some meat, both have to eat.
I hope this helps to explain the existence of CPC.
BTW: from now on, I will try to have news regarding CPC posted at openspaces and hope further comments can also live there, just for ease of information access and organization.
Cheers,
Owen.







