Persistence on the Cloud – How and Why?

Posted 2 April 2009 @ 11:55 pm by Shay Hassidim

I was reading Todd's post on the highscalability site and Nati's post about Scaling out - MySQL. These posts outline the difference between GigaSpaces approach and database clustering approach. When reading these I have noticed that users might be missing some information about how they can persist their Data while running their applications on the cloud. This post discuss how this should be done and why.

Before getting to the blog post actual topic I'd like to provide some background about GigaSpaces persistence technology and how it has been evolved.

GigaSpaces Historic IMDG Persistence Support - Quick review

Persistence support with GigaSpaces goes back a long time, back to the very first release of the product almost a decade ago. Initially there was a support for storing the IMDG objects into an ODBMS. Every IMDG operation was translated into a database call in a transparent manner, and each IMDG object was stored as an ODBMS object. Later on, we added support for storing IMDG data into an RDBMS, where the mapping definitions were created on the fly based on the application class fields. There was some flexibility to specify Java types mapping to SQL types, but it was very limited. You could not map the IMDG objects into an existing database tables.

This persistence support came into the world long time before Hibernate even existed, and was based on the JDBC Storage Adapter (SA) module. This option still exists in the product, but is rarely used today and we'll be phasing it out in future versions.

The next step for our persistence support came about 5 years ago where the External Data Source (EDS) has been introduced. This is called also sync-persistency mode or Read Through/Write Through.

sync persistnecy

Figure 1 - Synchronous persistence

This feature added the ability to define custom persistence mapping, where the In-Memory Data Grid (IMDG) can store its data into an existing database tables and load it back into the IMDG on initialization. The EDS allows users to have their custom persistency implementation using their own mapping tools, where the most popular option is to use Hibernate EDS implementation that comes out of the box. This allows users to use their existing Hibernate decorations to instruct the Hibernate EDS how to persist the IMDG data and how to load it back from the database upon initialization.

See below more details about the differences Between JDBC SA and the EDS.

About 3 years ago we came up with the latest evolution of the persistence technology: Asynchronous persistence. We called this the Mirror Service (called also Write Behind or Pass - Persistency as a Service).

async persistency

Figure 2 - Asynchronous persistence

With this revolutionary and unique option, IMDG activities are performed without triggering any database operations in real time. There is no delegation of any activity when updating, deleting or removing data from the IMDG (if also running in ALL IN CACHE, there is no database read when performing IMDG queries). The outcome of each transaction is sent to the database as a background activity (in a totally reliable manner), so that database activity is not part of the critical path of the in-memory transaction.

When the IMDG is partitioned, every primary instance of each partition has a dedicated persistence channel with the Mirror service, where the Mirror service persists data in parallel manner. Every X operations or Y milliseconds, a persistence event is triggered within each primary instance of the IMDG, pushing all the operations in one batch to the Mirror Service and from there, using one database transaction, to the database. This means that the transaction boundary is fully preserved between the IMDG and the database - even though data is persisted in an asynchronous manner. Cool!

If there is any problem with the persistence activity - for example, the Mirror Service can't access the database, the Mirror Service is unavailable, or there is some problem with the mapping - the IMDG logs the information and simply retry to persist the operations once the connectivity with the Mirror Service or the database is reestablished.

Using PasS to Persist Data when Running on the Cloud

When running your application on the cloud, it make sense to store all application data in memory and perform the actual persistence in an asynchronous manner. There is no need to take "the risk" of evicting data from the IMDG (running in LRU cache policy), which would require a "Read Through" operation. Here are the main reasons why this is a good deployment design decision when deploying your application on the cloud with GigaSpaces XAP:

- It is inexpensive - You have an unlimited number of machines available on which to run the application's IMDG, and it is relatively inexpensive to run a machine with a large amount of memory - about $20 a day per machine with 15GB RAM, including both GigaSpaces and the cloud provider's fees.
- Mirror service mobility - The Mirror Service should run "close" to the database. In fact it should preferably be collocated with the database on the same machine. Many database JDBC drivers have a special optimizations when the database client process and the database server process are running on the same machine, allowing the Mirror Service to push its transactions into the database in a relatively short time. Since every database also has special optimizations for batch operations, this would be a relatively fast transaction; disk activity will be the major contributor to the latency of this layer of the application. To make this work, if the database machine fails and a new one is started, GigaSpaces XAP relocates the Mirror Service and places it on the new machine allocated to the database server.
- Out-of-the-box self-healing - if a primary instance of the IMDG partition fails and the stand-by backup instance becomes a primary, a new machine will be started automatically, hosting a new backup of the failed partition. Practically, there is no need to relay on the database for recoverability or high-availability. Your application will have at least 2 copies of every object available in memory, no matter what happen (as long as the cloud data centers are up and running).
- You never know the exact latency between the IMDG and the database - you don't want this to impact the application response time when reading data from the database or when writing data into the database. So you should not reply on Write-Through (i.e. sync persistence).
- Your application can scale automatically to have many instances of your business logic services or IMDG. A synchronous persistence mode might generate huge activity on the database, making it a major bottleneck and severely slowing down the application's response time. We don't want this to happen…
- You might need to transform the application objects into a completely different representation within the database. This often happens when the database is used for complex offline reporting or data-mining. This consumes resources and time, and there is no reason why the application should be required to wait for this activity to be completed before committing a transaction and responding to the user.
- Database tables might be upgraded from time to time - In many cases, you can't perform such sensitive activity while the system is alive and constantly interacting with the database server. This means you need to totally de-couple the application from the database activity.
- The database machine might fail and it might take time for it to recover and attach to the relevant EBS volume.
- Snapshotting - the cloud supports the ability to "dump" the machine's file system data into the cloud repository (S3). This is called snapshotting. Essentially, it's a way to archive your file system very quickly. During the short period of time in which a snapshot of the database is being taken, system activity is blocked. This can impact the entire application, if it accesses the database directly on a regular basis (as opposed to having all data in memory and persisting data asynchronously). 

Recommended Deployment Architecture on the Cloud

Taking the above into consideration lead us into a very simple and powerful persistence deployment architecture when deploying applications on the cloud:

1. Run the IMDG using cache policy mode "All in Cache." There is no need to evict objects based on available free memory. You will always have enough RAM to store the entire application data in memory.

2. Run the Mirror Service and allow it to run in the same machine as the database server.

3. Objects you do not want to store in the IMDG forever should have a finite, short lease time. The IMDG will be responsible for clearing these from memory once their lease expires, but it will not remove them from the database. You will be able to access this data when running your offline reporting systems which access the database directly.

app arch

Figure 3 - Application Architecture on the Cloud

The good news is that when running the application on the cloud, you don't really need to configure the application's deployment to have the above architecture. It is provided out-of-the-box, or more accurately, out-of-the-cloud! :) The GigaSpaces Cloud Computing Framework (CCF4XAP) performs all the operations required to deploy your application with this persistence deployment architecture. All you need to do is to enable this option as part of your cloud application deployment configuration. Once you have this option set, let the cloud framework deploy your application and do its magic - it will setup the cloud machines, IMDG, database server, mirror service, snapshotting, and anything else that's required.

For more details about persisting your application data when running on the cloud, check out the persistence section in the CCF4XAP documentation.

Appendix:Differences between the JDBC SA and the EDS

- Data aging (Lease behavior) - with EDS, expired objects with a finite lease are not removed from the database; however with JDBC SA, they are removed from the database. This means that with the EDS you can implement data aging in very simple manner. Just write objects into the IMDG with the relevant time-to-live (lease time) and let the IMDG persist the data into the database and keep it in memory for a specific, known duration of time. This is very common behavior with trading, telecom, datacom and mobile data systems.

- Initial load behavior - by default, with the EDS, every primary instance of the IMDG loads its data from the database (this is called the "initial load phase") and later replicates it to the backup instance. The EDS support parallelizes data loading, which might speed up the initial load phase substantially. You can also control which type of classes of data will be loaded - in fact you can implement your own business logic for the initial loading. The JDBC SA does not support these capabilities. It does have the initial load phase which pre-load data from the database , but you can't control it and both the primary and the backup loading their data from the database from different tables.
- Replication status - when running with EDS, the replication status (redo log) is stored in memory. The JDBC SA stored this data in the database. We found that there is no need to hit the database for each replication event; it impacts performance very much and greatly complicates the recovery process, without real need.
- IMDG Cluster behavior - when running with EDS, you can have the same database used both for primary and backup. This means write activity is done only once. With the JDBC SA, you have data stored in different tables (and possibly a different database server) for the primary and backup. This has a large impact on performance.
- Data Eviction behavior - the IMDG supports two main eviction modes. One model has no eviction (ALL IN CACHE); the other evicts last recently used objects (LRU). In LRU mode, data is evicted based on a maximum number of IMDG objects and also based on free memory. There is very little control to determine which objects will be evicted, due to the nature of the LRU policy. When running in this mode, IMDG queries are translated into database queries. When a matching object is not found in the IMDG , the IMDG automatically generates the relevant database query (via the mapping layer, i.e., a Hibernate query using the specified dialect) and loads relevant data back into the IMDG. When there are several objects found in the database, these are loaded via a pagination mechanism. With EDS you can control this process. The JDBC SA does not allow you to control it and have your own custom query generated.

Shay Hassidim

Read more...

4 Responses to “Persistence on the Cloud – How and Why?”

  1. Heartburn Home Remedy Says:

    The style of writing is very familiar . Did you write guest posts for other bloggers?

  2. Shay Hassidim Says:

    You can contact me directly at:
    shay at gigaspaces.com

    Shay

  3. The GigaSpaces Blog » Blog Archives » Designing a Scalable Twitter Says:

    [...] Persistence on the Cloud – How and Why? [...]

  4. Designing a Scalable Twitter 网站系统架构网摘 - 系统 架构 服务器 优化 网站 Says:

    [...] Persistence on the Cloud – How and Why? [...]

Leave a Reply