Monday, April 25, 2011

Ehcache bulk operation APIs

People who have used Ehcache know that it currently provides only one bulk operation, removeAll(), which removes all the entries from the cache. The next release of Ehcache is planned to provide these operations as well:

Collection<Element> getAll(Collection<Object> keys)
void putAll(Collection<Element> elements)
void removeAll(Collection<Object> keys)

The goal of these new APIs is to provide bulk operations that are faster than the equivalent single operations. Ehcache currently has two consistency modes for cache operations:

1. Strong: All writes are done under a lock, so once anything is changed (add/remove/update) in the cache, all other nodes and threads will see the change.
2. Eventual: As the name suggests, no clustered lock is taken for each operation, and the cache eventually becomes coherent. This mode is used when speed, predictability and uptime are the primary concerns of an application.

The challenge is to make the new bulk APIs faster than doing single operations in a loop.

Here is what is planned to achieve this.

1. putAll (eventual consistency)
We create transaction boundaries by taking a lock and releasing it. A number of optimizations, such as transaction folding, are applied. Since we already know how many entries need to be put in this call, and since all the entries cannot be put in one transaction, the work can be broken up and sent to the server in batches sized according to "ehcache.incoherent.putsBatchSize".
Something like:
lock.takeLock()
for (int i = 0; i < batchSize; i++) {
    doPut(key, value)
}
lock.releaseLock()
notify listeners for the puts that were done in the for loop
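
The batching idea above can be sketched in plain Java. This is only a single-JVM illustration, not Ehcache code: the class name, the in-memory map and the ReentrantLock are stand-ins I made up, and the batchSize field merely plays the role of "ehcache.incoherent.putsBatchSize".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the batching idea; not the Ehcache implementation.
final class BatchedPutAllSketch {
    private final Map<String, String> store = new LinkedHashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final int batchSize;   // plays the role of ehcache.incoherent.putsBatchSize
    private int transactions = 0;  // number of lock/unlock pairs (transaction boundaries)

    BatchedPutAllSketch(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    void putAll(Map<String, String> entries) {
        List<Map.Entry<String, String>> pending = new ArrayList<>(entries.entrySet());
        for (int from = 0; from < pending.size(); from += batchSize) {
            int to = Math.min(from + batchSize, pending.size());
            lock.lock();  // one transaction boundary per batch, not per entry
            try {
                for (Map.Entry<String, String> e : pending.subList(from, to)) {
                    store.put(e.getKey(), e.getValue());
                }
                transactions++;
            } finally {
                lock.unlock();
            }
            // Listeners would be notified here for the puts of this batch.
        }
    }

    int transactions() { return transactions; }

    int size() { return store.size(); }
}
```

With 10 entries and a batch size of 4, this takes the lock three times (4 + 4 + 2 entries) instead of ten times.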

2. putAll (strong consistency)
In strong consistency, for each operation a lock is taken, the operation is performed, and then the lock is released. For a bulk putAll this can be optimized. To avoid deadlock and stay efficient, lock requests are sent asynchronously to the server. Every few milliseconds, the client checks how many lock responses have come back from the server. For whatever locks have been granted so far, the puts are done for those elements and the locks are released. The putAll call keeps trying to get the remaining locks until everything has been put in the cache. Listeners are notified immediately for each put that is done once its lock has been granted.
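
Here is a rough single-JVM analogy of that polling loop. It is a sketch under assumptions, not the planned implementation: per-key ReentrantLock objects and tryLock() stand in for the asynchronous lock requests and responses from the server, and all the names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: put whatever is lockable now, retry the rest later.
final class StrongPutAllSketch {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantLock());
    }

    /** Puts all entries and returns the number of polling rounds it took. */
    int putAll(Map<String, String> entries, long pollMillis) throws InterruptedException {
        Deque<Map.Entry<String, String>> pending = new ArrayDeque<>(entries.entrySet());
        int rounds = 0;
        while (!pending.isEmpty()) {
            rounds++;
            int toCheck = pending.size();
            for (int i = 0; i < toCheck; i++) {
                Map.Entry<String, String> e = pending.poll();
                ReentrantLock lock = lockFor(e.getKey());
                if (lock.tryLock()) {        // stands in for "lock response has arrived"
                    try {
                        store.put(e.getKey(), e.getValue());
                        // The listener for this put would be notified here.
                    } finally {
                        lock.unlock();
                    }
                } else {
                    pending.add(e);          // not granted yet, try again next round
                }
            }
            if (!pending.isEmpty()) {
                Thread.sleep(pollMillis);    // "every few milliseconds" check
            }
        }
        return rounds;
    }

    String get(String key) { return store.get(key); }
}
```

Because no lock is ever held while waiting for another one, a slow lock on one key only delays that key and cannot deadlock the whole call.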

3. removeAll (eventual consistency)
Same as putAll (eventual consistency)

4. removeAll (strong consistency)
Same as putAll (strong consistency)

5. getAll
The implementation can be understood through these steps:
  1. Collection getAll(Collection) will return a custom collection with an overridden iterator.
  2. The request to the server will be based on CDSMDso, which lets us split the request across stripes when there are multiple Terracotta server stripes.
  3. In return we get a map of keys to the object ids of their values, which we use in our custom iterator.
  4. When the object ids of the values corresponding to the keys are returned, a lookup request is initiated with a configurable number of object ids per batch, and the returned values are added to the local cache.
  5. When the collection is iterated, the value for a particular key is returned from the local cache, via its object id, if it is present there. If not, the value has to be fetched from the server in a batch.
  6. Implementing step 5 is a little tricky, since some of the values that were added to the local cache while looking up objects might have been evicted. The strategy for looking up the next batch of values still needs to be worked out.
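
Steps 1, 3, 4 and 5 can be sketched in plain Java roughly like this. Everything here is an assumption made for illustration: LazyValues, the batchLookup function (standing in for the server lookup) and the HashMap used as a local cache are hypothetical, and striping (step 2) and eviction (step 6) are left out.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical lazy collection: values are fetched in batches while iterating.
final class LazyValues<K, V> extends AbstractCollection<V> {
    private final List<K> keys;
    private final Map<K, Long> keyToObjectId;                      // step 3
    private final Map<Long, V> localCache = new HashMap<>();       // step 4
    private final Function<List<Long>, Map<Long, V>> batchLookup;  // stand-in for the server
    private final int batchSize;
    private int lookups = 0;

    LazyValues(Map<K, Long> keyToObjectId, int batchSize,
               Function<List<Long>, Map<Long, V>> batchLookup) {
        this.keys = new ArrayList<>(keyToObjectId.keySet());
        this.keyToObjectId = keyToObjectId;
        this.batchSize = batchSize;
        this.batchLookup = batchLookup;
    }

    @Override
    public int size() { return keys.size(); }

    @Override
    public Iterator<V> iterator() {                                // steps 1 and 5
        return new Iterator<V>() {
            private int index = 0;

            @Override
            public boolean hasNext() { return index < keys.size(); }

            @Override
            public V next() {
                if (!hasNext()) throw new NoSuchElementException();
                Long id = keyToObjectId.get(keys.get(index));
                if (!localCache.containsKey(id)) {
                    fetchBatchFrom(index);   // local miss: fetch the next batch
                }
                index++;
                return localCache.get(id);
            }
        };
    }

    // Looks up at most batchSize missing object ids, starting at the given key.
    private void fetchBatchFrom(int from) {
        List<Long> ids = new ArrayList<>();
        for (int i = from; i < keys.size() && ids.size() < batchSize; i++) {
            Long id = keyToObjectId.get(keys.get(i));
            if (!localCache.containsKey(id)) ids.add(id);
        }
        localCache.putAll(batchLookup.apply(ids));
        lookups++;
    }

    int lookups() { return lookups; }
}
```

Iterating five keys with a batch size of 2 costs three lookups here instead of five single fetches.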

For the strong consistency case, a cluster read lock is acquired for the whole key set, as explained for removeAll (strong consistency), with the difference that read locks are taken instead.

Performance Comparison
Right now these new APIs are in the development phase. I will update this post with the results once development and testing are done.

Thursday, April 21, 2011

ehcache search example

Search was released in Ehcache recently and has been getting quite a bit of traction with users. This short post explains how you can use search in your application, with an example.

1. What is Search
When search is enabled, an index of elements is built while the cache is being built, according to the searchable attributes provided by the user. This index can later be used to execute complex queries against the cache. The cache can be a standalone Ehcache or a Terracotta clustered cache. An example query is shown in section 5, Querying the Cache.

2. What can you Search
Search returns results from the Elements of the cache, based on keys or values. The criteria on which a search can be done are provided by the user when the cache is initialized, and indexing is done against them.
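
To make the indexing idea concrete, here is a tiny library-free sketch. It is not how Ehcache implements search: the class, the Person type and the method names are all made up, and it only supports an equality lookup on attributes that were declared up front.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical mini cache that indexes declared attributes as entries are put.
final class SearchIndexSketch<K, V> {
    /** Example value type, used only for illustration. */
    static final class Person {
        final int age;
        final String gender;

        Person(int age, String gender) {
            this.age = age;
            this.gender = gender;
        }
    }

    private final Map<K, V> store = new LinkedHashMap<>();
    private final Map<String, Function<V, Object>> extractors = new LinkedHashMap<>();
    private final Map<String, Map<Object, List<K>>> index = new HashMap<>();

    // Declares what can be searched; must be called before entries are put.
    void addSearchAttribute(String name, Function<V, Object> extractor) {
        extractors.put(name, extractor);
        index.put(name, new HashMap<>());
    }

    void put(K key, V value) {
        V old = store.put(key, value);
        for (Map.Entry<String, Function<V, Object>> e : extractors.entrySet()) {
            Map<Object, List<K>> byValue = index.get(e.getKey());
            if (old != null) {
                // Drop the index entry of the value that was replaced.
                List<K> oldKeys = byValue.get(e.getValue().apply(old));
                if (oldKeys != null) oldKeys.remove(key);
            }
            byValue.computeIfAbsent(e.getValue().apply(value), v -> new ArrayList<>()).add(key);
        }
    }

    // Answers "attribute == expected" from the index, without scanning the values.
    List<K> keysWhereEquals(String attribute, Object expected) {
        Map<Object, List<K>> byValue = index.get(attribute);
        if (byValue == null) {
            throw new IllegalArgumentException("not a search attribute: " + attribute);
        }
        List<K> hits = byValue.get(expected);
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }
}
```

The point is that the criteria have to be known when the cache is set up, because the index is maintained on every put.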

3. Enabling Search
Enabling search is fairly easy. All you need to do is add a searchable tag to the cache definition section of the ehcache.xml file. Here is an example.

<cache name="cache2" maxElementsInMemory="10000" eternal="true" overflowToDisk="false">
<searchable/>
</cache>

This is the simplest way to enable search. Ehcache will look at all the keys and values, check whether they are of a searchable type, and if they are, add them as search attributes. This starts automatic indexing by default. To disable it you can do this:

<cache name="cache3" ...>
<searchable keys="false" values="false">
...
</searchable>
</cache>

When keys or values are not directly searchable, we need to extract searchable attributes from them. In that case you can provide an extractor class, or an expression naming a method that returns a searchable type, to be used for indexing. A typical example is:
<cache name="cache3" maxElementsInMemory="10000" eternal="true" overflowToDisk="false">
<searchable>
<searchAttribute name="age" class="net.sf.ehcache.search.TestAttributeExtractor"/>
<searchAttribute name="gender" expression="value.getGender()"/>
</searchable>
</cache>

You can also do this programmatically, like this:

SearchAttribute sa = new SearchAttribute();
sa.setExpression("value.getAge()");
sa.setName("age");
cacheConfig.addSearchAttribute(sa);

4. Search Attribute
The user has to define search attributes, either in the config file or programmatically, to enable indexing and querying of the cache. Search attributes tell the cache what needs to be indexed so that it can be queried later on. The examples in section 3 above show how a search attribute is defined.

5. Querying the Cache
If you have followed the previous steps correctly, you are pretty much done and ready to run complex queries against your cache. All you need to do is create a query, add specific criteria, add aggregators if you wish, and execute it. Here is an example.

// "age" is the search attribute defined for this cache
Attribute<Integer> age = cache.getSearchAttribute("age");
Query query = cache.createQuery().addCriteria(age.eq(35)).includeKeys().end();
Results results = query.execute();

Now you have the results of your query. You can use these operations on the result set to serve your purpose.

discard(): Discard this query result. This call is not mandatory but is recommended after the caller is done with the results. It can allow the cache, which may be distributed, to immediately free any resources associated with this result.
List all(): Retrieve all of the cache results in one shot.
List range(int start, int count): Retrieve a subset of the cache results.
int size(): Returns the size of the result set.
boolean hasKeys(): Whether the results have cache keys included.
boolean hasValues(): Whether the results have cache values included.
boolean hasAttributes(): Whether the results have cache attributes included.
boolean hasAggregators(): Whether the results contain aggregates.

More documentation can be found here

Tuesday, April 12, 2011

Mocking java System class and override System.currentTimeMillis() using JMockit

Recently I came across a very interesting problem while writing a system test for a component. The requirement was to raise an operator event when the server's and client's system times are out of sync by some number of seconds. The system test needed to run on one box and still get different times from System.currentTimeMillis().

To do this I used JMockit. It's fairly easy to get JMockit into your environment, either from a Maven repo or by adding Ivy settings to your project. Here is the repo that you can use to fetch JMockit using mvn:

<repositories>
<repository>
<id>download.java.net</id>
<url>http://download.java.net/maven/2</url>
</repository>
</repositories>

This is the library that you need to add as a test dependency:

<dependency>
<groupId>mockit</groupId>
<artifactId>jmockit</artifactId>
<version>0.993</version>
<scope>test</scope>
</dependency>

The other way of getting the jar is to add this to your ivy.xml file:

<dependency name="dspace-jmockit" rev="0.999.4" org="org.dspace.dependencies.jmockit"/>

Now comes the part about how you can override System.currentTimeMillis().
Below is how you write the class that is used to override the static methods of System:

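import mockit.Mock;
import mockit.MockClass;

// Replaces System.currentTimeMillis() while the mock is set up.
@MockClass(realClass = System.class)
public class MockSystem {
    private int i = 0;

    // Each call returns a time 10 seconds later than the previous call.
    @Mock
    public long currentTimeMillis() {
        i++;
        return i * 10000L;
    }
}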
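
To see what values this mock hands out without pulling in JMockit, here is a plain-Java sketch with the same counter logic behind a LongSupplier. The outOfSync check and its tolerance parameter are hypothetical; they only illustrate the kind of comparison the system test exercises, not the real component.

```java
import java.util.function.LongSupplier;

// JMockit-free illustration of the stepping clock; names are made up.
final class ClockSkewSketch {
    /** Same counter logic as MockSystem: each call is 10 seconds after the last. */
    static LongSupplier steppingClock() {
        long[] calls = {0};
        return () -> ++calls[0] * 10_000L;
    }

    /** Hypothetical check: true when the two times differ by more than the tolerance. */
    static boolean outOfSync(long serverMillis, long clientMillis, long toleranceMillis) {
        return Math.abs(serverMillis - clientMillis) > toleranceMillis;
    }
}
```

Two consecutive reads of such a clock are 10000 ms apart, so a "server" read and a "client" read taken on the same box look out of sync for any tolerance below 10 seconds.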


To use this in your test class, call this before you start the actual test:

Mockit.setUpMocks(new MockSystem());

Also be careful to call this before your test ends, so that the mock does not affect any other test:

Mockit.tearDownMocks(System.class);


A few things to note here:
1. Overriding a static method of the System class can only be done if you are using Java 1.6. It will fail for lower versions.
2. If you get the following exception in your test, it's because JMockit was not initialized before the static method was overridden. To get rid of it, make sure that the JMockit jar is above the JUnit jar in the export order of the libraries in your project.


INFO Caused by: java.lang.IllegalStateException: JMockit has not been initialized. Check that your Java 6 VM has been started with the -javaagent:/Users/rsingh/work/branches/enterprise-1/community/code/base/dependencies/lib/dspace-jmockit-0.999.4.jar command line option.
INFO at mockit.internal.startup.AgentInitialization.initializeAccordingToJDKVersion(AgentInitialization.java:44)
INFO at mockit.internal.startup.Startup.verifyInitialization(Startup.java:247)
INFO at mockit.Mockit.<clinit>(Mockit.java:82)