Configuring WSO2 BAM 2.4.1 with an external Apache Cassandra 2.x cluster


WSO2 Business Activity Monitor (BAM) writes incoming data (events) into Cassandra. Cassandra was chosen for its high write throughput, linear scalability and eventual consistency.
By default, BAM ships with an embedded Cassandra instance which starts and stops with BAM. In a production environment, however, a single Cassandra instance cannot handle a high data load. So BAM can be configured to work with an external Cassandra cluster that is highly available and scalable.

Instructions for setting up a multi-node Cassandra cluster can be found at [1].
WSO2 BAM 2.4.1 can be downloaded from [2].

The purpose of this post is to demonstrate how a WSO2 BAM 2.4.1 instance can be configured to work with a cluster of three Apache Cassandra 2.x nodes.
Note: The same instructions apply when configuring BAM with Apache Cassandra 1.x.

Let’s assume the addresses and ports of the nodes in the Cassandra ring are as follows.
- 192.168.0.1 on port 9160
- 192.168.0.2 on port 9160
- 192.168.0.3 on port 9160
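
Before editing any BAM configuration, it can help to confirm that each node's port is reachable from the BAM host. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `check_ring` are hypothetical and not part of BAM or Cassandra, and an open port only shows that something is listening there, not that Cassandra is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ring(nodes, timeout=2.0):
    """Map each (host, port) pair to its reachability result."""
    return {(host, port): is_port_open(host, port, timeout) for host, port in nodes}
```

For the ring assumed above, `check_ring([("192.168.0.1", 9160), ("192.168.0.2", 9160), ("192.168.0.3", 9160)])` would return one True/False entry per node.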


Step 1: Edit <BAM installation folder>/repository/conf/etc/hector-config.xml

This file lists the nodes that belong to the Cassandra cluster (ring). Incoming event data will be written to these nodes.
Edit hector-config.xml to include the nodes of our Cassandra ring as follows.

<HectorConfiguration>
 <Cluster>
   <Name>ClusterOne</Name>

   <Nodes>192.168.0.1:9160,192.168.0.2:9160,192.168.0.3:9160</Nodes>

   <AutoDiscovery disable="false" delay="1000"/>
 </Cluster>
</HectorConfiguration>

Note: For BAM versions before 2.4.1, this configuration is in <BAM installation folder>/repository/conf/etc/cassandra-component.xml
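
A typo in the comma-separated <Nodes> value is easy to miss. As a rough sanity check, the sketch below parses the configuration with Python's standard library and verifies that every entry has the host:port form. The function name `parse_nodes` is hypothetical, and the XML is inlined here only for illustration; in practice you would read your own hector-config.xml.

```python
import xml.etree.ElementTree as ET

# Inline copy of the configuration from Step 1, for illustration only.
HECTOR_CONFIG = """
<HectorConfiguration>
  <Cluster>
    <Name>ClusterOne</Name>
    <Nodes>192.168.0.1:9160,192.168.0.2:9160,192.168.0.3:9160</Nodes>
    <AutoDiscovery disable="false" delay="1000"/>
  </Cluster>
</HectorConfiguration>
"""


def parse_nodes(config_xml):
    """Return the <Nodes> entries as a list of (host, port) tuples."""
    root = ET.fromstring(config_xml)
    nodes_text = root.findtext("Cluster/Nodes")
    if not nodes_text:
        raise ValueError("no <Nodes> element found under <Cluster>")
    nodes = []
    for entry in nodes_text.split(","):
        # Split on the last colon so the port is always the final field.
        host, sep, port = entry.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"malformed node entry: {entry!r}")
        nodes.append((host, int(port)))
    return nodes


if __name__ == "__main__":
    for host, port in parse_nodes(HECTOR_CONFIG):
        print(host, port)
```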

Step 2: Edit <BAM installation folder>/repository/conf/datasources/bam-datasources.xml

This file configures the Cassandra datasources. Simply put, these are the nodes from which BAM reads data.

WSO2BAM_CASSANDRA_DATASOURCE configuration.

Make sure to put the correct Cassandra credentials in <username> and <password>.

<datasource>
  <name>WSO2BAM_CASSANDRA_DATASOURCE</name>
  <description>The datasource used for Cassandra data</description>
  <definition type="RDBMS">
    <configuration>
      <url>jdbc:cassandra://192.168.0.1:9160/EVENT_KS,jdbc:cassandra://192.168.0.2:9160/EVENT_KS,jdbc:cassandra://192.168.0.3:9160/EVENT_KS</url>
      <username>admin</username>
      <password>admin</password>
    </configuration>
  </definition>
</datasource>

WSO2BAM_UTIL_DATASOURCE configuration.

Make sure to put the correct Cassandra credentials in <username> and <password>. Most importantly, set the externalCassandra flag to true.

<datasource>
  <name>WSO2BAM_UTIL_DATASOURCE</name>
  <description>The datasource used for BAM utilities, such as message store etc..</description>
  <definition type="RDBMS">
    <configuration>
      <url>jdbc:cassandra://192.168.0.1:9160/EVENT_KS,jdbc:cassandra://192.168.0.2:9160/EVENT_KS,jdbc:cassandra://192.168.0.3:9160/EVENT_KS</url>
      <username>admin</username>
      <password>admin</password>
      <dataSourceProps>
        <property name="externalCassandra">true</property>
      </dataSourceProps>
    </configuration>
  </definition>
</datasource>

Note: For BAM versions before 2.4.1, these configurations can be found in <BAM installation folder>/repository/conf/datasources/master-datasources.xml
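
The <url> value in both datasources is a long comma-separated list, so a single mistyped host or keyspace can go unnoticed. The sketch below splits such a value and checks each entry against the jdbc:cassandra://host:port/keyspace pattern used above. The names `JDBC_ENTRY` and `parse_jdbc_url` are hypothetical helpers for illustration, not part of BAM.

```python
import re

# One entry of the comma-separated <url> value: jdbc:cassandra://host:port/keyspace
JDBC_ENTRY = re.compile(r"^jdbc:cassandra://([^:/]+):(\d+)/(\w+)$")


def parse_jdbc_url(url):
    """Split a multi-node <url> value into (host, port, keyspace) tuples."""
    entries = []
    for part in url.split(","):
        match = JDBC_ENTRY.match(part.strip())
        if match is None:
            raise ValueError(f"malformed JDBC entry: {part!r}")
        host, port, keyspace = match.groups()
        entries.append((host, int(port), keyspace))
    return entries


if __name__ == "__main__":
    url = (
        "jdbc:cassandra://192.168.0.1:9160/EVENT_KS,"
        "jdbc:cassandra://192.168.0.2:9160/EVENT_KS,"
        "jdbc:cassandra://192.168.0.3:9160/EVENT_KS"
    )
    for entry in parse_jdbc_url(url):
        print(entry)
```

Comparing the hosts returned here with the node list in hector-config.xml is a quick way to confirm that both files point at the same ring.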

Step 3: Edit <BAM installation folder>/repository/conf/advanced/streamdefn.xml

This file configures the replication factor and read/write consistency levels for data receivers when writing data to Cassandra.

Edit <ReplicationFactor>, <ReadConsistencyLevel> and <WriteConsistencyLevel> as follows.

<StreamDefinition>
  <NodeId>1</NodeId>
  <keySpaceName>EVENT_KS</keySpaceName>
  <eventIndexKeySpaceName>EVENT_INDEX_KS</eventIndexKeySpaceName>
  <ReplicationFactor>3</ReplicationFactor>
  <ReadConsistencyLevel>QUORUM</ReadConsistencyLevel>
  <WriteConsistencyLevel>ONE</WriteConsistencyLevel>
  <StrategyClass>org.apache.cassandra.locator.SimpleStrategy</StrategyClass>
</StreamDefinition>
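
To see what these values mean in terms of replicas, here is a small arithmetic sketch. In Cassandra, QUORUM means floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest write only when read replicas + write replicas > replication factor. The function names are hypothetical, and only a few consistency levels are covered.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def overlap_guaranteed(read_level, write_level, replication_factor):
    """True when every read is guaranteed to touch a replica holding the latest write."""
    reads = replicas_required(read_level, replication_factor)
    writes = replicas_required(write_level, replication_factor)
    return reads + writes > replication_factor


if __name__ == "__main__":
    # Values from streamdefn.xml above: RF=3, read QUORUM, write ONE.
    print(replicas_required("QUORUM", 3))          # 2
    print(overlap_guaranteed("QUORUM", "ONE", 3))  # False: 2 + 1 is not > 3
```

With the settings above, writes stay fast (one replica acknowledges) and reads are eventually consistent. If read-your-writes behaviour matters more than write throughput, QUORUM for both levels would satisfy the overlap condition.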

 

That’s it. Now your BAM instance is all set to go. Once it receives data, the data will be written to the external Cassandra ring.

[1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/initialize/initializeSingleDS.html

[2] http://wso2.com/products/business-activity-monitor/

Remembering Sam Berns and his philosophy on a happy life


Sam Berns’s life was improbable. He was born with progeria, the genetic disorder that results in rapid, premature aging in children. It is so rare that only about 250 children worldwide have been identified so far. Doctors predicted that he would live only to about 13. But proving them wrong, he died last week at the age of 17.

Sam Berns, at age 17, stands in front of some of his Lego creations. Photo courtesy HBO

Today I found a TED talk given by Sam. In brief, it was about his philosophy on a happy life: the way he embraced life.

Here is a summary of his talk.

1. Be OK with what you ultimately can’t do, because there is so much you CAN do.
2. Surround yourself with people you want to be around
3. Keep moving forward

Judging his philosophy is up to you. But I do realize that he lived his life to the fullest, knowing that he would die soon. Even though his life was as short as 17 years, I’m sure he enjoyed every bit of it. Having said that, we can’t frame happiness as one defined model. It’s subjective.

You don’t need to be rich to be happy. Just look at Sam if you need an example.

Below is the TEDx talk I mentioned above. R.I.P. big man! I learned something valuable from you.

My new year resolutions for 2014


The year 2013 is gone already. There were great achievements, wins, losses and unfinished business, and above all, I wasted a lot of time. I didn’t live my life to the fullest during 2013. Now it’s time to plan ahead and move forward.

So I came up with my new year resolutions to make things better for this year.

1. Blog more often

I started this blog in 2009. At that time, I was a student. I had plenty of time to read about technical stuff, do some experiments and share them with the public. But life got heavier after I got a job. Product releases, targets and ladder climbing kept life busy all the time. At the start, I intended to keep this blog as my technical journal. But in future I’ll also be posting more lessons from life experience, at a finer granularity.

2. Become a morning person.

Studies have suggested that early risers are more productive than night owls. You can find a good infographic here. I used to be a morning person in the past. When I was studying I used to get up around 4.30 in the morning. But now I have become a night owl who sits in front of a computer till past midnight. This might be subjective, but I have felt that staying up late is less productive than rising early. Not only does it burn your midnight oil, it also spoils the precious early hours of the next day. I need to get out of this habit immediately. So I have set a target to get up at 5.00 AM and gain more power to carry out the day.

3. Stay focused on my health.

During the last year I gained too much weight. Last December I exceeded 80kg. Although I’m inside the BMI safe zone, I can’t let this continue anymore. I was addicted to Coke and chocolate milk, which put a hell of a lot of sugar into my blood. So I’ll be getting rid of them this year. Among the bad stuff, I did manage to grow some good health habits last year.

  • Started to consume fresh milk instead of milk powder. (This is a bit expensive, but I’ll continue it in favor of local dairy farmers.)
  • Started to consume traditional rice. (Farmers have resurrected several rice varieties that existed in ancient Sri Lanka. They are far more nutritious than imported rice.)
  • Playing badminton with family members on weekends.

I’m not a fan of the gym. But I thought of getting a gym membership since my workplace has a sexy gym. The hardest thing about the gym is that I’ll have to maintain a controlled diet, which gives me negative thoughts about gyms.

4. More involvement in farming

Last year my dad started an organic vegetable garden at our house. As a retiree, he spent a lot of time maintaining his home garden. The harvest was good enough that it inspired me to start my own home garden. So currently I’m in the process of transforming a bare plot of land into an organic vegetable farm. I’ll dedicate another post to this effort.


5. Read more about data science and machine learning.

Number crunching always fascinates me. I love to tell stories by looking at numbers. During the past few years, data science has grown relentlessly, and I wish to hop on that bandwagon before it’s too late. So I’ll be developing my skills in applied statistics, machine learning and predictive analytics. Coursera and Udacity will be helpful here.

6. Improve my reading habit.

I was an avid reader at school. But when I got into the IT industry, my reading was limited to technical stuff. Mostly I was following some technical blogs and watching movies all the time. I really need to grow my reading habit again. But this time I won’t be reading only techy stuff, but also some books on personal development, investing and spiritual matters. Reading on my iPad mini is more comfortable than reading on paper. Not only does it make my reading list portable, it also lets me listen to audiobooks and TED talks and watch videos on YouTube.

I have already picked several books for 2014


7. Good bye consumerism! 

There’s not much to say about this. Simply put, I need to live below my means.

That’s a pretty lengthy list. But at least it is now written down somewhere. Whenever I feel like I’m taking a detour, I can come back here and align my course.

Wish you all a happy and productive 2014!

iPad mini iOS 7.0.2 update introduces cellular data connection issue


Recently I updated my iPad mini to iOS 7. After the update, I tried to connect to the internet using my cellular data connection, but I couldn’t. I tried again after changing the APN settings, but that failed as well, and the device kept displaying “No Service” in the cellular data status bar on the home screen.

Later I googled a bit about this and found out that I was not the only one who faced this issue. I managed to fix the problem by resetting the cellular data settings to defaults.

Here is the flow of actions I followed.
1. Go to Settings -> Cellular Data and press ‘Reset Settings’
2. Turn off your device.
3. Turn it on and try to connect to cellular data again.

This time, you should be able to connect!

By the way, the version I updated to was iOS 7.0.2.

Some useful links:

http://www.digitaltrends.com/mobile/ios-7-problems/

http://www.wired.com/gadgetlab/2013/10/iphone-5s-ios-7-issues/

A brief introduction to Apache Mahout


What is Mahout?

Mahout is an open source machine learning library from Apache. It has a collection of algorithms that fall under the machine learning or collective intelligence category. Mahout is just a Java library; it’s not a server or a tool that gives you a GUI.
The nice thing about Mahout is that it’s scalable. If the amount of data to be processed is very large, or too large to fit on a single machine, Mahout is a good fit. Most of Mahout’s algorithms have been implemented so that they can run on top of Apache Hadoop.

Mahout’s capabilities

Mahout consists of a set of algorithms capable of performing the following machine learning tasks.

1. Collaborative filtering
2. Clustering
3. Classification

Collaborative filtering with Mahout

Mahout’s main focus is to provide collaborative filtering, or recommendation, services. Given a set of users and items, Mahout can provide recommendations to the current user of the system. It requires user preferences as input, where each preference is a scalar value expressing a user’s taste for a particular item.

Mahout supports three ways of generating recommendations:

  • User-based: Recommend items by finding similar users. This is often harder to scale because of the dynamic nature of users.
  • Item-based: Calculate similarity between items and make recommendations. Items usually don’t change much, so this often can be computed offline.
  • Slope-One: A very fast and simple item-based recommendation approach applicable when users have given ratings (and not just boolean preferences).
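
To make the Slope-One idea concrete, here is a minimal pure-Python sketch of the weighted Slope One scheme. It is not Mahout's API or implementation; the function names and the tiny ratings table are made up for illustration. The idea is to learn the average rating difference between each pair of items, then predict a missing rating from the user's other ratings plus those differences.

```python
from collections import defaultdict

# Hypothetical ratings: {user: {item: rating}}
RATINGS = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}


def train_slope_one(ratings):
    """Compute average deviations dev[(j, i)] = mean(r_j - r_i) and co-rating counts."""
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[(j, i)] += r_j - r_i
                    count[(j, i)] += 1
    dev = {pair: diff_sum[pair] / count[pair] for pair in count}
    return dev, dict(count)


def predict(user_ratings, item, dev, count):
    """Predict a rating for `item`, weighting each deviation by its co-rating count."""
    numerator = 0.0
    denominator = 0
    for i, r_i in user_ratings.items():
        pair = (item, i)
        if i != item and pair in count:
            numerator += (dev[pair] + r_i) * count[pair]
            denominator += count[pair]
    if denominator == 0:
        return None  # no user has co-rated `item` with anything this user rated
    return numerator / denominator


if __name__ == "__main__":
    dev, count = train_slope_one(RATINGS)
    # Lucy has not rated A; the estimate is ((0.5 + 2) * 2 + (3 + 5) * 1) / 3 = 13/3
    print(predict(RATINGS["lucy"], "A", dev, count))
```

Because the deviation table depends only on item pairs, it can be precomputed and reused across users, which is part of why the approach is fast.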

Mahout in production environment

Mahout can be operated in two modes: local mode and distributed mode.

Local mode

In local mode, it can process recommendations from up to 100 million records in real time. However, this requires a machine with a 4GB Java heap. Mahout provides a servlet so that it can be deployed inside a servlet container. This way, Mahout’s capabilities are not limited to the Java platform.

Distributed mode

If you expect to go beyond 100 million records, it is better to consider distributed mode, which runs as a MapReduce job on a Hadoop cluster. Running in distributed mode is a batch process, and it takes time.

Websites for FREE online education


Recently I’ve discovered dozens of nice websites for online learning. The nice thing about them is that most of the time they are FREE!

Last week I thought of brushing up my knowledge of statistics. Although I had a great experience with Udacity (www.udacity.com), I kept exploring all the possibilities with Google.

Here is my list of FREE online learning websites.