Hi. Here I am talking about some of the most useful event processing tools, basically log processing tools for distributed or cloud environments. I have collected them from various sources and am listing them here with short descriptions. I hope this will be useful for us in finding the best tools for log collection and data collection from multiple sources.
1. Flume:
It is an Apache project, basically used for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple architecture based on streaming data flows. It collects data from various sources and delivers it to Hadoop's HDFS.
There are three basic components of Flume:
a) Agent- lives on the source machine from which we need to collect data or logs.
b) Collector- agents sink their data to a collector, which finally writes it to HDFS.
c) Master- keeps the configuration of all agents and collectors and manages them.
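To make the roles of the three components a bit more concrete, here is a small toy model in Python. It is only a sketch and does not use Flume's real API: the class names Agent, Collector and Master, the host names, and the in-memory list standing in for HDFS are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Collector:
    """Toy collector; 'storage' is a plain list standing in for HDFS."""
    storage: list = field(default_factory=list)

    def receive(self, event: str) -> None:
        # A real collector would write to HDFS; here we only append.
        self.storage.append(event)


@dataclass
class Agent:
    """Toy agent living on a source machine (hypothetical host name)."""
    host: str
    collector: Optional[Collector] = None

    def emit(self, line: str) -> None:
        # The agent sinks each log line to the collector it was given.
        if self.collector is None:
            raise RuntimeError(f"agent {self.host} has no collector configured")
        self.collector.receive(f"{self.host}: {line}")


class Master:
    """Toy master: keeps the agent-to-collector configuration and applies it."""

    def __init__(self) -> None:
        self.routes = {}

    def configure(self, agent: Agent, collector: Collector) -> None:
        self.routes[agent.host] = collector
        agent.collector = collector


collector = Collector()
master = Master()
web1, web2 = Agent("web1"), Agent("web2")
master.configure(web1, collector)
master.configure(web2, collector)

web1.emit("GET /index.html 200")
web2.emit("GET /missing 404")
print(collector.storage)
```

The point of the sketch is only the direction of the flow: agents never write to storage themselves, and they learn where to send data from the master.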
Please visit Wikipedia and http://archive.cloudera.com/cdh/3/flume/UserGuide/ for more information.
2. Scribe:
Scribe is an open source project from Facebook that is used as a log aggregation framework. It has a simple API and is simple to use. A scribe server runs on every node in the system, configured to aggregate messages and send them in larger groups to a central scribe server (or servers).
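The idea of a local server on each node that groups messages before forwarding them can be sketched in a few lines of Python. This is a toy illustration and not Scribe's actual interface: the class names LocalScribe and CentralScribe, the batch_size parameter, and the node and category names are assumptions I made up for the example.

```python
class CentralScribe:
    """Toy central server: stores messages per category in memory."""

    def __init__(self):
        self.by_category = {}

    def log(self, category, message):
        self.by_category.setdefault(category, []).append(message)


class LocalScribe:
    """Toy per-node server: buffers messages and forwards them in groups."""

    def __init__(self, node, central, batch_size=2):
        self.node = node
        self.central = central
        self.batch_size = batch_size  # assumed group size, for illustration only
        self.pending = []

    def log(self, category, message):
        # Buffer locally; forward only once a full group has accumulated.
        self.pending.append((category, f"{self.node} {message}"))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send everything buffered so far to the central server.
        for category, message in self.pending:
            self.central.log(category, message)
        self.pending = []


central = CentralScribe()
node_a = LocalScribe("node-a", central)
node_b = LocalScribe("node-b", central)

node_a.log("access", "GET /")
node_a.log("error", "disk full")   # second message fills the group and triggers a flush
node_b.log("access", "GET /about")
node_b.flush()                      # force out the partial group

print(central.by_category)
```

Buffering on the node means the central server receives fewer, larger deliveries, which is the main reason to run a local server everywhere instead of having every application talk to the central one directly.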
We can learn more about it here:
https://github.com/facebook/scribe/wiki
3. Kafka:
Kafka is being developed by LinkedIn and is basically used for log collection.