- Capture Layer
- Enrichment Layer
- Indexing Layer
- Historical (Storage) Layer
- Service Layer
- View Layer
- Transversal Services
- What is Next?
The redBorder platform manager consists of a set of layers that can scale independently and that are designed to deliver the best performance in the most demanding scenarios. For each layer, the most appropriate technologies are selected and progressively upgraded, ensuring that they offer the best functionality and performance for the whole system.
This design allows the platform to collect and process millions of events per second in real time from networks of any size, and to scale out the management of probes, data sources, network devices, analysis activity, and users in a multitenant, cloud-ready infrastructure.
The following figure presents an overview of the redBorder manager layers, together with a brief description of each layer's main objective.
Capture Layer
The capture layer consists of a set of collectors that retrieve data from data sources. Information coming from the collectors is normalized and processed in a scalable messaging system.
This message bus is implemented with the Kafka distributed streaming platform, which makes the layer fully scalable and fault tolerant.
The capture layer runs as a cluster on one or more servers and stores streams of records in categories called topics. Each topic is split into partitions to optimize data processing, and partitions are replicated to provide data redundancy.
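As a rough illustration of how records spread over partitions and how partitions are copied across servers, the following Python sketch simulates both steps. It is not Kafka code: Kafka's default partitioner hashes the record key with murmur2, while this sketch uses CRC32 as a simple stand-in, and the topic layout (6 partitions, 3 brokers, 2 replicas), the sensor names, and both function names are hypothetical.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index (CRC32 used as a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_brokers(partition: int, num_brokers: int, replication_factor: int) -> List[int]:
    """Place the copies of one partition on consecutive brokers, round-robin."""
    return [(partition + i) % num_brokers for i in range(replication_factor)]


# Hypothetical topic layout, chosen only for this example.
NUM_PARTITIONS = 6
NUM_BROKERS = 3
REPLICATION_FACTOR = 2

for sensor in ("sensor-a", "sensor-b", "sensor-c"):
    p = partition_for(sensor, NUM_PARTITIONS)
    brokers = replica_brokers(p, NUM_BROKERS, REPLICATION_FACTOR)
    print(sensor, "-> partition", p, "stored on brokers", brokers)
```

Records with the same key always land in the same partition, so they can be processed in order, and each partition lives on more than one broker, so losing a single server does not lose the data.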
In addition, data can be captured through a load balancer to provide high availability for the layer.
Within the capture layer, a specific topic (called Vault) with an associated process is used to correlate topics from different domains.
Enrichment Layer
The enrichment layer is responsible for adding information (metadata) to the data of a specific topic. This metadata can be obtained from other topics (data sources) or from system-specific storage (e.g., master tables).
When combining different data sources from a specific data domain, one of them is configured as the master data source. An example is the network traffic domain, which has a master data source (the netflow topic) and several enrichment data sources (dns, radius, snmp, etc.).
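The master/enrichment relationship can be sketched in a few lines of Python. This is only an illustration of the idea, not the platform's implementation: the lookup tables, field names (`src_ip`, `src_hostname`, `user`), and the `enrich` function are all hypothetical.

```python
# Hypothetical lookup tables, as if built from the dns and radius data sources.
DNS_BY_IP = {"10.0.0.5": "db01.example.internal"}
RADIUS_USER_BY_IP = {"10.0.0.5": "alice"}


def enrich(flow: dict) -> dict:
    """Return a copy of a master (netflow) event with available metadata added."""
    enriched = dict(flow)  # never mutate the original event
    src = flow.get("src_ip")
    if src in DNS_BY_IP:
        enriched["src_hostname"] = DNS_BY_IP[src]
    if src in RADIUS_USER_BY_IP:
        enriched["user"] = RADIUS_USER_BY_IP[src]
    return enriched


flow_event = {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.10", "bytes": 1200}
print(enrich(flow_event))
```

The master event drives the output: every netflow record produces exactly one enriched record, and metadata is attached only when an enrichment source knows the address.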
Indexing Layer
The indexing layer is responsible for structuring the data into segments to speed up data retrieval and offer the best performance to the system. This layer makes it possible to query large amounts of data efficiently, including both online and historical information.
The layer can also pre-calculate predefined metrics for the main dimensions of the topics. Doing this calculation on the fly removes work from query processing and speeds up query execution.
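A minimal Python sketch of this kind of pre-calculation is shown below. It assumes events carry a Unix `timestamp`, a `src_ip` dimension, and a `bytes` metric; these field names, the 60-second bucket, and the `rollup` function are assumptions made for the example.

```python
from collections import defaultdict


def rollup(events, bucket_seconds=60):
    """Aggregate events per (time bucket, src_ip): event count and total bytes."""
    totals = defaultdict(lambda: {"events": 0, "bytes": 0})
    for event in events:
        # Truncate the timestamp to the start of its time bucket.
        bucket = event["timestamp"] - event["timestamp"] % bucket_seconds
        row = totals[(bucket, event["src_ip"])]
        row["events"] += 1
        row["bytes"] += event["bytes"]
    return dict(totals)


events = [
    {"timestamp": 120, "src_ip": "10.0.0.5", "bytes": 100},
    {"timestamp": 150, "src_ip": "10.0.0.5", "bytes": 300},
    {"timestamp": 185, "src_ip": "10.0.0.5", "bytes": 50},
]
summary = rollup(events)
for key, row in sorted(summary.items()):
    print(key, row)
```

Three raw events collapse into two stored rows here, so a later query for bytes per minute reads the pre-computed sums instead of scanning every event.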
The indexing layer is built on Druid, a well-known big-data store that provides a high-performance, scalable distributed system capable of running sub-second ad hoc queries to group, filter, and aggregate data.
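To give an idea of what such a query looks like, the sketch below builds a Druid native `groupBy` query as a Python dictionary and prints it as JSON. The data source name `rb_flow`, the `src_ip` dimension, the `bytes` metric, and the helper function are hypothetical; only the query structure follows Druid's native query format. No request is sent; in a real deployment the JSON would typically be posted to a Druid broker's `/druid/v2` endpoint.

```python
import json


def build_group_by_query(data_source, dimension, metric, interval, granularity="hour"):
    """Build a Druid native groupBy query as a plain dictionary."""
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": granularity,
        "dimensions": [dimension],
        "aggregations": [
            # Sum the metric column over each group.
            {"type": "longSum", "name": metric, "fieldName": metric},
        ],
        "intervals": [interval],
    }


query = build_group_by_query(
    data_source="rb_flow",  # hypothetical data source name
    dimension="src_ip",     # hypothetical dimension
    metric="bytes",         # hypothetical metric
    interval="2024-01-01T00:00:00Z/2024-01-02T00:00:00Z",
)
print(json.dumps(query, indent=2))
```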
Historical (Storage) Layer
The historical (storage) layer organizes indexed data into tiers and segments, giving the system scalability and performance optimization features. Tier classification organizes data into different storage groups (tiers) depending, for example, on how often the service layer requires the information. A typical configuration creates hot, medium, and long-term tiers.
Dividing data into tiers also allows resources to be assigned more efficiently. For example, the hot tier is assigned more memory (RAM) and less hard disk space, whereas the long-term tier needs more storage space and less RAM.
Another feature of this layer is the configuration of namespaces. Assigning each namespace to a different customer, for example, gives the platform its multitenancy capability.
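One simple way to picture tier classification is a rule list keyed on data age, as in the Python sketch below. The 7-day and 90-day thresholds, the tier names, and the `tier_for` function are hypothetical; the actual criteria depend on how often the service layer needs the data and on how the deployment is configured.

```python
from datetime import timedelta

# Hypothetical rules, ordered from most recent to oldest data.
TIER_RULES = [
    (timedelta(days=7), "hot"),      # more RAM, less disk
    (timedelta(days=90), "medium"),
]
DEFAULT_TIER = "long_term"           # more disk, less RAM


def tier_for(age: timedelta) -> str:
    """Return the first tier whose maximum age covers the given data age."""
    for max_age, tier in TIER_RULES:
        if age <= max_age:
            return tier
    return DEFAULT_TIER


for days in (1, 30, 400):
    print(days, "days old ->", tier_for(timedelta(days=days)))
```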
Service Layer
Given a data query request launched by the view layer, the Service Broker layer is responsible for deciding which module or service to consult to obtain the requested information. This layer offers a set of query services through an API (the Broker API), which is accessible only by the view layer.
View Layer
The view layer is responsible for launching the data queries requested by the user (through the web application) or by another system (through the REST API). After receiving the response from the Service Broker layer, the view layer presents the results to the user or system in an appropriate format: graphical representations, dashboards, reports, tables, plain formats (JSON for REST API responses), etc.
Transversal Services
This module groups the services that are or can be used by any component of the system. These services include:
- Configuration Service: this service is responsible for keeping all configuration files consistent across the different layers and nodes of the system. Configuration is done at a single point and then replicated to all nodes and layers. This service is implemented with the Chef configuration management tool.
- Cache Service: offers caching services to the different layers, eliminating bottlenecks and providing predictable latency and fast response times as the volume of data grows. Cache services are implemented with different technologies depending on the selected deployment environment.
- Metadata Storage Service: implemented on a PostgreSQL database, this service stores data about the structure of the sensors, configured data sources, companies, locations, users, and other useful information. Some of this data is used to enrich the data received by the capture layer.
- Druid Coordinator: manages which segments and partitions must be loaded first to optimize query execution.
- Distributed Coordination Service: based on ZooKeeper, it provides services for the distributed modules, including synchronization, configuration maintenance, and group and naming services.