How does the NCache Aggregator handle distributed data?

The Aggregator uses a ValueExtractor to identify the meaningful attributes of the cached objects, performs the analytical operations in parallel across all nodes in the cluster, and returns a single compiled result to the client.

Which mathematical operations are supported by NCache BuiltinAggregator?

NCache provides native support for the AVG, SUM, MIN, MAX, COUNT, and DISTINCT operations. All six work on numeric types such as Integer and Decimal; MIN, MAX, COUNT, and DISTINCT also work on String and DateTime.

Aggregator (MapReduce) Components and Working [Deprecated]

The NCache Aggregator is a data processing component built on the MapReduce framework. It performs analytical tasks in parallel on a cache cluster to produce statistical results. Although it is a specialized MapReduce Task, the Aggregator is particularly optimized for mathematical functions such as summing, averaging, and counting values. Because the Aggregator runs its logic on the data nodes, it reduces latency and network traffic compared with client-side processing.

How Does the Aggregator Work?

The Aggregator has the following components:

ValueExtractor

This component extracts the meaningful attributes from the given object, similar to the Mapper in the MapReduce Framework.

Aggregator

The actual grouping and analytical operations take place here, as in the Combiner and Reducer of MapReduce. NCache's built-in Aggregator, the BuiltinAggregator, supports the following operations:

Operation	Description	Supported Data Types
`AVG`	Returns the average of the values in the data set, computed across all the nodes in the cluster.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`
`SUM`	Returns the sum of the values in the data set.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`
`MIN`	Returns the smallest value in the data set.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`, `String`, `DateTime`
`MAX`	Returns the largest value in the data set.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`, `String`, `DateTime`
`COUNT`	Returns the total number of items in the data set.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`, `String`, `DateTime`
`DISTINCT`	Returns the unique values in the data set.	`Integer`, `Double`, `Float`, `BigInteger`, `Long`, `Short`, `Decimal`, `String`, `DateTime`
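To illustrate how an average can be computed "across all the nodes", here is a minimal Java sketch. It is not NCache's implementation: the class `AvgSketch`, its nested `Partial` type, and both method names are hypothetical. It assumes each node reports a running sum and a count, and that the final step merges those pairs and divides once. Averaging the per-node averages would give the wrong answer whenever nodes hold different numbers of items.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from NCache.
final class AvgSketch {

    // Partial result a single node could report: a running sum and an item count.
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // Per-node step: reduce the node's local values to one Partial.
    static Partial aggregateLocal(List<Integer> localValues) {
        double sum = 0;
        for (int v : localValues) {
            sum += v;
        }
        return new Partial(sum, localValues.size());
    }

    // Final step: merge every node's Partial, then divide once.
    // Returning 0.0 for an empty data set is an assumption made for this sketch.
    static double combine(List<Partial> partials) {
        double sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return count == 0 ? 0.0 : sum / count;
    }
}
```

For example, if one node holds 1, 2, 3 and another holds only 10, merging the partials gives 16 / 4 = 4.0, whereas averaging the two per-node averages (2.0 and 10.0) would give 6.0.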

If the Aggregator's MapReduce Task fails because of an exception, an exception describing the Task failure is thrown.
If the Aggregator execution produces a null result, the built-in Aggregator's default value for that data type is returned instead.
Apart from these built-in operations, users can provide their own aggregations, such as Mean, Median, or Mode. These are logical statistical functions, and users can create as many variants of an aggregation as they need.
Users can also supply their own data types, such as custom objects, because the MapReduce framework accepts values of type Object.
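As an illustration of a user-defined aggregation, the following Java sketch computes a Mode over values spread across several nodes. It is only a sketch under stated assumptions: `ModeSketch`, `countLocal`, and `combine` are hypothetical names, not NCache types, and the per-node and merge steps are simulated with plain collections. Values are handled as `Object`, so custom types work as long as they implement `equals` and `hashCode` consistently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a user-defined Mode aggregation; not NCache code.
final class ModeSketch {

    // Per-node step: count how often each value occurs locally.
    static Map<Object, Long> countLocal(List<?> localValues) {
        Map<Object, Long> counts = new HashMap<>();
        for (Object v : localValues) {
            counts.merge(v, 1L, Long::sum);
        }
        return counts;
    }

    // Final step: add the per-node counts together and pick the most frequent value.
    // Returns null when there is no data; ties go to whichever entry is visited first.
    static Object combine(List<Map<Object, Long>> partials) {
        Map<Object, Long> total = new HashMap<>();
        for (Map<Object, Long> partial : partials) {
            for (Map.Entry<Object, Long> e : partial.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        Object mode = null;
        long best = 0;
        for (Map.Entry<Object, Long> e : total.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                mode = e.getKey();
            }
        }
        return mode;
    }
}
```

The design point is that each node returns full frequency counts rather than its own local mode. A value can be the most frequent one overall without being the most frequent on any single node.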

NCache also provides the built-in implementations that the Aggregator needs for the listed types. However, users who want to use the Aggregator with custom types and their own aggregation logic can do so by implementing two interfaces, IValueExtractor and IAggregator.

An implementation of the IValueExtractor interface must contain at least the Extract() method, which the internal framework uses to identify the type of an instance; the method does this by returning the type of the object passed to it. The IAggregator interface declares two methods to implement: Aggregate() and AggregateAll(). The job of the Value Extractor is to return the filtered data to the Aggregator, which then works on that data set to produce more refined data.
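The division of labour between the two roles can be sketched in plain Java. Everything below is hypothetical: `ValueExtractorSketch` and `AggregatorSketch` are stand-ins written for this example and are not the NCache IValueExtractor or IAggregator definitions, whose exact signatures this page does not give. The sketch also assumes that `aggregate` runs on each node's extracted values and that `aggregateAll` merges the per-node results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the two roles; NOT NCache's interface definitions.
interface ValueExtractorSketch {
    Object extract(Object cached);          // returns the attribute of interest, or null to skip
}

interface AggregatorSketch {
    Object aggregate(Object values);        // assumed to run on one node's extracted values
    Object aggregateAll(Object partials);   // assumed to merge the per-node results
}

// Hypothetical custom type stored in the cache.
final class Order {
    final String customer;
    final double total;

    Order(String customer, double total) {
        this.customer = customer;
        this.total = total;
    }
}

// Extracts the order total and filters out anything that is not an Order.
final class OrderTotalExtractor implements ValueExtractorSketch {
    public Object extract(Object cached) {
        if (cached instanceof Order) {
            return ((Order) cached).total;
        }
        return null;
    }
}

// Sums numbers; the same logic serves both the per-node and the merge step.
final class SumAggregator implements AggregatorSketch {
    public Object aggregate(Object values) { return sum(values); }
    public Object aggregateAll(Object partials) { return sum(partials); }

    private static Double sum(Object values) {
        double s = 0;
        for (Object v : (List<?>) values) {
            if (v instanceof Number) {
                s += ((Number) v).doubleValue();
            }
        }
        return s;
    }
}

// Simulates a cluster as a list of per-node item lists.
final class ClusterSimulation {
    static Object run(List<List<Object>> nodes, ValueExtractorSketch ex, AggregatorSketch agg) {
        List<Object> partials = new ArrayList<>();
        for (List<Object> node : nodes) {
            List<Object> extracted = new ArrayList<>();
            for (Object item : node) {
                Object v = ex.extract(item);
                if (v != null) {
                    extracted.add(v);
                }
            }
            partials.add(agg.aggregate(extracted));
        }
        return agg.aggregateAll(partials);
    }
}
```

Calling `ClusterSimulation.run` with two simulated nodes, an `OrderTotalExtractor`, and a `SumAggregator` returns the combined order total as a `Double`; items that are not `Order` instances are skipped by the extractor.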
