Group By Aggregation Processor

Modified on Tue, 30 Nov 2021 at 01:16 PM

Overview

The Group By Aggregation Processor groups the dataset by one or more selected columns and aggregates the remaining columns using a single, specified aggregation function.


Input

The Processor can operate on any kind of input dataset.

Warning: Some aggregation functions, such as AVG or SUM, require numeric input to produce a meaningful result. These functions still execute on string values, but the result is not useful.
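
The warning above can be illustrated outside the platform with a small pre-check. This is only a sketch in plain Python, not part of the Processor: the helper name `is_numeric_column` and the sample values are hypothetical and chosen purely for illustration.

```python
def is_numeric_column(values):
    """Return True if every value is an int or float (booleans are excluded)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in values
    )

# Hypothetical sample columns, not taken from the article's dataset.
prices = [120.0, 95.5, 80]
cities = ["Berlin", "Hamburg", "Berlin"]

# AVG or SUM only gives a meaningful result for the numeric column.
for name, column in (("prices", prices), ("cities", cities)):
    if is_numeric_column(column):
        print(name, "average:", sum(column) / len(column))
    else:
        print(name, "is not numeric; AVG or SUM would not be meaningful")
```

A check like this is one way to decide which columns to keep before the aggregation step.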


Configuration


Output

The dataset is grouped by the selected column(s). The entries in the remaining columns are pooled per group by applying the aggregation function. Note that aggregating all remaining columns does not always make sense. In that case, select the relevant columns beforehand using the Column Selection Processor, as shown in the example below.


Note that if no group by column is selected, the aggregation is performed on all columns and all entries, so the output contains a single row.
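
The output behaviour described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Processor's actual implementation: the function `group_by_aggregate`, the column names `group`, `x`, `y`, and the sample rows are all hypothetical.

```python
from collections import defaultdict

def group_by_aggregate(rows, group_cols, agg):
    """Group rows (dicts) by group_cols and apply agg to every remaining column."""
    groups = defaultdict(list)
    for row in rows:
        # With no group by column the key is always (), so all rows share one group.
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_cols, key))
        for col in members[0]:
            if col not in group_cols:
                out[col] = agg([m[col] for m in members])
        result.append(out)
    return result

# Hypothetical sample data.
rows = [
    {"group": "a", "x": 1, "y": 10},
    {"group": "a", "x": 3, "y": 20},
    {"group": "b", "x": 5, "y": 30},
]

# One output row per distinct value of the group by column.
per_group = group_by_aggregate(rows, ["group"], sum)

# No group by column: keep only the numeric columns first, then aggregate everything.
numeric_only = [{"x": r["x"], "y": r["y"]} for r in rows]
overall = group_by_aggregate(numeric_only, [], sum)

print(per_group)
print(overall)
```

In the second call the string column is dropped before aggregating, because SUM over strings would not be meaningful; the result is a single row covering all entries.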


Example

In the following example we use an accommodation input dataset. The goal is to output the average accommodation price in each city. To do so, we first use the Column Selection Processor to select the location and price of each entry, and then group by location.
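
As a rough equivalent of this two-step workflow in plain Python, the sketch below first keeps only the location and price columns and then averages the price per location. The listings and the extra columns `id` and `room_type` are made-up illustrative values, not the article's actual dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical accommodation entries, for illustration only.
listings = [
    {"id": 1, "location": "Berlin", "room_type": "Private room", "price": 60},
    {"id": 2, "location": "Berlin", "room_type": "Entire home", "price": 100},
    {"id": 3, "location": "Munich", "room_type": "Entire home", "price": 120},
]

# Step 1 (column selection): keep only the columns relevant to the question.
selected = [{"location": r["location"], "price": r["price"]} for r in listings]

# Step 2 (group by aggregation): group by location and apply AVG to price.
prices_by_city = defaultdict(list)
for row in selected:
    prices_by_city[row["location"]].append(row["price"])

avg_price = {city: mean(prices) for city, prices in prices_by_city.items()}
print(avg_price)
```

Selecting the columns first avoids averaging fields such as an identifier or a text column, for which the average carries no meaning.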


Example Input

Workflow


Example Configuration


Result


Please note that only one aggregation function is allowed.


Related Articles

Column Selection Processor
