Lexical Columnization Processor

Modified on Tue, 30 Nov 2021 at 04:58 PM


Generates one column for each distinct value of a selected column and, for every row, writes a value from another column into the matching new column.


The input dataset should contain at least one column of type string. This column holds the values to columnize and is selected in the processor configuration.


In the processor configuration, the "case sensitive" option determines whether the processor distinguishes between upper-case and lower-case letters.
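The effect of the case-sensitivity setting on which values count as distinct can be sketched in a few lines of Python. This is only an illustration: the function name `distinct_values` and the sample values are hypothetical, and the assumption that case-insensitive matching keeps the first spelling encountered is ours, not documented processor behavior.

```python
def distinct_values(values, case_sensitive=True):
    """Return the distinct values in first-seen order.

    Assumption: when case_sensitive is False, values that differ only in
    letter case are merged and the first spelling seen is kept.
    """
    seen = {}
    for value in values:
        key = value if case_sensitive else value.casefold()
        seen.setdefault(key, value)
    return list(seen.values())


# Hypothetical sample values, not taken from the article's example input.
values = ["Passau", "passau", "Weimar"]
sensitive = distinct_values(values, case_sensitive=True)
insensitive = distinct_values(values, case_sensitive=False)
print(sensitive)    # ['Passau', 'passau', 'Weimar']
print(insensitive)  # ['Passau', 'Weimar']
```

With case sensitivity enabled, "Passau" and "passau" would each produce their own column; with it disabled, they would share one.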


The processor has two output ports:

  • Lexical Columnization Output: Contains the input dataset along with the new columns, one for each distinct value of the selected attribute.
  • Distinct Summary Output: Contains the number of occurrences of each value in the selected attribute along with the corresponding fraction.

A bar chart showing the percentage of each value in the selected attribute is also available in the processor under the "Result" tab.


Example Input


Example Configuration


  • Lexical Columnization Output

The column "Col_Passau" contains the type if the location in the corresponding entry is Passau; otherwise it contains an empty string. The same applies to the column "Col_Weimar".
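The behavior described above can be approximated with a short Python sketch. It is not the processor's implementation: the function `lexical_columnize`, its parameters, and the sample rows are hypothetical. Only the "Col_" prefix, the "location" and "type" columns, and the empty-string fallback are taken from the example.

```python
def lexical_columnize(rows, key_col, value_col, case_sensitive=True, prefix="Col_"):
    """Return copies of the rows with one extra column per distinct key value.

    For each row, the new column matching the row's key value receives the
    value of value_col; all other new columns receive an empty string.
    """
    def norm(value):
        return value if case_sensitive else value.casefold()

    # Distinct key values in first-seen order; the first spelling names the column.
    labels = {}
    for row in rows:
        labels.setdefault(norm(row[key_col]), row[key_col])

    result = []
    for row in rows:
        new_row = dict(row)  # keep the original columns
        for key, label in labels.items():
            matches = norm(row[key_col]) == key
            new_row[prefix + label] = row[value_col] if matches else ""
        result.append(new_row)
    return result


# Hypothetical input rows; the "type" values are made up for illustration.
rows = [
    {"location": "Passau", "type": "office"},
    {"location": "Weimar", "type": "warehouse"},
    {"location": "Passau", "type": "shop"},
]
out = lexical_columnize(rows, key_col="location", value_col="type")
for row in out:
    print(row)
```

Each output row keeps its original columns and gains "Col_Passau" and "Col_Weimar", exactly one of which is non-empty.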


  • Distinct Summary Output

This table shows the count of each distinct value in the corresponding column together with its percentage.

This result is provided by the processor and can be viewed under the "Result" tab.
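A count-and-fraction summary of this kind can be sketched with Python's standard library. The function name `distinct_summary`, the output field names, and the sample values are assumptions made for illustration; the actual column names of the Distinct Summary Output may differ.

```python
from collections import Counter


def distinct_summary(values, case_sensitive=True):
    """Return one record per distinct value with its count and fraction."""
    keys = [v if case_sensitive else v.casefold() for v in values]
    counts = Counter(keys)
    total = len(keys)
    return [
        {"value": key, "count": count, "fraction": count / total}
        for key, count in counts.items()
    ]


# Hypothetical sample values, not the article's example data.
summary = distinct_summary(["Passau", "Weimar", "Passau", "Passau"])
for record in summary:
    # Multiply the fraction by 100 to get the percentage shown in the bar chart.
    print(record["value"], record["count"], f"{record['fraction'] * 100:.0f}%")
```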
