Overview
The Data Type Recommendation Processor generates a JSON result containing type recommendations for each column of a selected dataset.
Input
The processor does not accept direct input. The dataset whose types should be determined is selected in the configuration interface under "Data Set". This field is mandatory.
Configuration
Allow Null Value: Turned off by default, in which case columns containing null values do not get any recommendations. If activated, null values do not influence the result.
Top-k Sampling: Activated by default, in which case the first k rows of the dataset are used. When disabled, k random rows are chosen instead (Random Sampling). For larger datasets, Top-k Sampling is faster because Random Sampling has to load the whole dataset.
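The difference between the two sampling modes can be illustrated with a minimal Python sketch. It is not the processor's implementation; the function names top_k_sample and random_sample and the seed parameter are hypothetical and only show why reading the first k rows is cheaper than drawing k random rows.

```python
import random
from itertools import islice


def top_k_sample(rows, k):
    """Return the first k rows; reading stops as soon as k rows are collected."""
    return list(islice(rows, k))


def random_sample(rows, k, seed=None):
    """Return k rows chosen at random; every row has to be loaded first."""
    all_rows = list(rows)  # the whole dataset is materialized here
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    return rng.sample(all_rows, min(k, len(all_rows)))


if __name__ == "__main__":
    rows = [("Berlin", "2021-01-01"), ("Paris", "2021-02-15"), ("Rome", "2021-03-30")]
    print(top_k_sample(iter(rows), 2))
    print(random_sample(iter(rows), 2, seed=1))
```

With a lazily read source, top_k_sample touches only k rows, while random_sample consumes the full source before it can choose.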
Additional Timestamp Masks: Each mask has to comply with the pattern syntax of the Java class SimpleDateFormat, for example yyyy-MM-dd HH:mm:ss.
Output
The processor does not produce any direct output. Type recommendations are shown in JSON format under "Result" and "JSON Result" in the processor.
Example
Example Input
The following data is used to create a dataset with the Data Table Save processor in a separate workflow.
Example Configuration
The created dataset is then selected in the configuration of the Data Type Recommendation processor. The other configuration fields are set to the default configuration, which is the one shown above.
Result
After running the workflow, only the representation type String is recommended for column "Cities", while for column "Date" both String and DateTime are possible. Suitable scale types are recommended for each representation type.
The processor does not take the initial column types from the dataset into account, but tries to infer the types by going through column samples (defined by "Sample Size"). As a result, numerical columns can be inferred to also be of type String.
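The inference behavior described above can be sketched in Python as follows. This is an assumption-laden illustration, not the processor's actual logic: the function recommend_types, the type names "Integer" and "Numeric", and the masks in DATE_MASKS are hypothetical, and the masks use Python's strptime syntax rather than SimpleDateFormat patterns.

```python
from datetime import datetime

# Hypothetical masks in Python strptime syntax (the processor itself expects
# SimpleDateFormat patterns).
DATE_MASKS = ("%Y-%m-%d", "%d.%m.%Y")


def _parses(value, parser):
    """Return True if parser(value) succeeds."""
    try:
        parser(value)
        return True
    except (TypeError, ValueError):
        return False


def _is_datetime(text):
    """Return True if the text matches at least one mask."""
    return any(
        _parses(text, lambda v, m=mask: datetime.strptime(v, m))
        for mask in DATE_MASKS
    )


def recommend_types(sample, allow_null=False):
    """Return the representation types that fit every sampled value."""
    if any(v is None for v in sample):
        if not allow_null:
            return []  # nulls present and not allowed: no recommendation
        sample = [v for v in sample if v is not None]  # nulls are ignored
    if not sample:
        return []
    texts = [str(v) for v in sample]  # declared column types are not used
    candidates = ["String"]  # every value can be represented as text
    if all(_parses(t, int) for t in texts):
        candidates.append("Integer")
    if all(_parses(t, float) for t in texts):
        candidates.append("Numeric")
    if all(_is_datetime(t) for t in texts):
        candidates.append("DateTime")
    return candidates


if __name__ == "__main__":
    print(recommend_types(["Berlin", "Paris"]))
    print(recommend_types(["2021-01-01", "2021-02-15"]))
    print(recommend_types([1, 2, 3]))
```

Because only the sampled values are inspected, a numerical column in this sketch is also reported as String, which mirrors the behavior described for the processor.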