

Navigation
Navigation menu within ONE DATA to navigate to Resources, ONE DATA configurations and the Application Health.
Action Bar
Vertical bar within ONE DATA to manage Resources or to configure/execute Workflows and Reports.
Keywords
Keywords which can be added to Resources.
Resources
Data Tables, Workflows, Reports, Keyrings and Models.

Application Health

Overview of the current status of a ONE DATA instance: Versions, Start Time, Remaining/Used Space, Spark Status, Environment Information.
Apache Spark
The computation logic of ONE DATA is mostly based on this Apache analytics engine.
Apache Hadoop
We use the HDFS (Hadoop Distributed File System) to store the data when ONE DATA runs on a cluster.
Currently it is only used in combination with the Python Integration.

User/Rights Management



Domain
A Domain is a closed area that can host multiple Projects. It is the highest level within ONE DATA's rights management.


Projects
Data Science tasks are structured in Projects that contain multiple ONE DATA Resources. Access to Projects is restricted via the User and Rights Management.

Roles
The different access rights are provided exclusively via Groups; it is not possible to grant special access rights to a single user without a Group.


  • Admin: Full Access, including user administration inside the Domain
  • Normal User: Read/Write/Execute Access
  • Viewer:
    • Report Viewer: Viewing Reports (without execution)
    • Extended Report Viewer: Viewing Reports (without execution) and viewing Jobs
    • Report Runner: Viewing and executing Reports
    • Extended Report Runner: Viewing/executing Reports and viewing Jobs


Groups
Summarizes users with the same access rights. Each Group is assigned a Role inside a Project. A user can be a member of multiple Groups.
Subgroups
Can be used to visually structure Groups, but Subgroups always have the same access rights as their parent Group.
Access Relation
Role of a Group inside a Project.
Sharing of Resources
All kinds of Resources can be shared between different Projects.
Analysis Authorization

It is possible to define Groups/users with special access rights to the data.

Viewer License
A user with a viewer license can only access Reports.
Viewer Group
A viewer group summarizes viewer license users with the same roles.


Data Tables

A table that is permanently stored inside ONE DATA. These tables can be used and changed within Workflows. It is also possible to upload a Data Table using a .csv file.


Workflows
Workflows contain the algorithmic computation of data. They consist of a combination of different Processors using a certain execution logic to determine a specific result. Workflows can be executed in a Spark, R or Python context.


Reports
The Report of a Workflow is a configurable user interface to provide configurations to the Workflow and to visualize the Workflow results. It is also possible to run the Workflow from the Report view.


Models
A Machine Learning Model which was previously trained and afterwards stored within ONE DATA.


Keyrings
Set of authentication keys in ONE DATA.

Data Tables/Data Storage

ETL Source
A special kind of read-only Data Table. It refers to an actual database table and always contains the same entries as the table in the database.
Parquet
Apache Parquet is a column-oriented data store. When writing to a Data Table, it is possible to choose between Parquet and CSV.
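A minimal sketch of the practical difference between the two formats, using pandas outside of ONE DATA (the file names and sample table are illustrative; writing Parquet with pandas requires the pyarrow or fastparquet package):

```python
import pandas as pd

# Illustrative sample table; ONE DATA's internal storage API is not public.
df = pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]})

df.to_csv("table.csv", index=False)  # row-oriented text format
df.to_parquet("table.parquet")       # column-oriented binary format

# Parquet stores column types with the data; CSV requires re-parsing them on read.
print(pd.read_parquet("table.parquet").dtypes)
```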
Representation Type
The data type of the table columns. It can be string, int, double, numeric or datetime.
Scale Type
It specifies the scaling of the table columns. It can be nominal, interval, ordinal or ratio.



Workflow Variable

Variable defined within a Workflow in order to specify ranges of values which can be applied in the configurations of Processors.
Variable Manager

Overview of all Workflow Variables within a single Workflow. Variables can also be added and removed here.
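Conceptually, a Workflow Variable is substituted into Processor configurations before execution. A hypothetical sketch of that substitution idea in plain Python (ONE DATA's actual variable syntax is not shown here):

```python
from string import Template

# Hypothetical illustration: a Workflow Variable 'threshold' is substituted
# into a Processor configuration, here a query template.
query_template = Template("SELECT * FROM input WHERE value > ${threshold}")
workflow_variables = {"threshold": 10}

print(query_template.substitute(workflow_variables))
# SELECT * FROM input WHERE value > 10
```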


Processors
A Processor is a logical unit and represents a functionality that is applied to the data passed into it.

A list of all Processors and their documentation can be found here: Processor Documentation


Possibility to structure a Workflow into different logical parts.

It is possible to import/export certain nodes or entire Workflows.


Schedules
Project-specific resource for scheduling automatic Workflow executions. Inherits its data structure from the generic resource.

Holds exactly one Schedule Scheme.

Schedule Scheme

Collection of timing specifications that form the future execution plan for a Schedule.
Can contain one to many elements.
The scheme elements - namely Schedule Rules - trigger Schedule executions additively, i.e. when two elements define a future execution at overlapping times, both executions will be triggered.

Schedule Rule

One single planned execution rule saved for exactly one Schedule. Can be defined on a CRON basis. These elements define the points in time at which the Schedule triggers a Workflow execution.
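CRON expressions are a standard way to encode recurring points in time. A small sketch of how such a rule resolves to concrete execution times, using the third-party croniter package (the rule and start time are made up; ONE DATA's scheduler evaluates its rules internally):

```python
from datetime import datetime
from croniter import croniter  # third-party package: pip install croniter

# Hypothetical Schedule Rule: run every Monday at 03:00.
rule = "0 3 * * 1"

it = croniter(rule, datetime(2022, 1, 4, 12, 0))
for _ in range(3):
    print(it.get_next(datetime))  # next three planned execution times
```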


Jobs
One execution of a Workflow. Via Show Jobs it is possible to list all recent Jobs, see execution times and the job status (successful/not successful), and switch between Workflow versions (the main versioning mechanism).
R Integration
It is possible to run R scripts within Workflows.

Python Integration

It is possible to run Python scripts within Workflows. This uses the Python context.
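ONE DATA's exact script interface is not reproduced here; conceptually, a script in the Python context transforms an input table into an output table. A hypothetical sketch with pandas (function name and columns are made up):

```python
import pandas as pd

# Hypothetical sketch: a Python-context script transforms an input table
# into an output table; the real ONE DATA interface is not shown here.
def transform(input_table: pd.DataFrame) -> pd.DataFrame:
    output = input_table.copy()
    output["value_squared"] = output["value"] ** 2
    return output

print(transform(pd.DataFrame({"value": [1, 2, 3]})))
```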
SQL Queries
There are several Processors which allow the user to enter (Spark) SQL queries and apply them directly to the provided tables, e.g. the Query Processor, Double Query Processor, Flexible ETL Load Processor, ...
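As an illustration of what such a query does, a minimal PySpark sketch that registers a table and applies a Spark SQL query to it (table and column names are illustrative; inside ONE DATA the wiring is handled by the Processor):

```python
from pyspark.sql import SparkSession

# Minimal sketch of the query idea: register the input table
# and run a Spark SQL query against it. Names are illustrative.
spark = SparkSession.builder.appName("query-sketch").getOrCreate()

input_df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "label"])
input_df.createOrReplaceTempView("inputTable")

result = spark.sql("SELECT label, COUNT(*) AS cnt FROM inputTable GROUP BY label")
result.show()
```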


Microservices
External applications can communicate with ONE DATA via Microservices. There are Microservices Input and Microservices Output Processors. These Processors have to be actively enabled in order to be usable.
API Load

It is possible to load data (within Workflows) from REST APIs that can be targeted with GET requests.
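A hedged sketch of loading data from such a GET endpoint with plain Python (the URL is a placeholder; authentication, paging and further error handling are omitted, and the payload is assumed to be a JSON list of records):

```python
import requests
import pandas as pd

# Placeholder endpoint; assumes the response body is a JSON list of records.
response = requests.get("https://api.example.com/measurements", timeout=30)
response.raise_for_status()

# Turn the JSON payload into a tabular structure, as a Data Table would hold it.
table = pd.DataFrame(response.json())
print(table.head())
```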

GeoJSON
A Processor which expects GeoJSON data as input. After the Workflow execution, the Processor result can be visualized within a Report by using a Visualization Container and selecting a Geo Chart.
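GeoJSON is a standard JSON format for geographic features. A minimal example of the kind of input such a Processor expects (coordinates and properties are made up; note that GeoJSON uses [longitude, latitude] order):

```python
import json

# Minimal GeoJSON Feature; values are illustrative.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [13.405, 52.52]},
    "properties": {"name": "Berlin", "value": 42},
}

print(json.dumps(feature, indent=2))
```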
Production Line

Planned Feature: To link Workflows and execute them one after another.



Processor Input

Table which is provided to the Processor as input via an edge (above the Processor). It is possible to have multiple inputs.
Processor Output
Table which is provided by the Processor as output via an edge (below the Processor). It is possible to have multiple outputs.

Config Element

It is possible to open a Processor (double click) to configure it. All possible configuration elements are shown in this view.

Composed Config Element

A Composed Config Element is a special kind of Config Element inside a Processor. It can be defined once for each column of the Processor Input and is used to specify the selected column using multiple configurations.




Containers
Reports can be created by arranging different kinds of Containers. There are currently three Container types:

  1. Visualization Container: To visualize all kinds of results, e.g. displaying the result table or visualizing a result table within a Heat Map.
  2. Model Container: To insert a ONE DATA Model into the Report.
  3. Application Element Container: This category offers multiple container types, mainly for configuration purposes, e.g. configuration, variable, filter, etc.
Report Grid
It is possible to define the dimensions of a Report via a grid (the number of rows and columns can be set). When Containers are positioned or resized inside a Report, they automatically snap to the grid.
