Python SDK - Manipulate Data Tables

Modified on Thu, 25 Nov 2021 at 09:33 AM

Provided Methods

Method documentation

All examples assume that a OneDataApi object has already been initialized and assigned to a variable called "onedata_api".

### Creation of the ONE DATA Api ###
onedata_api = OneDataApi(base_url=base_url, username=user, password=pw, verify=False, timeout=999)

create

Description

Creates a data table. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: UploadedDataResponse

Parameters

Property | Type | Required | Default | Description
name | str | true | - | name of the data table
headers | [str] | true | - | list of strings, defining the headers, e.g. ["a", "b"]
content | [dict] | true | - | list of dictionaries, defining the content, e.g. [{"a": 1, "b": "asd"}] where a and b are headers. Each dictionary must have all attributes (headers)
project_id | Union[str, uuid.UUID] | true | - | UUID of the project
locked | DatasetLock | true | - | defines the lock state of the data table
storage_format | DataTableStorageType | true | - | defines the storage type of the data table
create_fallback | bool | false | True | defines whether a fallback data table should be created (contains all malformed entries)
connection_schema | str | false | None | database schema which holds the relevant table
target_project_id | Union[str, uuid.UUID] | false | None | ID of the project where the new data table should be created

Usage

### Creation of a datatable ###
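response: UploadedDataResponse = onedata_api.datatables.create(name="dt via sdk", headers=["a", "b"],
                                                               content=[{"a": 1, "b": "asd"}], project_id=project_id,
                                                               locked=DatasetLock.Open, storage_format=DataTableStorageType.PARQUET)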
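
The parameter table states that each content dictionary must have all headers. A minimal sketch of a local pre-check is shown here; the helper name find_incomplete_rows is hypothetical and not part of the SDK, and it only inspects plain Python lists before anything is sent to the server.

```python
from typing import List, Tuple


def find_incomplete_rows(headers: List[str], content: List[dict]) -> List[Tuple[int, List[str]]]:
    """Return (row index, missing headers) for every row that lacks a header."""
    problems = []
    for index, row in enumerate(content):
        missing = [header for header in headers if header not in row]
        if missing:
            problems.append((index, missing))
    return problems


headers = ["a", "b"]
content = [{"a": 1, "b": "asd"}, {"a": 2}]

# The second row has no "b" entry, so it is reported with its index.
print(find_incomplete_rows(headers, content))  # [(1, ['b'])]
```

An empty result means every row covers all headers and the lists can be passed on to create.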

getContent

Description

Retrieves the paginated content of a DataTable identified either by its data id or by its frt id (never both), or by name and project id. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

To retrieve the content as a pandas.DataFrame, specify the custom deserializer 'ResponseToPandasDataFrameDeserializer'. All officially documented functionality of a pandas.DataFrame can then be used (https://pandas.pydata.org/pandas-docs/stable/reference/frame.html).

Returns: DataTableContent

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
page | int | false | 1 | the page that should be fetched
pagination_size | int | false | 10000 | the number of entries per page which should be fetched
data_transformers | [dict] | false | None | list of transformation dicts applied to the retrieved data table
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
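
The identification rules above (data_id and frt_id are mutually exclusive, otherwise name plus project_id) can be checked locally before a request is made. The following is only a sketch: build_selector is a hypothetical helper, not an SDK function, and it assumes that a lookup by name always comes with a project_id, which the SDK itself may not enforce.

```python
from typing import Any, Dict, Optional


def build_selector(data_id: Optional[str] = None,
                   frt_id: Optional[str] = None,
                   name: Optional[str] = None,
                   project_id: Optional[str] = None) -> Dict[str, Any]:
    """Return the keyword arguments that identify exactly one data table."""
    if data_id is not None and frt_id is not None:
        raise ValueError("data_id and frt_id cannot be used together")
    if data_id is not None:
        return {"data_id": data_id}  # project_id is ignored for lookups by ID
    if frt_id is not None:
        return {"frt_id": frt_id}
    if name is not None and project_id is not None:
        return {"name": name, "project_id": project_id}
    raise ValueError("pass data_id, frt_id, or name together with project_id")


print(build_selector(name="dt via sdk", project_id="my-project"))

# Possible use with the SDK (not executed here):
# onedata_api.datatables.get_content(**build_selector(data_id=dt.id))
```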

Usage

### Retrieval of datatable content ###
datatable: DataTableContent = onedata_api.datatables.get_content(name=datatable_name,
                                                                 project_id=project_id)
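
Because the content is paginated through page and pagination_size, reading a whole table means requesting pages in a loop. The sketch below shows one way to do that. The helper iter_records and the stand-in fake_fetch are hypothetical, and the stop condition (a page shorter than pagination_size is the last one) is an assumption about the server's behavior rather than documented SDK behavior.

```python
from typing import Callable, Iterator, List


def iter_records(fetch_page: Callable[[int, int], List[dict]],
                 pagination_size: int = 10000) -> Iterator[dict]:
    """Yield records page by page until a page comes back short or empty."""
    page = 1  # the documented default for the first page
    while True:
        records = fetch_page(page, pagination_size)
        yield from records
        if len(records) < pagination_size:
            break
        page += 1


# Stand-in for the server so the loop can be tried without a connection.
rows = [{"a": i} for i in range(5)]


def fake_fetch(page: int, size: int) -> List[dict]:
    start = (page - 1) * size
    return rows[start:start + size]


print(len(list(iter_records(fake_fetch, pagination_size=2))))  # 5

# Possible use with the SDK (not executed here):
# fetch = lambda page, size: onedata_api.datatables.get_content(
#     data_id=dt.id, page=page, pagination_size=size).records
# all_records = list(iter_records(fetch))
```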

getContentSamples

Description

Retrieves samples from the content of a DataTable with the given data id/frt id or by name and project id. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: DataTableContent

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
sample_count | int | false | 20 | the number of samples which should be retrieved. The server does not guarantee that the exact number is returned, but it tries to stay as close as possible
data_transformers | [dict] | false | None | list of transformation dicts applied to the retrieved data table
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match

Usage

### Retrieval of datatable content samples ###
datatable: DataTableContent = onedata_api.datatables.get_content_samples(data_id=datatable_id, sample_count=100)

replaceContent

Description

Replaces the entire content of the data table with the given data id/frt id or by name and project id with the new_data.

The schema must match the existing one!

Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: None

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
new_data | [dict] | false | None | list of dictionaries, defining the new data, e.g. [{"a": 1, "b": "asd"}] where a and b are headers

Usage

### Replacement of datatable content###
onedata_api.datatables.replace_content(data_id=id, new_data=[{"a": 2, "b": "newer text"}])
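
Since the new data must match the existing schema, it can help to compare column names locally before calling replace_content. This is a minimal sketch with hypothetical helper names. It assumes that "schema" can be approximated by the set of column names (types are not checked) and that __ROW_ID__, as seen in the printed records of this article, is generated by the server and not part of the user-defined columns.

```python
from typing import List, Set, Tuple

ROW_ID_KEY = "__ROW_ID__"  # server-generated column seen in the printed records


def columns_of(record: dict) -> Set[str]:
    """Return the column names of a record without the row id column."""
    return set(record) - {ROW_ID_KEY}


def schema_mismatches(existing_records: List[dict],
                      new_data: List[dict]) -> List[Tuple[int, Set[str], Set[str]]]:
    """Return (index, missing columns, unexpected columns) per deviating new row."""
    if not existing_records:
        return []  # nothing to compare against
    expected = columns_of(existing_records[0])
    mismatches = []
    for index, row in enumerate(new_data):
        actual = columns_of(row)
        if actual != expected:
            mismatches.append((index, expected - actual, actual - expected))
    return mismatches


existing = [{"a": 1, "b": "asd", "__ROW_ID__": "row-1"}]
new_data = [{"a": 2, "b": "newer text"}, {"a": 3, "c": 1}]
print(schema_mismatches(existing, new_data))  # [(1, {'b'}, {'c'})]
```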

updateContent

Description

Updates the rows given in updated in the data table with the given data id/frt id or by name and project id. In the usage example, each updated row keeps the __ROW_ID__ it was retrieved with. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: None

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
updated | [dict] | false | None | list of dictionaries, defining the rows to update, e.g. [{"a": 1, "b": "asd"}] where a and b are headers

Usage

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)
print(f"content: {content.records}")
###
# content: [{'a': 1, 'b': 'asd', '__ROW_ID__': '1f088a87-67e9-4244-a7ca-d5159d2f1c8d'}, {'a': 1, 'b': 'asd', '__ROW_ID__': '936e46fd-59c0-4d93-8575-daad2310f398'}]
###

updated_record = content.records[1]
updated_record["b"] = "changed"

### Update of datatable content###
onedata_api.datatables.update_content(data_id=dt.id, updated=[updated_record])

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)

print(f"content: {content.records}")
###
# content: [{'a': 1, 'b': 'changed', '__ROW_ID__': '936e46fd-59c0-4d93-8575-daad2310f398'}, {'a': 1, 'b': 'asd', '__ROW_ID__': '1f088a87-67e9-4244-a7ca-d5159d2f1c8d'}]
###
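
The usage above changes one retrieved record and sends it back with its __ROW_ID__ intact. When several rows need the same change, the list for updated can be built in one step. The helper build_updates below is hypothetical and works on plain record dictionaries only; it copies the records instead of mutating them.

```python
from typing import Callable, List


def build_updates(records: List[dict],
                  matches: Callable[[dict], bool],
                  changes: dict) -> List[dict]:
    """Copy every matching record, apply the changes, and keep its __ROW_ID__."""
    if "__ROW_ID__" in changes:
        raise ValueError("changes must not overwrite __ROW_ID__")
    return [{**record, **changes} for record in records if matches(record)]


records = [
    {"a": 1, "b": "asd", "__ROW_ID__": "row-1"},
    {"a": 2, "b": "asd", "__ROW_ID__": "row-2"},
]
updated = build_updates(records, lambda r: r["a"] == 2, {"b": "changed"})
print(updated)  # [{'a': 2, 'b': 'changed', '__ROW_ID__': 'row-2'}]

# Possible use with the SDK (not executed here):
# onedata_api.datatables.update_content(data_id=dt.id, updated=updated)
```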

deleteContent

Description

Deletes the rows with ids given as deleted from the data table with the given data id/frt id or by name and project id. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: None

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
deleted | [Union[str, uuid.UUID]] | false | None | list of row ids, defining the rows that should be deleted

Usage

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)
print(f"content: {content.records}")
###
# content: [{'a': 1, 'b': 'asd', '__ROW_ID__': 'bf60f922-129d-4de9-95f1-a90b83ed34c1'}, {'a': 1, 'b': 'asd', '__ROW_ID__': 'fd6f7053-5438-463f-9ad5-a3204321678c'}]
###

### Deletion of datatable content###
onedata_api.datatables.delete_content(data_id=dt.id, deleted=[content.records[0]["__ROW_ID__"]])

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)

print(f"content: {content.records}")
###
# content: [{'a': 1, 'b': 'asd', '__ROW_ID__': 'fd6f7053-5438-463f-9ad5-a3204321678c'}]
###
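
The deleted parameter takes row ids rather than whole records. A small sketch of collecting those ids from retrieved records by a condition follows; row_ids_where is a hypothetical helper, and the record layout is taken from the printed output above.

```python
from typing import Callable, List


def row_ids_where(records: List[dict], matches: Callable[[dict], bool]) -> List[str]:
    """Return the __ROW_ID__ of every record for which matches(record) is true."""
    return [record["__ROW_ID__"] for record in records if matches(record)]


records = [
    {"a": 1, "b": "asd", "__ROW_ID__": "row-1"},
    {"a": 5, "b": "asd", "__ROW_ID__": "row-2"},
    {"a": 9, "b": "asd", "__ROW_ID__": "row-3"},
]
to_delete = row_ids_where(records, lambda r: r["a"] > 3)
print(to_delete)  # ['row-2', 'row-3']

# Possible use with the SDK (not executed here):
# onedata_api.datatables.delete_content(data_id=dt.id, deleted=to_delete)
```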

addContent

Description

Adds the rows given in added to the data table with the given data id/frt id or by name and project id. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: None

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
added | [dict] | false | None | list of dictionaries, defining the rows that should be added

Usage

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)
print(f"datatable: {content.records}")
###
# datatable: [{'a': 1, 'b': 'some text', '__ROW_ID__': 'b513b7aa-5e99-43e3-a937-df2c854bf0eb'}]
###

add_row_content = {"a": 100, "b": "added text"}

### Addition of datatable content ###
onedata_api.datatables.add_content(data_id=dt.id, added=[add_row_content])

content: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)

print(f"content: {content.records}")
###
# content: [{'a': 100, 'b': 'added text', '__ROW_ID__': '791a985e-d6bc-40e7-8681-0fac3229442b'}, {'a': 1, 'b': 'some text', '__ROW_ID__': 'b513b7aa-5e99-43e3-a937-df2c854bf0eb'}]
###

patchContent

Description

Updates the rows given as updated, deletes the rows with ids given in deleted and adds new rows given in added for a data table with the given data_id. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: None

Parameters

Property | Type | Required | Default | Description
data_id | Union[str, uuid.UUID] | false | None | ID of the data table. Cannot be used with frt_id
frt_id | Union[str, uuid.UUID] | false | None | FRT ID of a data table. Cannot be used with data_id
name | str | false | None | name of the data table
project_id | Union[str, uuid.UUID] | false | None | ID of the project the data table belongs to. Ignored when the table is fetched by ID
force_exact_match | bool | false | True | True: search for an exact name match. False: match data tables whose name includes the given name
force_distinct | bool | false | True | True: raise an exception if the name is ambiguous. False: execute on the first match
deleted | [Union[str, uuid.UUID]] | false | None | list of row ids, defining the rows that should be deleted
updated | [dict] | false | None | list of dictionaries, defining the rows that should be updated
added | [dict] | false | None | list of dictionaries, defining the rows that should be added

Usage

datatable: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)

# Keep the first three rows so their __ROW_ID__ values can be reused in the patch below
row1, row2, row3 = datatable.records[:3]

print(f"row1: {row1}")
print(f"row2: {row2}")
print(f"row3: {row3}")

###
# row1: {'a': 1, 'b': 'some text', '__ROW_ID__': '39eb1921-2d26-4c78-9fa6-592a98cfbe34'}
# row2: {'a': 1, 'b': 'some text', '__ROW_ID__': '69e958dc-6476-4ae1-bf2f-154b99c0d782'}
# row3: {'a': 1, 'b': 'some text', '__ROW_ID__': '425d2403-fd52-4514-b6e5-4802ed1d74a7'}
###

print("-----------DataTablePatchContent-----------")

onedata_api.datatables.patch_content(data_id=dt.id,
                                     added=[{"a": 2, "b": "added"}],
                                     deleted=[row1["__ROW_ID__"]],
                                     updated=[{"a": 3, "b": "updated", "__ROW_ID__": row2["__ROW_ID__"]}])

datatable: DataTableContent = onedata_api.datatables.get_content(data_id=dt.id)

print(f"row1: {datatable.records[0]}")
print(f"row2: {datatable.records[1]}")
print(f"row3: {datatable.records[2]}")

###
# row1: {'a': 1, 'b': 'some text', '__ROW_ID__': '425d2403-fd52-4514-b6e5-4802ed1d74a7'}
# row2: {'a': 2, 'b': 'added', '__ROW_ID__': 'ef7400d9-df9f-4174-b92c-353a8bbf081c'}
# row3: {'a': 3, 'b': 'updated', '__ROW_ID__': '69e958dc-6476-4ae1-bf2f-154b99c0d782'}
###
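
The three lists for patch_content can also be derived by comparing the retrieved records with the rows the table should contain afterwards. The sketch below is one possible approach, not SDK functionality: diff_records is a hypothetical helper that treats rows without __ROW_ID__ as additions, rows with a known but changed __ROW_ID__ as updates, and current rows absent from the desired state as deletions.

```python
from typing import Dict, List

ROW_ID_KEY = "__ROW_ID__"


def diff_records(current: List[dict], desired: List[dict]) -> Dict[str, list]:
    """Split desired rows into added/updated and list the row ids to delete."""
    current_by_id = {row[ROW_ID_KEY]: row for row in current}
    desired_ids = {row[ROW_ID_KEY] for row in desired if ROW_ID_KEY in row}
    # Rows without an id are new.
    added = [row for row in desired if ROW_ID_KEY not in row]
    # Rows with a known id whose values differ are updates;
    # rows with an id that is not in current are ignored.
    updated = [row for row in desired
               if row.get(ROW_ID_KEY) in current_by_id
               and current_by_id[row[ROW_ID_KEY]] != row]
    # Ids that no longer appear in the desired state are deleted.
    deleted = [row_id for row_id in current_by_id if row_id not in desired_ids]
    return {"added": added, "updated": updated, "deleted": deleted}


current = [
    {"a": 1, "b": "some text", "__ROW_ID__": "row-1"},
    {"a": 1, "b": "some text", "__ROW_ID__": "row-2"},
    {"a": 1, "b": "some text", "__ROW_ID__": "row-3"},
]
desired = [
    {"a": 1, "b": "some text", "__ROW_ID__": "row-3"},  # unchanged
    {"a": 2, "b": "added"},                              # new row
    {"a": 3, "b": "updated", "__ROW_ID__": "row-2"},     # changed row
]
patch = diff_records(current, desired)
print(patch["deleted"])  # ['row-1']

# Possible use with the SDK (not executed here):
# onedata_api.datatables.patch_content(data_id=dt.id, **patch)
```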

createFromPandas

Description

Creates a DataTable with the given name in the given project from a pandas DataFrame. Optional request options (request_transformers, response_transformers, timeout, verify, sleep_after_response_millis, deserializer) can be specified.

Returns: UploadedDataResponse

Parameters

Property | Type | Required | Default | Description
name | str | true | None | name of the datatable
project_id | Union[str, uuid.UUID] | true | None | project the datatable is created in
data | pandas.DataFrame | true | None | pandas dataframe to create a datatable from
locked | DatasetLock | false | DatasetLock.Open | defines the lock state of the data table
storage_format | DataTableStorageType | false | DataTableStorageType.PARQUET | defines the storage format of the data table
create_fallback | bool | false | True | defines whether a fallback data table should be created (contains all malformed entries)

Usage

import pandas
from pandas import DataFrame

d = {'col1': [1, 2], 'col2': [3, 4]}
pandas_dataframe: DataFrame = pandas.DataFrame(data=d)

print(f"pandas_dataframe: {pandas_dataframe}")

###
# pandas_dataframe:
#      col1  col2
# 0     1     3
# 1     2     4
###

### Creation of datatable from pandas ###
response = onedata_api.datatables.create_from_pandas(name=name, project_id=project_id, data=pandas_dataframe, locked=locked,
                                                     storage_format=storage_format, create_fallback=create_fallback)

content: DataTableContent = onedata_api.datatables.get_content(data_id=response.data[0].id)

print(f"content: {content.records}")
###
# content: [{'col1': 1, 'col2': 3, '__ROW_ID__': '8609f7d5-bf29-4ffa-aba6-a61aa9342156'}, {'col1': 2, 'col2': 4, '__ROW_ID__': '5cb17b5c-ff22-4ad6-b842-790af14a30e4'}]
###
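
The retrieved records are row-oriented and carry a __ROW_ID__ column, while the dictionary d used to build the DataFrame above is column-oriented. As an illustration, the following sketch converts records back into that column-oriented shape. records_to_columns is a hypothetical helper, and it assumes every record has the same keys (otherwise the resulting columns would differ in length).

```python
from typing import Dict, List, Tuple


def records_to_columns(records: List[dict],
                       drop: Tuple[str, ...] = ("__ROW_ID__",)) -> Dict[str, list]:
    """Turn a list of row dictionaries into a dictionary of column lists."""
    columns: Dict[str, list] = {}
    for record in records:
        for key, value in record.items():
            if key not in drop:
                columns.setdefault(key, []).append(value)
    return columns


records = [
    {"col1": 1, "col2": 3, "__ROW_ID__": "8609f7d5-bf29-4ffa-aba6-a61aa9342156"},
    {"col1": 2, "col2": 4, "__ROW_ID__": "5cb17b5c-ff22-4ad6-b842-790af14a30e4"},
]
print(records_to_columns(records))  # {'col1': [1, 2], 'col2': [3, 4]}
```

The result has the same shape as d, so it could be handed to pandas.DataFrame(data=...) again.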
