Seer SDK Guide
Installation
The SDK is available on pip as the official SeerSDK package. Install it using:
$ pip install seer-pas-sdk
Usage
This page gives an overview of the SDK’s features. Complete documentation for each class / method can be found here.
Configuration
PAS has a simple authorization system that uses the same username and password fields as the web app. For convenience, you can define your username and password once for later reference:
USERNAME = "gnu403"
PASSWORD = "Test!234567"
You may also pass an instance parameter to the SDK object to instantiate the PAS SDK against the EU or US instance:
INSTANCE = "US"
By default, if you don’t specify an instance, the PAS SDK will be instantiated with the US instance.
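If you would rather not keep credentials in the script itself, one option is to read them from environment variables. The following is a minimal sketch, not part of the SDK: the helper `load_credentials` and the variable names `SEER_USERNAME`, `SEER_PASSWORD`, and `SEER_INSTANCE` are hypothetical names chosen for illustration. It falls back to the US instance when no instance is set, mirroring the default described above.

```python
import os


def load_credentials(env=None):
    """Return (username, password, instance) read from environment variables.

    The variable names below are hypothetical; pick whatever suits your setup.
    """
    env = os.environ if env is None else env
    try:
        username = env["SEER_USERNAME"]
        password = env["SEER_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None
    # Fall back to the US instance, matching the SDK default described above.
    instance = env.get("SEER_INSTANCE", "US")
    return username, password, instance


# An explicit mapping is passed here so the sketch runs without touching the real environment.
USERNAME, PASSWORD, INSTANCE = load_credentials(
    {"SEER_USERNAME": "gnu403", "SEER_PASSWORD": "example-password"}
)
print(USERNAME, INSTANCE)
```

Calling `load_credentials()` with no argument reads from `os.environ`, and the resulting values can be passed to the SDK constructor in the same way as the constants above.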
Instantiation
After importing the SeerSDK module, you can instantiate an object in the following way:
from seer_pas_sdk import SeerSDK
# Instantiate an SDK object with your credentials:
sdk = SeerSDK(USERNAME, PASSWORD)
# You could alternatively pass your credentials and/or the instance directly into the instantiated object.
sdk = SeerSDK(USERNAME, PASSWORD, INSTANCE)
User 'gnu403' logged in.
You can then use the SDK’s methods to create, query, or retrieve projects, plates, samples, and analyses. Full documentation can be found here. Additional information and examples can also be found below.
Multi Tenant Management
Introduced in version 0.2.0
By default, you are active in your home tenant upon login. The home tenant is the organization account that issued the original invitation for the user to join PAS. The optional ‘tenant’ parameter of the SeerSDK constructor lets you navigate directly to a desired tenant. A notification message is displayed upon login.
The following tools are available to navigate between tenants:
from seer_pas_sdk import SeerSDK
sdk = SeerSDK(USERNAME, PASSWORD, INSTANCE, tenant='My Active Tenant')
# Retrieve value of current active tenant
print(sdk.get_active_tenant())
# List available tenants
print(sdk.list_tenants())
# Switch active tenant
sdk.switch_tenant('My Next Tenant')
User 'gnu403' logged in.
You are now active in My Active Tenant
My Active Tenant
{'My Active Tenant': 'abc1234abc1234', 'My Next Tenant': 'abc1234abc1232'}
You are now active in My Next Tenant
PlateMap Object
The PAS Python SDK lets users create plate map files from within the SDK using the PlateMap class. The plate map interface takes the following parameters, all of which need to be passed in as lists:
- MS file name as ms_file_name
- Sample name as sample_name
- Sample ID as sample_id
- Well location as well_location
- Nanoparticle as nanoparticle
- Nanoparticle ID as nanoparticle_id
- Control as control
- Control ID as control_id
- Instrument name as instrument_name
- Date sample preparation as date_sample_preparation
- Sample volume as sample_volume
- Peptide concentration as peptide_concentration
- Peptide mass sample as peptide_mass_sample
- Dilution factor as dilution_factor
- Kit ID as kit_id
- Plate ID as plate_id
- Plate Name as plate_name
The length of the ms_file_name list determines the number of rows. If the list passed for any other field is shorter than the ms_file_name list, the remaining entries of that field default to None. If the list passed for a field is longer than the ms_file_name list, the class raises a ValueError.
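The length rule above can be illustrated with a few lines of plain Python. This is a sketch of the described behavior, not the PlateMap implementation; `pad_fields` is a hypothetical helper written only for this example.

```python
def pad_fields(ms_file_name, **fields):
    """Pad each field list with None up to len(ms_file_name).

    Raises ValueError if any field list is longer than ms_file_name.
    """
    n = len(ms_file_name)
    padded = {"ms_file_name": list(ms_file_name)}
    for name, values in fields.items():
        if len(values) > n:
            raise ValueError(
                f"{name} has {len(values)} entries but ms_file_name has only {n}"
            )
        # Shorter lists are filled with None for the remaining rows.
        padded[name] = list(values) + [None] * (n - len(values))
    return padded


columns = pad_fields(
    ["TestFile1.raw", "TestFile2.raw"],
    sample_name=["A111", "A112"],
    nanoparticle=["NONE"],
    instrument_name=[],
)
print(columns["nanoparticle"])     # ['NONE', None]
print(columns["instrument_name"])  # [None, None]
```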
This is how a plate map file could be made:
from seer_pas_sdk import PlateMap
sample_plate_map_file = PlateMap(
ms_file_name =["TestFile1.raw", "TestFile2.raw"],
sample_name = ["A111", "A112"],
sample_id = ["A111", "A112"],
well_location = ["C11", "D11"],
nanoparticle = ["NONE"],
nanoparticle_id = ["NONE"],
control = ["MPE Control"],
control_id = ["MPE Control"],
instrument_name = [],
date_sample_preparation = [],
sample_volume = [20],
peptide_concentration = [59.514],
peptide_mass_sample = [8.57],
dilution_factor = [1],
kit_id = [],
plate_id = ["A11", "A11"],
plate_name = ["A11", "A11"]
)
# Or alternatively, this would be the same as the following (we've left some fields empty which would default them to `None`):
another_plate_map_file = PlateMap(
ms_file_name =["TestFile1.raw", "TestFile2.raw"],
sample_name = ["A111", "A112"],
sample_id = ["A111", "A112"],
well_location = ["C11", "D11"],
nanoparticle = ["NONE"],
nanoparticle_id = ["NONE"],
control = ["MPE Control"],
control_id = ["MPE Control"],
sample_volume = [20],
peptide_concentration = [59.514],
peptide_mass_sample = [8.57],
dilution_factor = [1],
plate_id = ["A11", "A11"],
plate_name = ["A11", "A11"]
)
import pickle
print(pickle.dumps(another_plate_map_file) == pickle.dumps(sample_plate_map_file)) # checks for equality
print(sample_plate_map_file)
True
{'MS file name': {0: 'TestFile1.raw', 1: 'TestFile2.raw'}, 'Sample name': {0: 'A111', 1: 'A112'}, 'Sample ID': {0: 'A111', 1: 'A112'}, 'Well location': {0: 'C11', 1: 'D11'}, 'Nanoparticle': {0: 'NONE', 1: None}, 'Nanoparticle ID': {0: 'NONE', 1: None}, 'Control': {0: 'MPE Control', 1: None}, 'Control ID': {0: 'MPE Control', 1: None}, 'Instrument name': {0: None, 1: None}, 'Date sample preparation': {0: None, 1: None}, 'Sample volume': {0: 20, 1: None}, 'Peptide concentration': {0: 59.514, 1: None}, 'Peptide mass sample': {0: 8.57, 1: None}, 'Recon volume': {0: None, 1: None}, 'Dilution factor': {0: 1, 1: None}, 'Kit ID': {0: None, 1: None}, 'Plate ID': {0: 'A11', 1: 'A11'}, 'Plate Name': {0: 'A11', 1: 'A11'}, 'Assay': {0: None, 1: None}}
You could also convert the PlateMap object into a DataFrame using the to_df function implemented within the class. Example:
pm_file_df = sample_plate_map_file.to_df()
print(pm_file_df)
MS file name Sample name Sample ID Well location Nanoparticle \
0 TestFile1.raw A111 A111 C11 NONE
1 TestFile2.raw A112 A112 D11 None
Nanoparticle ID Control Control ID Instrument name \
0 NONE MPE Control MPE Control None
1 None None None None
Date sample preparation Sample volume Peptide concentration \
0 None 20.0 59.514
1 None NaN NaN
Peptide mass sample Recon volume Dilution factor Kit ID Plate ID \
0 8.57 None 1.0 None A11
1 NaN None NaN None A11
Plate Name Assay
0 A11 None
1 A11 None
You could also convert it to a CSV file (which can also be exported) using the to_csv function implemented within the class:
pm_file_csv = sample_plate_map_file.to_csv()
print(pm_file_csv)
MS file name,Sample name,Sample ID,Well location,Nanoparticle,Nanoparticle ID,Control,Control ID,Instrument name,Date sample preparation,Sample volume,Peptide concentration,Peptide mass sample,Recon volume,Dilution factor,Kit ID,Plate ID,Plate Name,Assay
TestFile1.raw,A111,A111,C11,NONE,NONE,MPE Control,MPE Control,,,20.0,59.514,8.57,,1.0,,A11,A11,
TestFile2.raw,A112,A112,D11,,,,,,,,,,,,,A11,A11,
[Optional] Logging
For convenience, you can define a log function that prints data from the SDK in a more readable form.
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)
pd.set_option('display.colheader_justify', 'center')
pd.set_option('display.precision', 3)
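def log(fn):
    # Print only the first rows of a DataFrame; print anything else as-is.
    if isinstance(fn, pd.DataFrame):
        print(fn.head()) # shorten the output
    else:
        print(fn)
def log_df(fn):
    # Shorten every DataFrame stored in a dict, then print the dict.
    for entry in fn:
        fn[entry] = fn[entry].head()
    print(fn)
Examples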
Get Spaces
Fetches a list of spaces for the authenticated user.
Params
None.
Returns
spaces: (list[dict]) List of space objects for the authenticated user.
Example
spaces = sdk.get_spaces()
log(spaces)
[{'id': None, 'usergroup_name': 'General', 'description': '', 'notes': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-09-13T21:59:33.569Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-09-13T21:59:33.569Z', 'userIds': None}, {'id': '150b3460-a7d2-11ed-9de7-d59d51e545d5', 'usergroup_name': 'My Space', 'description': 'My Space', 'notes': 'My Space', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-02-08T17:00:25.657Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-02-09T01:33:14.397Z', 'userIds': 'ad92b15e-79fe-444d-ac9b-15ef1d15e148,31c1af05-a1c0-4d1b-8073-08fa4bb4207a,e7f7026c-a23b-4732-bf34-8511f5ff83c9,69b7d189-fcc1-4cc8-ae6c-f4fa49012ea0,c972c073-20e4-40ce-825b-d5adb718e419,d4004b39-aeb5-4284-9117-0ca747abe7a6,906b2967-e4c1-4e66-a546-c3412da8a6e6,3d219df6-255f-4aad-af53-408cb24ccab6,1db51bfb-2df1-42c2-baee-693794a5ae66'}]
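The list returned by get_spaces can be searched with ordinary Python. The sketch below looks up a space ID by its usergroup_name and splits the comma-separated userIds string into a list. Both helpers (`space_id_by_name`, `user_ids`) are hypothetical and rely only on the field names visible in the output above; the sample data is a trimmed copy of that output.

```python
def space_id_by_name(spaces, name):
    """Return the 'id' of the first space whose 'usergroup_name' equals name."""
    for space in spaces:
        if space["usergroup_name"] == name:
            return space["id"]
    raise KeyError(name)


def user_ids(space):
    """Split the comma-separated 'userIds' string into a list (empty if None)."""
    raw = space.get("userIds")
    return raw.split(",") if raw else []


# Trimmed copy of the output above, keeping only the fields used here.
spaces = [
    {"id": None, "usergroup_name": "General", "userIds": None},
    {
        "id": "150b3460-a7d2-11ed-9de7-d59d51e545d5",
        "usergroup_name": "My Space",
        "userIds": "ad92b15e-79fe-444d-ac9b-15ef1d15e148,31c1af05-a1c0-4d1b-8073-08fa4bb4207a",
    },
]
print(space_id_by_name(spaces, "My Space"))
print(len(user_ids(spaces[1])))  # 2
```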
Get Plate
Description
Fetches a single plate for the authenticated user.
You must provide either a plate_id or a plate_name, but not both.
If a matching plate exists, it is returned.
Params
- plate_id (str, optional): Unique ID of the plate to be fetched.
- plate_name (str, optional): Name of the plate to be fetched.
Returns
plate (dict): A plate object for the authenticated user.
Examples
Fetch by plate_id
#| eval: false
plate_id = "00d1ac50-1149-11ee-85b5-bd8ec5ef4e32"
plate = sdk.get_plate(plate_id=plate_id)
log(plate)
{'id': '00d1ac50-1149-11ee-85b5-bd8ec5ef4e32', 'plate_name': 'finalPlateNameTest', 'plate_id': 'finalPlateIdTest', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-22T22:06:13.963Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-22T22:06:18.470Z', 'space': None}
Fetch by plate_name
#| eval: false
plate_name = "finalPlateNameTest"
plate = sdk.get_plate(plate_name=plate_name)
log(plate)
{'id': '00d1ac50-1149-11ee-85b5-bd8ec5ef4e32', 'plate_name': 'finalPlateNameTest', 'plate_id': 'finalPlateIdTest', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-22T22:06:13.963Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-22T22:06:18.470Z', 'space': None}
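The "either an ID or a name, but not both" rule can be checked in your own code before a request is made. The following is a sketch of such a check; `exactly_one` is a hypothetical helper, and it says nothing about how the SDK validates its arguments internally.

```python
def exactly_one(**candidates):
    """Return the single (name, value) pair whose value is not None.

    Raises ValueError when no argument or more than one argument is provided.
    """
    provided = {key: value for key, value in candidates.items() if value is not None}
    if len(provided) != 1:
        names = " or ".join(candidates)
        raise ValueError(f"Provide exactly one of {names}; got {len(provided)}")
    return next(iter(provided.items()))


key, value = exactly_one(plate_id=None, plate_name="finalPlateNameTest")
print(key, value)  # plate_name finalPlateNameTest
```

The returned pair can then be forwarded as a keyword argument, for example `sdk.get_plate(**{key: value})`.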
Find Plates
Fetches a list of plates for the authenticated user. If no plate_id is provided, returns all plates for the authenticated user. If plate_id is provided, returns the plate with the given plate_id, provided it exists.
Params
- plate_id: (str, optional) Unique ID of the plate to be fetched, defaulted to None.
- plate_name: (str, optional) Name of the plate to be fetched, defaulted to None.
- project_id: (str, optional) Unique ID of the project to which the plate belongs, defaulted to None.
- project_name: (str, optional) Name of the project to which the plate belongs, defaulted to None.
- as_df: (bool, optional) Whether the result should be converted to a DataFrame.
Returns
plates: (list[dict] or DataFrame) list or DataFrame of plate objects for the authenticated user.
Example
plates = sdk.find_plates()
log(plates)
[{'id': '303bfc20-87d7-11ee-b03f-27a54e8c1798', 'plate_name': 'some_plate_name', 'plate_id': 'unique223id', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-11-20T19:01:19.249Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-20T19:01:26.881Z', 'space': None}, {'id': '0ddf9920-87d7-11ee-b03f-27a54e8c1798', 'plate_name': 'some_plate_name', 'plate_id': 'unique223id', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-11-20T19:00:21.602Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-20T19:00:27.273Z', 'space': None}]
You can also pass in a specific plate_id to fetch a single plate.
plate_id = "00d1ac50-1149-11ee-85b5-bd8ec5ef4e32"
sample_plate = sdk.find_plates(plate_id)
log(sample_plate)
[{'id': '00d1ac50-1149-11ee-85b5-bd8ec5ef4e32', 'plate_name': 'finalPlateNameTest', 'plate_id': 'finalPlateIdTest', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-22T22:06:13.963Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-22T22:06:18.470Z', 'space': None}]
With the as_df flag set to True:
sample_plate = sdk.find_plates(plate_id, as_df=True)
log(sample_plate)
id plate_name plate_id description notes created_by created_timestamp last_modified_by last_modified_timestamp space
0 00d1ac50-1149-11ee-85b5-bd8ec5ef4e32 finalPlateNameTest finalPlateIdTest None None 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-22T22:06:13.963Z 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-22T22:06:18.470Z None
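The find_plates output above contains two plates that share the same plate_name. When that happens, one way to pick among them is by creation time. The sketch below selects the most recently created match. The helper names are hypothetical, and the timestamp format is inferred from the created_timestamp values shown above.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse timestamps shaped like '2023-11-20T19:01:19.249Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def latest_by_name(plates, plate_name):
    """Return the most recently created plate with the given plate_name, or None."""
    matches = [plate for plate in plates if plate["plate_name"] == plate_name]
    return max(
        matches,
        key=lambda plate: parse_timestamp(plate["created_timestamp"]),
        default=None,
    )


# Trimmed copy of the find_plates output above.
plates = [
    {"id": "303bfc20-87d7-11ee-b03f-27a54e8c1798", "plate_name": "some_plate_name",
     "created_timestamp": "2023-11-20T19:01:19.249Z"},
    {"id": "0ddf9920-87d7-11ee-b03f-27a54e8c1798", "plate_name": "some_plate_name",
     "created_timestamp": "2023-11-20T19:00:21.602Z"},
]
print(latest_by_name(plates, "some_plate_name")["id"])  # 303bfc20-87d7-11ee-b03f-27a54e8c1798
```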
Find MS Runs
Fetches information pertaining to MS runs for passed in sample_ids (provided they are valid and contain relevant files) for an authenticated user.
The function returns a DataFrame if the as_df flag is passed in as True; otherwise a list of dict objects is returned.
Params
- sample_ids: (list) List of unique sample IDs.
- as_df: (bool, optional) Whether the result should be converted to a DataFrame, defaulted to False.
Returns
res: (list[dict] or DataFrame) List or DataFrame of MS run objects for the authenticated user.
Example
sample_ids = ["812139c0-15e0-11ee-bdf1-bbaa73585acf", "803e05b0-15e0-11ee-bdf1-bbaa73585acf"]
example = sdk.find_msruns(sample_ids)
log(example)
[{'id': '81c6a180-15e0-11ee-bdf1-bbaa73585acf', 'sample_id': '812139c0-15e0-11ee-bdf1-bbaa73585acf', 'raw_file_path': '7ec8cad0-15e0-11ee-bdf1-bbaa73585acf/20230628182044224/TestFile2.raw', 'well_location': 'D11', 'nanoparticle': '', 'instrument_name': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-28T18:20:49.006Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-28T18:20:49.006Z', 'user_group': None, 'sample_id_tracking': 'A112', 'nanoparticle_id': '', 'control': '', 'control_id': '', 'date_sample_prep': '', 'sample_volume': '', 'peptide_concentration': '', 'peptide_mass_sample': '', 'dilution_factor': '', 'kit_id': None, 'injection_timestamp': None, 'ms_instrument_sn': None, 'recon_volume': None, 'gradient': None}, {'id': '816a9ed0-15e0-11ee-bdf1-bbaa73585acf', 'sample_id': '803e05b0-15e0-11ee-bdf1-bbaa73585acf', 'raw_file_path': '7ec8cad0-15e0-11ee-bdf1-bbaa73585acf/20230628182044224/TestFile1.raw', 'well_location': 'C11', 'nanoparticle': 'NONE', 'instrument_name': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-28T18:20:48.408Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-28T18:20:48.408Z', 'user_group': None, 'sample_id_tracking': 'A111', 'nanoparticle_id': 'NONE', 'control': 'MPE Control', 'control_id': 'MPE Control', 'date_sample_prep': '', 'sample_volume': '20.0', 'peptide_concentration': '59.514', 'peptide_mass_sample': '8.57', 'dilution_factor': '1.0', 'kit_id': None, 'injection_timestamp': None, 'ms_instrument_sn': None, 'recon_volume': None, 'gradient': None}]
There is also an option to return everything as a DataFrame instead:
example = sdk.find_msruns(sample_ids, as_df=True)
log(example)
id sample_id raw_file_path well_location nanoparticle instrument_name created_by created_timestamp last_modified_by last_modified_timestamp space sample_id_tracking nanoparticle_id control control_id date_sample_prep sample_volume peptide_concentration peptide_mass_sample dilution_factor kit_id injection_timestamp ms_instrument_sn recon_volume gradient
0 81c6a180-15e0-11ee-bdf1-bbaa73585acf 812139c0-15e0-11ee-bdf1-bbaa73585acf 7ec8cad0-15e0-11ee-bdf1-bbaa73585acf/202306281... D11 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-28T18:20:49.006Z 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-28T18:20:49.006Z None A112 None None None None None
1 816a9ed0-15e0-11ee-bdf1-bbaa73585acf 803e05b0-15e0-11ee-bdf1-bbaa73585acf 7ec8cad0-15e0-11ee-bdf1-bbaa73585acf/202306281... C11 NONE 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-28T18:20:48.408Z 04936dea-d255-4130-8e82-2f28938a8f9a 2023-06-28T18:20:48.408Z None A111 NONE MPE Control MPE Control 20.0 59.514 8.57 1.0 None None None None None
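In the list output above, numeric fields such as sample_volume arrive as strings ('20.0') or as empty strings. If you need numbers for downstream calculations, a small conversion step helps. This is a sketch only: the helper names and the choice of fields in `NUMERIC_FIELDS` are assumptions based on the output shown above.

```python
# Fields that hold numeric text in the output above (an assumption based on that output).
NUMERIC_FIELDS = (
    "sample_volume",
    "peptide_concentration",
    "peptide_mass_sample",
    "dilution_factor",
)


def to_float_or_none(value):
    """Convert '20.0' to 20.0, and '' or None to None."""
    if value is None or value == "":
        return None
    return float(value)


def with_numeric_fields(run):
    """Return a copy of one MS run dict with the numeric fields converted."""
    converted = dict(run)
    for field in NUMERIC_FIELDS:
        if field in converted:
            converted[field] = to_float_or_none(converted[field])
    return converted


run = {"well_location": "C11", "sample_volume": "20.0", "peptide_concentration": "59.514",
       "peptide_mass_sample": "8.57", "dilution_factor": "1.0"}
print(with_numeric_fields(run)["sample_volume"])  # 20.0
```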
Get Project
Description
Fetches a single project for the authenticated user.
You must provide either a project_id or a project_name, but not both.
If a matching project exists, it is returned.
Params
- project_id (str, optional): Unique ID of the project to be fetched.
- project_name (str, optional): Name of the project to be fetched.
Note: Exactly one of project_id or project_name must be provided.
Returns
project: (dict) A project object for the authenticated user.
Examples
Fetch by project_id
#| eval: false
project_id = "222e0890-0f95-11ee-9c0f-3bf27c252a07"
project = sdk.get_project(project_id=project_id)
log(project)
{'id': '222e0890-0f95-11ee-9c0f-3bf27c252a07', 'project_name': 'sdk test projjjjjj', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-20T18:06:09.397Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-20T18:06:09.397Z', 'space': None}
Fetch by project_name
#| eval: false
project_name = "sdk test projjjjjj"
project = sdk.get_project(project_name=project_name)
log(project)
{'id': '222e0890-0f95-11ee-9c0f-3bf27c252a07', 'project_name': 'sdk test projjjjjj', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-20T18:06:09.397Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-20T18:06:09.397Z', 'space': None}
Find Projects
Fetches a list of projects. If no project_id is provided, returns all projects. If project_id is provided, returns the project with the given project_id, provided it exists.
The function returns a DataFrame if the as_df flag is passed in as True; otherwise a list of dict objects is returned.
Params
- project_id: (str, optional) Unique ID of the project to be fetched, defaulted to None.
- as_df: (bool, optional) Whether the result should be converted to a DataFrame, defaulted to False.
Returns
projects: (list[dict] or DataFrame) list[dict] or DataFrame of project objects.
Example
projects = sdk.find_projects()
log(projects)
[{'id': '542e36d0-87d6-11ee-b03f-27a54e8c1798', 'project_name': 'test_project', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-11-20T18:55:10.059Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-20T18:55:10.059Z', 'space': None}, {'id': '221f0e80-66f7-11ee-abb2-359a84c72f54', 'project_name': 'test_project', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-10-09T22:56:51.066Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-10-09T22:56:51.066Z', 'space': None}]
You can also pass in a specific project_id to fetch a single project.
project_id = "222e0890-0f95-11ee-9c0f-3bf27c252a07"
sample_project = sdk.find_projects(project_id)
log(sample_project)
[{'id': '222e0890-0f95-11ee-9c0f-3bf27c252a07', 'project_name': 'sdk test projjjjjj', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-06-20T18:06:09.397Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-06-20T18:06:09.397Z', 'space': None}]
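Project names are not necessarily unique: the find_projects output above lists two projects named test_project. A quick way to spot such duplicates is to group project IDs by name. The sketch below does that with a hypothetical helper, `ids_by_name`, on a trimmed copy of that output.

```python
from collections import defaultdict


def ids_by_name(projects):
    """Map each project_name to the list of project IDs that carry it."""
    grouped = defaultdict(list)
    for project in projects:
        grouped[project["project_name"]].append(project["id"])
    return dict(grouped)


# Trimmed copy of the find_projects output above.
projects = [
    {"id": "542e36d0-87d6-11ee-b03f-27a54e8c1798", "project_name": "test_project"},
    {"id": "221f0e80-66f7-11ee-abb2-359a84c72f54", "project_name": "test_project"},
]
duplicates = {name: ids for name, ids in ids_by_name(projects).items() if len(ids) > 1}
print(duplicates)
```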
Get Analysis Protocol
Fetches a single analysis protocol object. You must provide either analysis_protocol_id or analysis_protocol_name (but not both).
Params
- analysis_protocol_id: (str, optional) Unique ID of the analysis protocol to fetch.
- analysis_protocol_name: (str, optional) Exact name of the analysis protocol to fetch.
Note: Exactly one of analysis_protocol_id or analysis_protocol_name must be provided.
Returns
protocol: (dict) A single analysis protocol object.
Examples
Fetch by ID:
protocol = sdk.get_analysis_protocol(
analysis_protocol_id="dc61a360-6b77-11ed-8ac3-37d35135f08e"
)
log(protocol)
{'id': 'dc61a360-6b77-11ed-8ac3-37d35135f08e', 'analysis_protocol_name': 'testMSFraggerUpload26', 'analysis_type': 'DDA', 'version_number': '1', 'description': '', 'notes': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2022-11-23T21:43:26.142Z', 'parameter_file_path': 'msfragger-parameter/testMSFraggerUpload26.json', 'space': None, 'species': 'Human', 'alg_version': '3.4', 'analysis_engine': 'msfragger', 'scope': 'user'}
Fetch by name:
protocol = sdk.get_analysis_protocol(
analysis_protocol_name="testMSFraggerUpload26"
)
log(protocol)
{'id': 'dc61a360-6b77-11ed-8ac3-37d35135f08e', 'analysis_protocol_name': 'testMSFraggerUpload26', 'analysis_type': 'DDA', 'version_number': '1', 'description': '', 'notes': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2022-11-23T21:43:26.142Z', 'parameter_file_path': 'msfragger-parameter/testMSFraggerUpload26.json', 'space': None, 'species': 'Human', 'alg_version': '3.4', 'analysis_engine': 'msfragger', 'scope': 'user'}
Find Analysis Protocols
Fetches a list of analysis protocols for the authenticated user. If no analysis_protocol_id is provided, returns all analysis protocols for the authenticated user. If analysis_protocol_name (and no analysis_protocol_id) is provided, returns the analysis protocol with the given name, provided it exists.
Params
- analysis_protocol_id: (str, optional) Unique ID of the analysis protocol to be fetched, defaulted to None.
- analysis_protocol_name: (str, optional) Name of the analysis protocol to be fetched, defaulted to None.
Returns
protocols: (list[dict]) List of analysis protocol objects for the authenticated user.
Example
analysis_protocols = sdk.find_analysis_protocols()
log(analysis_protocols)
[{'id': 'f17bf010-3d55-11ee-8e2e-d304769e96eb', 'analysis_protocol_name': 'first analysis protocol for DIA - DIANN 1.8.1 - Prefect - MBR', 'analysis_type': 'DIA', 'version_number': '1', 'description': None, 'notes': None, 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2023-08-17T23:29:42.870Z', 'parameter_file_path': 'diann-parameter/first_analysis_protocol_for_DIA_-_DIANN_1.8.1_-_Prefect_-_MBR.json', 'space': None, 'species': 'Human', 'alg_version': '1.8.1', 'analysis_engine': 'diann', 'scope': 'user'}, {'id': '10', 'analysis_protocol_name': 'first analysis protocol for DIA - DIANN 1.8.1 - Prefect', 'analysis_type': 'DIA', 'version_number': '1', 'description': None, 'notes': None, 'created_by': 'c7b78248-d7f3-4379-8a33-64853dc427a9', 'created_timestamp': '2023-08-17T18:50:02.504Z', 'parameter_file_path': 'diann-parameter/diann-20230915.json', 'space': None, 'species': 'Human', 'alg_version': '1.8.1', 'analysis_engine': 'diann', 'scope': 'system'}]
You can also pass in a specific name to fetch a single analysis protocol.
protocol_name = "testMSFraggerUpload26"
sample_analysis_protocol = sdk.find_analysis_protocols(analysis_protocol_name=protocol_name)
log(sample_analysis_protocol)
[{'id': 'dc61a360-6b77-11ed-8ac3-37d35135f08e', 'analysis_protocol_name': 'testMSFraggerUpload26', 'analysis_type': 'DDA', 'version_number': '1', 'description': '', 'notes': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2022-11-23T21:43:26.142Z', 'parameter_file_path': 'msfragger-parameter/testMSFraggerUpload26.json', 'space': None, 'species': 'Human', 'alg_version': '3.4', 'analysis_engine': 'msfragger', 'scope': 'user'}]
The same can be done with analysis_protocol_id.
protocol_id = "dc61a360-6b77-11ed-8ac3-37d35135f08e"
sample_analysis_protocol = sdk.find_analysis_protocols(analysis_protocol_id=protocol_id)
log(sample_analysis_protocol)
[{'id': 'dc61a360-6b77-11ed-8ac3-37d35135f08e', 'analysis_protocol_name': 'testMSFraggerUpload26', 'analysis_type': 'DDA', 'version_number': '1', 'description': '', 'notes': '', 'created_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'created_timestamp': '2022-11-23T21:43:26.142Z', 'parameter_file_path': 'msfragger-parameter/testMSFraggerUpload26.json', 'space': None, 'species': 'Human', 'alg_version': '3.4', 'analysis_engine': 'msfragger', 'scope': 'user'}]
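Once a list of protocols has been fetched, it can be narrowed down locally by any field shown in the output above, such as scope or analysis_engine. The following is a sketch of client-side filtering with a hypothetical helper, `filter_protocols`; it does not correspond to any server-side filter in the SDK.

```python
def filter_protocols(protocols, **criteria):
    """Keep the protocols whose fields equal every given criterion."""
    return [
        protocol
        for protocol in protocols
        if all(protocol.get(field) == wanted for field, wanted in criteria.items())
    ]


# Trimmed copy of the find_analysis_protocols output above.
protocols = [
    {"id": "f17bf010-3d55-11ee-8e2e-d304769e96eb", "analysis_type": "DIA",
     "analysis_engine": "diann", "scope": "user"},
    {"id": "10", "analysis_type": "DIA", "analysis_engine": "diann", "scope": "system"},
    {"id": "dc61a360-6b77-11ed-8ac3-37d35135f08e", "analysis_type": "DDA",
     "analysis_engine": "msfragger", "scope": "user"},
]
user_dia = filter_protocols(protocols, scope="user", analysis_type="DIA")
print([protocol["id"] for protocol in user_dia])  # ['f17bf010-3d55-11ee-8e2e-d304769e96eb']
```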
Find Samples
Fetches a list of samples for the authenticated user. Samples can be fetched with respect to a plate_id, project_id, analysis_id, or analysis_name.
Params
- plate_id: (str, optional) Unique ID of the plate to fetch samples from, defaulted to None.
- project_id: (str, optional) Unique ID of the project to fetch samples from, defaulted to None.
- project_name: (str, optional) Name of the project to fetch samples from, defaulted to None.
- analysis_id: (str, optional) Unique ID of the analysis to fetch samples from, defaulted to None.
- analysis_name: (str, optional) Name of the analysis to fetch samples from, defaulted to None.
- as_df: (bool, optional) Whether the result should be converted to a DataFrame, defaulted to False.
Returns
samples: (list[dict] or DataFrame) List or DataFrame of sample objects for the authenticated user.
Example
samples = sdk.find_samples(plate_id="303bfc20-87d7-11ee-b03f-27a54e8c1798")
log(samples)
[{'id': 'b632546f-d91f-11ef-85ce-51d48ea13827', 'space': None, 'plate_id': 'b50768f0-d91f-11ef-8371-dd35d1cfc768', 'sample_name': 'process control', 'sample_id': 'CPRO', 'sample_type': 'Plasma', 'species': 'Human', 'description': 'Digestion Control S-538-7239/S-431-7226 Lyo PC10 Controls Rep1', 'notes': None, 'created_by': '7485f2ae-9a12-4cc3-8605-98d10e624937', 'created_timestamp': '2025-01-23T00:19:29.584Z', 'last_modified_by': 'c71f7f48-e7f8-4a5f-91f8-8bcdaff4aa82', 'last_modified_timestamp': '2025-06-25T22:54:09.374Z', 'sample_receipt_date': None, 'sample_collection_date': None, 'condition': 'B', 'plate_name': 'NoEdit_Biscayne_Plate_MS1048', 'custom_test': None, 'biological_replicate': None, 'technical_replicate': None, 'custom_new_field': None, 'custom_long_field_1234567890_123456890_1234567890_1234567890_63': None, 'custom_filed': None, 'custom_main_field': None, 'custom_letters_digits_anerscore': None, 'custom_test123': None, 'custom_sample_attribute_for_grouping': None, 'custom_custom_nita_01': None, 'custom_nitafield2': None, 'custom_by_demo_user': None, 'custom_seer_main_staging': None, 'custom_fix_713': None, 'custom_newcustomfield12345': None, 'custom_mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm': None, 'custom_reskin': None, 'custom_newaddedcolumn': None, 'custom_myfirsttestcustomcolumn0817': None, 'custom_sdk_test': None, 'custom_rsun_test_admin_custom_field': None, 'custom_age': None, 'custom_gender': None, 'custom_country': None, 'custom_res2': None, 'custom_region': None, 'custom_nation': None, 'custom_homer_field': None, 'custom_condition_1': None, 'well_location': 'B11', 'control': 'CPRO'},
{'id': 'b6325471-d91f-11ef-85ce-51d48ea13827', 'space': None, 'plate_id': 'b50768f0-d91f-11ef-8371-dd35d1cfc768', 'sample_name': 'cleanup control', 'sample_id': 'CCLN', 'sample_type': 'Peptide', 'species': 'Human', 'description': 'NONE Lyo PC10 Peptides Controls Rep1', 'notes': None, 'created_by': '7485f2ae-9a12-4cc3-8605-98d10e624937', 'created_timestamp': '2025-01-23T00:19:29.584Z', 'last_modified_by': '7485f2ae-9a12-4cc3-8605-98d10e624937', 'last_modified_timestamp': '2025-01-23T00:19:29.852Z', 'sample_receipt_date': None, 'sample_collection_date': None, 'condition': 'B', 'plate_name': 'NoEdit_Biscayne_Plate_MS1048', 'custom_test': None, 'biological_replicate': None, 'technical_replicate': None, 'custom_new_field': None, 'custom_long_field_1234567890_123456890_1234567890_1234567890_63': None, 'custom_filed': None, 'custom_main_field': None, 'custom_letters_digits_anerscore': None, 'custom_test123': None, 'custom_sample_attribute_for_grouping': None, 'custom_custom_nita_01': None, 'custom_nitafield2': None, 'custom_by_demo_user': None, 'custom_seer_main_staging': None, 'custom_fix_713': None, 'custom_newcustomfield12345': None, 'custom_mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm': None, 'custom_reskin': None, 'custom_newaddedcolumn': None, 'custom_myfirsttestcustomcolumn0817': None, 'custom_sdk_test': None, 'custom_rsun_test_admin_custom_field': None, 'custom_age': None, 'custom_gender': None, 'custom_country': None, 'custom_res2': None, 'custom_region': None, 'custom_nation': None, 'custom_homer_field': None, 'custom_condition_1': None, 'well_location': 'D11', 'control': 'cleanup control'}]
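Sample objects can carry many custom fields that are None, as the output above shows. For quick inspection it can help to drop the empty entries before printing. This is a sketch using a hypothetical helper, `drop_empty`, on a trimmed copy of one sample from that output.

```python
def drop_empty(sample):
    """Return a copy of a sample dict without the keys whose value is None."""
    return {key: value for key, value in sample.items() if value is not None}


# Trimmed copy of one sample from the output above.
sample = {
    "id": "b6325471-d91f-11ef-85ce-51d48ea13827",
    "space": None,
    "sample_name": "cleanup control",
    "sample_id": "CCLN",
    "notes": None,
    "custom_age": None,
    "well_location": "D11",
}
print(drop_empty(sample))
```

The original dict is left unchanged, so the full record remains available if a field is needed later.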
Get Analysis
Fetches a single analysis object. You must provide either an analysis_id or an analysis_name (but not both).
Params
- analysis_id: (str, optional) Unique ID of the analysis to fetch.
- analysis_name: (str, optional) Exact name of the analysis to fetch.
Note: Exactly one of analysis_id or analysis_name must be provided.
Returns
analysis: (dict) A single analysis object.
Examples
Fetch by analysis_id:
analysis = sdk.get_analysis(analysis_id="00425a60-a850-11ea-b3d7-0171d11d0807")
log(analysis)
Fetch by analysis_name:
#| eval: false
analysis = sdk.get_analysis(analysis_name="MYProject - 4 SAMPLE - NEW")
log(analysis)
{'id': '00425a60-a850-11ea-b3d7-0171d11d0807', 'analysis_name': 'MYPROJECT - 4 SAMPLE - NEW', 'description': 'DO NOT DELETE - FOR VISUALS', 'notes': '', 'analysis_protocol_id': 'b07d70f0-5471-11ea-90e8-6743ed53f19f', 'analyzed_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'start_time': '2020-06-06T23:46:54.360Z', 'end_time': '2020-06-07T03:23:30.614Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2020-06-06T23:46:54.360Z', 'status': 'SUCCEEDED', 'result_folder': '2d89db50-805b-11ea-a3cb-5341c25962a0/00425a60-a850-11ea-b3d7-0171d11d0807/', 'job_id': '1c7dcdb9-21fe-47a3-9f9c-5326cc9662b2', 'space': None, 'project_id': '2d89db50-805b-11ea-a3cb-5341c25962a0', 'number_msdatafile': '20', 'protein_group_count': 3641, 'single_protein_group_count': 3184, 'possible_protein_set_size': 3878, 'peptides_count': 32274, 'contains_control': False, 'job_log_stream_name': None, 'contains_sample': True, 'is_folder': False, 'folder_id': None, 'number_sample': None, 'total_file_size_mb': None, 'msdatafile_extensions': None}
Find Analyses
Returns a list of analysis objects for the authenticated user. If no analysis_id is provided, returns all analyses for the authenticated user.
Params
- analysis_id: (str, optional) Unique ID of the analysis to be fetched, defaulted to None.
Returns
analyses: (list[dict]) List of analysis objects for the authenticated user.
Example
analyses = sdk.find_analyses()
log(analyses)
[{'id': 'c80a0260-856e-11ee-b1b7-ff20dde1461b', 'analysis_name': '11172023-dia-test-0419 reanalysis', 'description': '', 'notes': '', 'analysis_protocol_id': 'cb053660-a48a-11ea-bcfb-f7528a88c755', 'analyzed_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'start_time': '2023-11-17T17:28:54.451Z', 'end_time': '2023-11-17T17:49:23.492Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-17T17:28:54.451Z', 'status': 'SUCCEEDED', 'result_folder': 'e46af6a0-7a11-11eb-882e-1719d79479fd/c80a0260-856e-11ee-b1b7-ff20dde1461b/', 'job_id': 'eac28b3d-f6ba-4644-bce6-6007756d9af7', 'space': None, 'project_id': 'e46af6a0-7a11-11eb-882e-1719d79479fd', 'number_msdatafile': '5', 'protein_group_count': 827, 'single_protein_group_count': 562, 'possible_protein_set_size': None, 'peptides_count': 8281, 'contains_control': False, 'job_log_stream_name': 'encyclopedia_group/default/3a364a127b5d4cc5acc348d5d1da3b21', 'contains_sample': True, 'is_folder': False, 'folder_id': None, 'number_sample': 1, 'total_file_size_mb': 5806, 'msdatafile_extensions': '.wiff', 'analysis_protocol_name': 'DIA - first analysis protocol', 'analysis_type': 'DIA', 'plate_id': ['adbc28a0-04ea-11eb-9581-2ddecac39483']}, {'id': 'fd62a3b0-834b-11ee-a7fd-912ae102163f', 'analysis_name': '11142023-1 - 1028-3 reanalysis 333111 reanalysis 44122 reanalysis', 'description': '', 'notes': '', 'analysis_protocol_id': '4', 'analyzed_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'start_time': '2023-11-15T00:14:49.243Z', 'end_time': '2023-11-15T00:59:22.703Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-15T00:14:49.243Z', 'status': 'SUCCEEDED', 'result_folder': 'bdf855d0-9beb-11ea-810d-31caeb439dfe/fd62a3b0-834b-11ee-a7fd-912ae102163f/', 'job_id': 'b94f7dfb-a44e-4df0-9581-fa0ce43eaaf2', 'space': None, 'project_id': 'bdf855d0-9beb-11ea-810d-31caeb439dfe', 'number_msdatafile': '6', 'protein_group_count': 775, 
'single_protein_group_count': None, 'possible_protein_set_size': None, 'peptides_count': 5514, 'contains_control': False, 'job_log_stream_name': 'proteogenomics_group/default/6efe18275eee455a911f0db0e558a598', 'contains_sample': True, 'is_folder': False, 'folder_id': None, 'number_sample': 1, 'total_file_size_mb': 3036, 'msdatafile_extensions': '.raw', 'analysis_protocol_name': 'second analysis protocol for proteogenomics', 'analysis_type': 'PROTEOGENOMICS', 'plate_id': ['0bb11350-8ca8-11ea-911b-f9d607852407']}, {'id': 'a8c51390-8334-11ee-a7fd-912ae102163f', 'analysis_name': '11142023-dia-test-0419 reanalysis', 'description': '', 'notes': '', 'analysis_protocol_id': 'cb053660-a48a-11ea-bcfb-f7528a88c755', 'analyzed_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'start_time': '2023-11-14T21:27:48.956Z', 'end_time': '2023-11-14T21:47:39.113Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2023-11-14T21:27:48.956Z', 'status': 'SUCCEEDED', 'result_folder': 'e46af6a0-7a11-11eb-882e-1719d79479fd/a8c51390-8334-11ee-a7fd-912ae102163f/', 'job_id': '81bfeb1c-bdba-4adb-9d04-0730b06638a6', 'space': None, 'project_id': 'e46af6a0-7a11-11eb-882e-1719d79479fd', 'number_msdatafile': '5', 'protein_group_count': 827, 'single_protein_group_count': 562, 'possible_protein_set_size': None, 'peptides_count': 8281, 'contains_control': False, 'job_log_stream_name': 'encyclopedia_group/default/caadabba7bfa4a42ae950506768d40c1', 'contains_sample': True, 'is_folder': False, 'folder_id': None, 'number_sample': 1, 'total_file_size_mb': 5806, 'msdatafile_extensions': '.wiff', 'analysis_protocol_name': 'DIA - first analysis protocol', 'analysis_type': 'DIA', 'plate_id': ['adbc28a0-04ea-11eb-9581-2ddecac39483']}]
You can also pass an analysis_id to fetch a single analysis.
analysis_id = "00425a60-a850-11ea-b3d7-0171d11d0807"
sample_analysis = sdk.find_analyses(analysis_id)
log(sample_analysis)
[{'id': '00425a60-a850-11ea-b3d7-0171d11d0807', 'analysis_name': 'MYPROJECT - 4 SAMPLE - NEW', 'description': 'DO NOT DELETE - FOR VISUALS', 'notes': '', 'analysis_protocol_id': 'b07d70f0-5471-11ea-90e8-6743ed53f19f', 'analyzed_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'start_time': '2020-06-06T23:46:54.360Z', 'end_time': '2020-06-07T03:23:30.614Z', 'last_modified_by': '04936dea-d255-4130-8e82-2f28938a8f9a', 'last_modified_timestamp': '2020-06-06T23:46:54.360Z', 'status': 'SUCCEEDED', 'result_folder': '2d89db50-805b-11ea-a3cb-5341c25962a0/00425a60-a850-11ea-b3d7-0171d11d0807/', 'job_id': '1c7dcdb9-21fe-47a3-9f9c-5326cc9662b2', 'space': None, 'project_id': '2d89db50-805b-11ea-a3cb-5341c25962a0', 'number_msdatafile': '20', 'protein_group_count': 3641, 'single_protein_group_count': 3184, 'possible_protein_set_size': 3878, 'peptides_count': 32274, 'contains_control': False, 'job_log_stream_name': None, 'contains_sample': True, 'is_folder': False, 'folder_id': None, 'number_sample': None, 'total_file_size_mb': None, 'msdatafile_extensions': None}]
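Because find_analyses returns a plain list of dictionaries, ordinary Python is enough to narrow the results. The sketch below defines a hypothetical helper, filter_analyses, which is not part of the SDK. It keeps analyses by status and, optionally, by analysis_type, using only keys visible in the output above. It reads keys with .get because the single-analysis output above does not include analysis_type.

```python
def filter_analyses(analyses, status="SUCCEEDED", analysis_type=None):
    """Return the analyses matching a status and, optionally, an analysis type.

    `analyses` is a list of dicts shaped like the find_analyses output above.
    This helper is illustrative and not part of the SDK.
    """
    selected = []
    for analysis in analyses:
        if analysis.get("status") != status:
            continue
        # Some records lack 'analysis_type', so use .get rather than indexing.
        if analysis_type is not None and analysis.get("analysis_type") != analysis_type:
            continue
        selected.append(analysis)
    return selected


# Trimmed-down records that reuse IDs and keys from the output shown above.
sample = [
    {"id": "c80a0260-856e-11ee-b1b7-ff20dde1461b", "status": "SUCCEEDED", "analysis_type": "DIA"},
    {"id": "fd62a3b0-834b-11ee-a7fd-912ae102163f", "status": "SUCCEEDED", "analysis_type": "PROTEOGENOMICS"},
]
dia_ids = [a["id"] for a in filter_analyses(sample, analysis_type="DIA")]
print(dia_ids)
```

With a live session, the same call would be filter_analyses(sdk.find_analyses(), analysis_type="DIA").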
Get Search Result
Given an analysis_id, this function returns the indicated search result file in the form of a DataFrame, if the input is valid and the search has succeeded.
If the search has failed for the given analysis_id, the function raises a ValueError.
The following files are available:
- protein_group_np.tsv
- protein_group_panel.tsv
- peptide_np.tsv
- peptide_panel.tsv
- report.tsv (when analyte_type='precursor' and rollup='np')
Params
analysis_id: (str) Unique ID of the analysis for which the data is to be fetched.
analyte_type: (str) Type of analyte to be fetched. Must be one of protein, peptide, or precursor.
rollup: (str) Type of rollup to be fetched. Must be one of np (for nanoparticle) or panel (for panel). Defaults to np.
- If analyte_type is precursor, then rollup must be np.
Returns
links: (dict) Contains DataFrame objects for the analysis_id, provided the search has completed.
Example
If you want the result as DataFrame objects:
analysis_id = "ddff8c40-0493-11ee-bd19-a77197cd1a6b"
analysis_data = sdk.get_search_result(analysis_id, analyte_type='protein', rollup='np')
log_df(analysis_data)
File Name Sample Name Plate ID Well Nanoparticle Protein Group Intensity (Log10) Normalized Intensity (Log10) Protein Names Gene Names Biological Process Molecular Function Cellular Component
0 MYPROJECT_X4534_A_Orbitrap-1_2ug_60min.raw Test-Sample-112A MYPROJECT B1 SP-0333 P56181-2 8.698 0.030 Isoform of P56181, Isoform 2 of NADH dehydroge... NDUFV3 NaN NaN NaN
1 MYPROJECT_X4533_A_Orbitrap-1_2ug_60min.raw Test-Sample-112A MYPROJECT B2 SP-0334 P56181-2 8.294 -0.525 Isoform of P56181, Isoform 2 of NADH dehydroge... NDUFV3 NaN NaN NaN
2 MYPROJECT_X4541_A_Orbitrap-1_2ug_60min_20190829... Test-Sample-110A MYPROJECT A4 SP-0336 P56181-2 8.780 -0.156 Isoform of P56181, Isoform 2 of NADH dehydroge... NDUFV3 NaN NaN NaN
3 MYPROJECT_X4540_A_Orbitrap-1_2ug_60min_20190829... Test-Sample-110A MYPROJECT A5 SP-0337 P56181-2 8.867 0.137 Isoform of P56181, Isoform 2 of NADH dehydroge... NDUFV3 NaN NaN NaN
4 MYPROJECT_X4544_A_Orbitrap-1_2ug_60min_20190829... Test-Sample-110A MYPROJECT A3 SP-0335 P56181-2 9.837 0.992 Isoform of P56181, Isoform 2 of NADH dehydroge... NDUFV3 NaN NaN NaN
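The analyte_type and rollup rules can be checked before making a request. The sketch below is a hypothetical helper, not part of the SDK. The text above states that precursor with np corresponds to report.tsv. The other four pairings are an assumption inferred from the file names.

```python
# File assumed to back each (analyte_type, rollup) pair. Only the
# ("precursor", "np") -> report.tsv pairing is stated in the guide; the
# others are inferred from the file names and may not match the SDK.
SEARCH_RESULT_FILES = {
    ("protein", "np"): "protein_group_np.tsv",
    ("protein", "panel"): "protein_group_panel.tsv",
    ("peptide", "np"): "peptide_np.tsv",
    ("peptide", "panel"): "peptide_panel.tsv",
    ("precursor", "np"): "report.tsv",
}


def search_result_filename(analyte_type, rollup="np"):
    """Validate the argument pair and return the matching file name."""
    try:
        return SEARCH_RESULT_FILES[(analyte_type, rollup)]
    except KeyError:
        raise ValueError(
            f"Unsupported combination: analyte_type={analyte_type!r}, rollup={rollup!r}"
        ) from None


print(search_result_filename("protein", "panel"))  # protein_group_panel.tsv
```

Raising ValueError for an unsupported pair, such as precursor with panel, mirrors the error type the guide describes for a failed search, so callers can handle both with one except clause.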
Download Search Output File
Downloads the specified search output file for the given analysis_id to the specified download_path. If no download_path is specified or the download_path is invalid, the file will be downloaded to the current working directory.
Params
analysis_id: (str) Unique ID of the analysis for which the data is to be downloaded.
filename: (str) Name of the file to be downloaded. The filename is case-insensitive and the file extension is optional.
download_path: (str) Path to download the analysis file to. Defaults to the current working directory.
Returns
file : (str) File path of the downloaded file.
Example
analysis_id = "ddff8c40-0493-11ee-bd19-a77197cd1a6b"
filename = "protein_group_np.tsv"
sdk.download_search_output_file(analysis_id, filename, download_path="testing/")
Downloading file: Protein_Group_NP.tsv
Progress: 59.0MB [00:02, 21.5MB/s]
File Protein_Group_NP.tsv downloaded successfully to testing/Protein_Group_NP.tsv
'testing/Protein_Group_NP.tsv'
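The filename parameter is case-insensitive with an optional extension, which is why protein_group_np.tsv resolved to Protein_Group_NP.tsv above. The sketch below shows one way such matching could work. It is an illustration only: match_output_file is a hypothetical name and this is not the SDK's actual implementation.

```python
from pathlib import PurePosixPath


def match_output_file(requested, available):
    """Return the entry of `available` that matches `requested`, or None.

    Matching ignores case. The extension is compared only when the
    requested name includes one. Illustrative only, not the SDK's logic.
    """
    wanted = PurePosixPath(requested.lower())
    for name in available:
        candidate = PurePosixPath(name.lower())
        if wanted.suffix:
            # An extension was given, so the full file name must match.
            if candidate.name == wanted.name:
                return name
        elif candidate.stem == wanted.name:
            # No extension given, so compare against the name without it.
            return name
    return None


available = ["protein_group_np.tsv", "protein_group_panel.tsv", "report.tsv"]
print(match_output_file("Protein_Group_NP", available))  # protein_group_np.tsv
```

A name whose stem itself contains a dot would be treated as having an extension, so this sketch suits only simple names like the ones listed in this guide.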
List Search Output Files
Returns a list of search output files for the given analysis_id.
If the analysis_id is invalid or the search has not been completed, an error is raised.
Params
analysis_id: (str) Unique ID of the analysis for which the data is to be listed.
folder: (str, optional) Root folder key to list search result files from. Defaults to None and displays search output files from the top level.
recursive: (bool, optional) Whether to list files recursively from all subfolders. Defaults to False.
Returns
files: (list[str]) A list of search output files for the given analysis_id.
Example
analysis_id = "ddff8c40-0493-11ee-bd19-a77197cd1a6b"
search_output_files = sdk.list_search_output_files(analysis_id)
print(search_output_files)
search_output_files = sdk.list_search_output_files(analysis_id, folder='rollup')
print(search_output_files)
['protein_group_np.tsv', 'protein_group_panel.tsv', 'peptide_np.tsv', 'peptide_panel.tsv', 'report.tsv', 'rollup/', 'rmbatch/']
['rollup/seer_proteingroup.intensity.parquet', 'rollup/seer_proteingroup.median_normalized_intensity.parquet', 'rollup/seer_np_peptide.engine_normalized_intensity.parquet', 'rollup/seer_np_peptide.intensity.parquet', 'rollup/seer_np_peptide.median_normalized_intensity.parquet', 'rollup/seer_peptide.mediandense_normalized_intensity.parquet', 'rollup/seer_peptide.engine_normalized_intensity.parquet', 'rollup/seer_peptide.intensity.parquet', 'rollup/seer_peptide.median_normalized_intensity.parquet']
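In the top-level listing above, folder entries end with a slash (rollup/, rmbatch/) while files do not. Assuming that convention holds in general, a small hypothetical helper can separate the two, so that each folder name can be passed back through the folder parameter.

```python
def split_listing(entries):
    """Separate a listing into (files, folders).

    Assumes folder entries end with '/', as in the output shown above.
    Folder names are returned without the trailing slash.
    """
    files = [entry for entry in entries if not entry.endswith("/")]
    folders = [entry.rstrip("/") for entry in entries if entry.endswith("/")]
    return files, folders


top_level = [
    "protein_group_np.tsv", "protein_group_panel.tsv", "peptide_np.tsv",
    "peptide_panel.tsv", "report.tsv", "rollup/", "rmbatch/",
]
files, folders = split_listing(top_level)
print(folders)  # ['rollup', 'rmbatch']
```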
Get Search Output File URL
Returns a pre-signed URL for downloading a search output file for the given analysis_id and filename.
If the analysis_id or filename is invalid, an error is raised.
Params
analysis_id: (str) Unique ID of the analysis for which the data is to be fetched.
filename: (str) Name of the file to be fetched.
Returns
url: (str) A pre-signed URL for downloading the search output file.
Example
analysis_id = "ddff8c40-0493-11ee-bd19-a77197cd1a6b"
filename = "protein_group_np.tsv"
url = sdk.get_search_output_file_url(analysis_id, filename)
print(url)
https://example-bucket.s3.region.amazonaws.com/foobarbaz%20filename%3DProtein_Group_NP.tsv...
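A pre-signed URL can be fetched with any HTTP client, with no SDK session required. The following is a minimal sketch using only the standard library. download_from_url is a hypothetical helper name, and the timeout value is an arbitrary choice.

```python
import shutil
import urllib.request
from pathlib import Path


def download_from_url(url, dest_path, timeout=60):
    """Stream the content at `url` to `dest_path` and return the path.

    Hypothetical helper, not part of the SDK. Parent folders are created
    as needed, and the body is copied in chunks rather than read at once.
    """
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return str(dest)


# Usage with a URL obtained from the SDK (requires a live session):
# download_from_url(url, "testing/Protein_Group_NP.tsv")
```

Pre-signed URLs typically expire after a limited time, so it is safer to request the URL shortly before downloading than to store it.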
Analysis Complete
Returns the status of the analysis with the given analysis_id.
Params
analysis_id: (str) Unique ID of the analysis.
Returns
res: (dict) A dictionary containing the status of the analysis.
Example
analysis_id = "ddff8c40-0493-11ee-bd19-a77197cd1a6b"
log(sdk.analysis_complete(analysis_id))
{'status': 'SUCCEEDED'}
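Since analysis_complete reports the status at the moment of the call, waiting for a running analysis means polling. The sketch below is one way to do that. wait_for_analysis is a hypothetical helper, and apart from SUCCEEDED, which appears in the output above, the status names are assumptions. FAILED in particular is a guess at the failure status and should be adjusted to whatever your instance reports.

```python
import time


def wait_for_analysis(sdk, analysis_id, poll_seconds=30, timeout_seconds=3600,
                      terminal_statuses=("SUCCEEDED", "FAILED"), sleep=time.sleep):
    """Poll sdk.analysis_complete until a terminal status or the timeout.

    'FAILED' is an assumed status name; only 'SUCCEEDED' is confirmed by
    the guide. `sleep` is injectable so the loop can be tested quickly.
    """
    waited = 0
    while True:
        status = sdk.analysis_complete(analysis_id)["status"]
        if status in terminal_statuses:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Analysis {analysis_id} still {status} after {waited}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds


# Usage (requires a live session):
# final_status = wait_for_analysis(sdk, "ddff8c40-0493-11ee-bd19-a77197cd1a6b")
```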
List MS Data Files
Lists all the MS data files in the given folder as long as the folder path passed in the params is valid.
Params
folder: (str, optional) Folder path to list the files from. Defaults to an empty string and displays all files for the user.
space: (str, optional) ID of the user group to which the files belong. Defaults to None.
Returns
(list[str]) Contains the list of files in the folder.
Example
folder_path = "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449"
log(sdk.list_ms_data_files(folder_path))
['2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/plateMap_2bbdac30-66f7-11ee-abb2-359a84c72f54.csv', '2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile1.raw', '2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile2.raw']
If the folder path is invalid, the result is an empty list.
folder_path = "some/invalid/path"
log(sdk.list_ms_data_files(folder_path))
[]
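The listing can include files that are not MS data, such as the plate map CSV in the first example. The sketch below uses a hypothetical helper, filter_by_extension, to keep only paths with the wanted extensions. The defaults .raw and .wiff are the two extensions that appear in this guide, and other formats may need to be added.

```python
from pathlib import PurePosixPath


def filter_by_extension(paths, extensions=(".raw", ".wiff")):
    """Return the paths whose extension is in `extensions`, ignoring case."""
    wanted = {ext.lower() for ext in extensions}
    return [p for p in paths if PurePosixPath(p).suffix.lower() in wanted]


listing = [
    "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/plateMap_2bbdac30-66f7-11ee-abb2-359a84c72f54.csv",
    "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile1.raw",
    "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile2.raw",
]
raw_files = filter_by_extension(listing)
print(len(raw_files))  # 2
```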
Download MS Data Files
Downloads all MS data files for paths passed in the params to the specified download_path.
If no download_path is specified or the download_path is invalid, the file will be downloaded to the current working directory.
Params
paths: (list[str]) List of paths to download.
download_path: (str, optional) Path to download the files to. Defaults to the current working directory.
space: (str, optional) ID of the user group to which the files belong. Defaults to None.
Returns
(list[str]) Contains the list of files downloaded to the specified path.
Example
download_paths = ["2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile1.raw", "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile2.raw"]
log(sdk.download_ms_data_files(paths=download_paths, download_path="testing/"))
Downloading files to "testing"
Downloading TestFile1.raw
Finished downloading TestFile1.raw
Downloading TestFile2.raw
Finished downloading TestFile2.raw
['testing/TestFile1.raw', 'testing/TestFile2.raw']
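The returned list makes it possible to confirm that every requested file arrived. In the example above, each downloaded file keeps the base name of its remote path. Assuming that always holds, the hypothetical helper below reports which requested paths have no matching download.

```python
from pathlib import PurePosixPath


def missing_downloads(requested_paths, downloaded_paths):
    """Return the requested paths whose base name is absent from the downloads.

    Assumes downloads keep the remote base name, as in the example output,
    and that paths use forward slashes.
    """
    downloaded_names = {PurePosixPath(p).name for p in downloaded_paths}
    return [p for p in requested_paths if PurePosixPath(p).name not in downloaded_names]


requested = [
    "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile1.raw",
    "2bbdac30-66f7-11ee-abb2-359a84c72f54/20231009225707449/TestFile2.raw",
]
downloaded = ["testing/TestFile1.raw", "testing/TestFile2.raw"]
print(missing_downloads(requested, downloaded))  # []
```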
Protein Results Table
Returns the protein results table for the given analysis_id or analysis_name.
Params
analysis_id: (str, optional) The analysis ID. Defaults to None.
analysis_name: (str, optional) The analysis name. Defaults to None.
grouping: (str, optional) Grouping criterion for the table result. Defaults to "condition".
as_df: (bool, optional) Whether the result should be converted to a DataFrame. Defaults to False.
Returns
- res: (list[dict] or DataFrame) A list of dictionaries or a DataFrame containing the protein results table.
Example
analysis_id = "c4089c00-16ab-11ec-b589-634014ca2005"
print(sdk.get_protein_results_table(analysis_id))
print(sdk.get_protein_results_table(analysis_id, as_df=True))
[{'uniprot_id': 'Q9HD20',
'gene_name': 'ATP13A1',
'coverage': 0.04983388704318937,
'n_samples': 20,
'median': 5.2130235974326204,
'fraction_samples': 1.0,
'protein_name': 'Endoplasmic reticulum transmembrane helix translocase (EC 7.4.2.-) (Endoplasmic reticulum P5A-ATPase)',
'biological_process': 'extraction of mislocalized protein from ER membrane [GO:0140569]; intracellular calcium ion homeostasis [GO:0006874]; monoatomic ion transmembrane transport [GO:0034220]; protein transport [GO:0015031]; transmembrane transport [GO:0055085]',
'molecular_function': 'ABC-type manganese transporter activity [GO:0015410]; ATP binding [GO:0005524]; ATP hydrolysis activity [GO:0016887]; ATPase-coupled monoatomic cation transmembrane transporter activity [GO:0019829]; membrane protein dislocase activity [GO:0140567]; metal ion binding [GO:0046872]; P-type ion transporter activity [GO:0015662]',
'cellular_component': 'endoplasmic reticulum membrane [GO:0005789]; membrane [GO:0016020]'},
{'uniprot_id': 'Q9UL25',
'gene_name': 'RAB21',
'coverage': 0.4,
'n_samples': 20,
'median': 5.455800694428291,
'fraction_samples': 1.0,
'protein_name': 'Ras-related protein Rab-21 (EC 3.6.5.2)',
'biological_process': 'anterograde axonal transport [GO:0008089]; intracellular protein transport [GO:0006886]; positive regulation of dendrite morphogenesis [GO:0050775]; positive regulation of early endosome to late endosome transport [GO:2000643]; positive regulation of receptor-mediated endocytosis [GO:0048260]; protein stabilization [GO:0050821]; Rab protein signal transduction [GO:0032482]; regulation of axon extension [GO:0030516]; regulation of exocytosis [GO:0017157]',
'molecular_function': 'GDP binding [GO:0019003]; GTP binding [GO:0005525]; GTPase activity [GO:0003924]',
'cellular_component': 'axon cytoplasm [GO:1904115]; cleavage furrow [GO:0032154]; cytoplasmic side of early endosome membrane [GO:0098559]; cytoplasmic side of plasma membrane [GO:0009898]; cytosol [GO:0005829]; early endosome [GO:0005769]; early endosome membrane [GO:0031901]; endomembrane system [GO:0012505]; endoplasmic reticulum membrane [GO:0005789]; endosome [GO:0005768]; extracellular exosome [GO:0070062]; focal adhesion [GO:0005925]; Golgi cisterna membrane [GO:0032580]; Golgi membrane [GO:0000139]; synapse [GO:0045202]; trans-Golgi network [GO:0005802]; vesicle membrane [GO:0012506]'}]
uniprot_id gene_name coverage n_samples median fraction_samples protein_name biological_process molecular_function cellular_component
0 Q9HD20 ATP13A1 0.049834 20 5.213024 1.0 Endoplasmic reticulum transmembrane helix translocase (EC 7.4.2.-) (Endoplasmic reticulum P5A-ATPase) extraction of mislocalized protein from ER membrane [GO:0140569]; intracellular calcium ion homeostasis [GO:0006874]; monoatomic ion transmembrane transport [GO:0034220]; protein transport [GO:0015031]; transmembrane transport [GO:0055085] ABC-type manganese transporter activity [GO:0015410]; ATP binding [GO:0005524]; ATP hydrolysis activity [GO:0016887]; ATPase-coupled monoatomic cation transmembrane transporter activity [GO:0019829]; membrane protein dislocase activity [GO:0140567]; metal ion binding [GO:0046872]; P-type ion transporter activity [GO:0015662] endoplasmic reticulum membrane [GO:0005789]; membrane [GO:0016020]
1 Q9UL25 RAB21 0.400000 20 5.455801 1.0 Ras-related protein Rab-21 (EC 3.6.5.2) anterograde axonal transport [GO:0008089]; intracellular protein transport [GO:0006886]; positive regulation of dendrite morphogenesis [GO:0050775]; positive regulation of early endosome to late endosome transport [GO:2000643]; positive regulation of receptor-mediated endocytosis [GO:0048260]; protein stabilization [GO:0050821]; Rab protein signal transduction [GO:0032482]; regulation of axon extension [GO:0030516]; regulation of exocytosis [GO:0017157] GDP binding [GO:0019003]; GTP binding [GO:0005525]; GTPase activity [GO:0003924] axon cytoplasm [GO:1904115]; cleavage furrow [GO:0032154]; cytoplasmic side of early endosome membrane [GO:0098559]; cytoplasmic side of plasma membrane [GO:0009898]; cytosol [GO:0005829]; early endosome [GO:0005769]; early endosome membrane [GO:0031901]; endomembrane system [GO:0012505]; endoplasmic reticulum membrane [GO:0005789]; endosome [GO:0005768]; extracellular exosome [GO:0070062]; focal adhesion [GO:0005925]; Golgi cisterna membrane [GO:0032580]; Golgi membrane [GO:0000139]; synapse [GO:0045202]; trans-Golgi network [GO:0005802]; vesicle membrane [GO:0012506]
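With as_df left at False, the result is a list of dictionaries that can be ranked and filtered without extra dependencies. The sketch below is a hypothetical helper that sorts rows by median and applies an optional fraction_samples cutoff. It relies only on keys shown in the output above.

```python
def top_proteins(rows, n=10, min_fraction_samples=0.0):
    """Return up to `n` rows with the highest 'median', best first.

    Rows whose 'fraction_samples' is below `min_fraction_samples` are dropped.
    """
    kept = [row for row in rows if row["fraction_samples"] >= min_fraction_samples]
    return sorted(kept, key=lambda row: row["median"], reverse=True)[:n]


# Trimmed-down rows using values from the output above.
rows = [
    {"uniprot_id": "Q9HD20", "gene_name": "ATP13A1", "median": 5.2130235974326204, "fraction_samples": 1.0},
    {"uniprot_id": "Q9UL25", "gene_name": "RAB21", "median": 5.455800694428291, "fraction_samples": 1.0},
]
print([row["gene_name"] for row in top_proteins(rows, n=1)])  # ['RAB21']
```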
Get Peptide Results Table
Returns the peptide results table for the given analysis_id or analysis_name.
Params
analysis_id: (str, optional) The analysis ID. Defaults to None.
analysis_name: (str, optional) The analysis name. Defaults to None.
grouping: (str, optional) Grouping criterion for the table result. Defaults to "condition".
as_df: (bool, optional) Whether the result should be converted to a DataFrame. Defaults to False.
Returns
- res: (list[dict] or DataFrame) A list of dictionaries or a DataFrame containing the peptide results table.
Example
analysis_id = "c4089c00-16ab-11ec-b589-634014ca2005"
print(sdk.get_peptide_results_table(analysis_id))
print(sdk.get_peptide_results_table(analysis_id, as_df=True))
[{'peptide': 'DTEEEDFHVDQATTVK',
'uniprot_id': 'A0A024R6I7;A0A0G2JRN3',
'gene_name': 'SERPINA1;SERPINA1',
'n_samples': 20,
'median': 6.7500568241225345,
'fraction_samples': 1.0,
'protein_name': 'deleted;Serpin family A member 1',
'biological_process': ';',
'molecular_function': ';serine-type endopeptidase inhibitor activity [GO:0004867]',
'cellular_component': ';extracellular space [GO:0005615]'},
{'peptide': 'EIVMTQSPPTLSLSPGER',
'uniprot_id': 'A0A075B6H7',
'gene_name': 'IGKV3',
'n_samples': 20,
'median': 6.257082970753563,
'fraction_samples': 1.0,
'protein_name': 'Probable non-functional immunoglobulin kappa variable 3-7',
'biological_process': 'adaptive immune response [GO:0002250]; immune response [GO:0006955]',
'molecular_function': '',
'cellular_component': 'extracellular region [GO:0005576]; immunoglobulin complex [GO:0019814]; plasma membrane [GO:0005886]'}]
peptide uniprot_id gene_name n_samples median fraction_samples protein_name biological_process molecular_function cellular_component
0 DTEEEDFHVDQATTVK A0A024R6I7;A0A0G2JRN3 SERPINA1;SERPINA1 20 6.750057 1.0 deleted;Serpin family A member 1 ; ;serine-type endopeptidase inhibitor activity [GO:0004867] ;extracellular space [GO:0005615]
1 EIVMTQSPPTLSLSPGER A0A075B6H7 IGKV3 20 6.257083 1.0 Probable non-functional immunoglobulin kappa variable 3-7 adaptive immune response [GO:0002250]; immune response [GO:0006955] ; extracellular region [GO:0005576]; immunoglobulin complex [GO:0019814]; plasma membrane [GO:0005886]
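A peptide that maps to several proteins carries semicolon-separated values in uniprot_id, gene_name, and protein_name, as in the first row above. The sketch below splits such a row into one record per protein. It assumes the values line up position by position, which holds in the example but is not documented here, and it would stop at the shortest field if the counts differed.

```python
def expand_protein_fields(row, fields=("uniprot_id", "gene_name", "protein_name")):
    """Split semicolon-separated protein fields into one dict per protein.

    Assumes the i-th value of each field describes the same protein.
    zip() stops at the shortest field if the counts differ.
    """
    columns = [row[field].split(";") for field in fields]
    return [dict(zip(fields, values)) for values in zip(*columns)]


row = {
    "peptide": "DTEEEDFHVDQATTVK",
    "uniprot_id": "A0A024R6I7;A0A0G2JRN3",
    "gene_name": "SERPINA1;SERPINA1",
    "protein_name": "deleted;Serpin family A member 1",
}
for protein in expand_protein_fields(row):
    print(protein["uniprot_id"], protein["protein_name"])
```

The GO annotation fields are left out on purpose, since their entries already use semicolons internally.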
Group Analysis Results
Returns the group analysis data for the given analysis_id and optional box_plot configuration, provided the ID is valid and the group analysis has succeeded.
Params
analysis_id: (str) The analysis ID.
box_plot: (dict) The box plot configuration needed for the analysis. Defaults to None. Contains the keys feature_type ("protein" or "peptide") and feature_ids (comma-separated list of feature IDs).
Returns
- res: (dict[str, list]) A dictionary containing the group analysis data.
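The box_plot dictionary can be built and checked before the call. The sketch below is a hypothetical helper, not part of the SDK, that validates feature_type against the two documented values. It emits feature_ids as a Python list, the form used in this section's example. Whether a comma-separated string is also accepted is not confirmed here.

```python
VALID_FEATURE_TYPES = ("protein", "peptide")


def make_box_plot_config(feature_type, feature_ids):
    """Build a box_plot dict with 'feature_type' and 'feature_ids' keys.

    `feature_ids` may be a single ID string or an iterable of IDs. It is
    always emitted as a list, which is an assumption based on the example.
    """
    if feature_type not in VALID_FEATURE_TYPES:
        raise ValueError(
            f"feature_type must be one of {VALID_FEATURE_TYPES}, got {feature_type!r}"
        )
    ids = [feature_ids] if isinstance(feature_ids, str) else list(feature_ids)
    if not ids:
        raise ValueError("feature_ids must contain at least one ID")
    return {"feature_type": feature_type, "feature_ids": ids}


print(make_box_plot_config("protein", "Q96RL7-2"))
```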
Example
When no box_plot config info is specified, the results are as follows. Notice that there is no box_plot key in the result dictionary.
group_analysis_id = "c4089c00-16ab-11ec-b589-634014ca2005"
log(sdk.group_analysis_results(group_analysis_id))
However, when valid box plot information is provided as below, the result will contain a box_plot key.
box_plot_info = {
"feature_type": "protein",
"feature_ids": ["Q96RL7-2"]
}
log(sdk.group_analysis_results(group_analysis_id, box_plot_info))
Download Analysis Protocol FASTA
Downloads the FASTA file(s) associated with an analysis protocol. You can specify an analysis_id (the function will resolve the protocol automatically) or provide an analysis_protocol_id directly.
Params
analysis_protocol_id: (str, optional) ID of the analysis protocol whose FASTA file(s) you want.
analysis_id: (str, optional) ID of the analysis whose protocol FASTA file(s) you want.
download_path: (str, optional) Directory to save files to. Defaults to the current working directory.
Note: Provide either analysis_id or analysis_protocol_id (but not both).
Returns
- files: (list[str]) List of file paths where the FASTA files were downloaded, if downloading.
Examples
Download by analysis ID (files saved to current directory):
sdk.download_analysis_protocol_fasta(
    analysis_id="c80a0260-856e-11ee-b1b7-ff20dde1461b"
)
Download by analysis protocol ID to a specific folder:
sdk.download_analysis_protocol_fasta(
    analysis_protocol_id="dc61a360-6b77-11ed-8ac3-37d35135f08e",
    download_path="/path/to/fasta"
)
['/path/to/fasta/uniprot_human_2023_08.fasta', '/path/to/fasta/contaminants.fasta']
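The rule that exactly one of analysis_id or analysis_protocol_id may be given can be enforced in calling code before any request is made. The sketch below is a hypothetical wrapper-side check. How the SDK itself reacts to both or neither being passed is not described in this guide.

```python
def resolve_fasta_target(analysis_id=None, analysis_protocol_id=None):
    """Return the single keyword argument to forward, or raise ValueError.

    Exactly one of the two IDs must be provided.
    """
    if (analysis_id is None) == (analysis_protocol_id is None):
        raise ValueError("Provide exactly one of analysis_id or analysis_protocol_id")
    if analysis_id is not None:
        return {"analysis_id": analysis_id}
    return {"analysis_protocol_id": analysis_protocol_id}


kwargs = resolve_fasta_target(analysis_id="c80a0260-856e-11ee-b1b7-ff20dde1461b")
print(kwargs)
# With a live session: sdk.download_analysis_protocol_fasta(**kwargs)
```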
Get Analysis Protocol FASTA link
Returns signed download links for the FASTA file(s) associated with an analysis protocol. You can specify an analysis_id (the function will resolve the protocol automatically) or provide an analysis_protocol_id directly.
Params
analysis_protocol_id: (str, optional) ID of the analysis protocol whose FASTA file(s) you want.
analysis_id: (str, optional) ID of the analysis whose protocol FASTA file(s) you want.
Note: Provide either analysis_id or analysis_protocol_id (but not both).
Returns
- links: (list[dict]) List of dictionaries containing the filename and signed URL for each FASTA file.
Examples
Get by analysis ID:
sdk.get_analysis_protocol_fasta_link(
    analysis_id="c80a0260-856e-11ee-b1b7-ff20dde1461b"
)
[
    {"filename": "uniprot_human_2023_08.fasta", "url": "https://...signed..."},
    {"filename": "contaminants.fasta", "url": "https://...signed..."}
]
Get by analysis protocol ID:
sdk.get_analysis_protocol_fasta_link(
    analysis_protocol_id="dc61a360-6b77-11ed-8ac3-37d35135f08e"
)
[
    {"filename": "uniprot_human_2023_08.fasta", "url": "https://...signed..."},
    {"filename": "contaminants.fasta", "url": "https://...signed..."}
]