Ingest¶

This guide explains how data from the Covim Biosample metadata form can be ingested into the instance.

import lamindb as ln
import bionty as bt
import lnschema_covim as cv
import pandas as pd

→ connected lamindb: anonymous/testdata

Load the Excel form¶

# We only use the first row for demonstration purposes
df = pd.read_excel("../../CovimBiosampleForm.xlsx", sheet_name="Example").head(1)
df

	institute_abbreviation	name	internal_id	files	study_id	study_name	description	methodology	comments	affiliations	doi	authors	sample_id	sample_type	sample_type_comments	date_of_sample_collection	timepoint	donor_id	sex	age	data_modality	experiment_type	instrument_type	protocol_details	sequencing_platform	alignment_pipeline	reference_genome	specificity	experiment_description	group	primary_diagnosis	pcr_test_result	date_of_pcr_test	days_post_symptom_onset	days_post_icu_admission	severity	maximal_severity	severity_criteria	disease_phase	vaccination_status	date_of_vaccination	immunocompromised_class	immunocompromised_condition	immunocompromised_category	year_last_transplant	secondary_infections	comorbidities	medications	status_on_release
0	Site_A	Mock 1	1206	Bonn_Mockfile_1.fcs\nBonn_Mockfile_2.fcs\nBonn...	COVID-19-IMMUNE-02	A COVID-19 study on smoked mice.	This study examines the effect of smoking on m...	interventional study	The study was done with our new mice.	DZNE	10.1038/s41576-023-00586-w	Einstein, A.; Theis J., F.	SAMPLE-COVID-19-SMOKE-001	plasma	Has a bunch of weird tissue.	2025-01-03 00:00:00	2	SMO-420	male	2	single-cell sequencing	RNA sequencing	iSeq 100	Standard 10X	10X Chromium	BWA	GRCh38	MALAT1	Just some experiment where we put mice subject...	CON	U07.2	negative	2024-02-01	20	2.0	1	2	NIH	convalescent	1	2024-02-06	immunodeficiency_transplant	kidney transplantation	Z94.0	2014	A01.0\nA02.1	U07.1\nI51.9\nK83.9	A01AC02\nA01AC03	recovered

cv.list_files_from_biosample_form(df)

✗ Couldn't find the following 3 files:
   /home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata/Bonn_Mockfile_1.fcs
   /home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata/Bonn_Mockfile_2.fcs
   /home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata/Bonn_Mockfile_3.fcs

   → please pass the correct `basedir`
   → or modify the `files` column in biosample form!

Pass the correct basedir if relative paths are specified in the biosample form:

basedir = "../../lnschema_covim/datasets"
cv.list_files_from_biosample_form(df, basedir=basedir)

['../../lnschema_covim/datasets/Bonn_Mockfile_1.fcs',
 '../../lnschema_covim/datasets/Bonn_Mockfile_2.fcs',
 '../../lnschema_covim/datasets/Bonn_Mockfile_3.fcs']
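
The lookup above can be pictured with a small standard-library sketch. It is not the implementation of `cv.list_files_from_biosample_form`. The helper name `resolve_files` is hypothetical, and the sketch assumes that the `files` cell holds one file name per line and that each name is joined onto `basedir`:

```python
import tempfile
from pathlib import Path


def resolve_files(files_cell: str, basedir: str = ".") -> tuple[list[Path], list[Path]]:
    """Hypothetical helper: split a newline-separated `files` cell and
    sort the resulting paths into (found, missing) relative to `basedir`."""
    found, missing = [], []
    for name in files_cell.splitlines():
        name = name.strip()
        if not name:  # skip blank lines in the cell
            continue
        path = Path(basedir) / name
        (found if path.is_file() else missing).append(path)
    return found, missing


# Demo with a throwaway directory that holds only two of the three files.
files_cell = "Bonn_Mockfile_1.fcs\nBonn_Mockfile_2.fcs\nBonn_Mockfile_3.fcs"
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Bonn_Mockfile_1.fcs", "Bonn_Mockfile_2.fcs"):
        (Path(tmp) / name).touch()
    found, missing = resolve_files(files_cell, basedir=tmp)
    found_names = [p.name for p in found]
    missing_names = [p.name for p in missing]

print("found:", found_names)
print("missing:", missing_names)
```

A check like this before ingestion shows which entries in the `files` column need a different `basedir` or a corrected path.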

Perform ingestion¶

Here we show how to ingest files from a mock biosample and link them to the metadata fields.

# Track the current notebook
# Run ln.track() to generate the tracking id
ln.track()

→ created Transform('Y4f1ITkdkxo80000'), started new Run('uIdOIwaM...') at 2025-06-20 11:35:32 UTC

→ notebook imports: bionty==1.5.0 lamindb==1.6.2 lnschema_covim==0.1.3 pandas==2.3.0

• recommendation: to identify the notebook across renames, pass the uid: ln.track("Y4f1ITkdkxo8")

# Save any new values that are not in the registry yet
cv.Comorbidity(ontology_id="U07.2", name="COVID-19, virus not identified").save()

Comorbidity(uid='3587tO5t', ontology_id='U07.2', name='COVID-19, virus not identified', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:33 UTC)

# optionally: pass a custom curate_function
biosample = cv.ingest_markers(
    meta=df,
    basedir=basedir,
)
biosample

→ creating relationship records...

! calling anonymously, will miss private instances

→ source added!

! record with similar name exists! did you mean to load it?

	uid	ontology_id	name	space_id	source_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	3587tO5t	U07.2	COVID-19, virus not identified	1	None	1	2025-06-20 11:35:33.272000+00:00	1	None	1

→ creating curated files records...

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

→ creating biosample record...

Biosample(uid='38gDDpfmHy9B', name='Mock 1', internal_id='1206', institute_abbreviation='Site_A', sample_id='hidden', sample_type='plasma', sample_type_comments='Has a bunch of weird tissue.', data_modality='single-cell sequencing', experiment_type='RNA sequencing', instrument_type='iSeq 100', protocol_details='Standard 10X', sequencing_platform='10X Chromium', alignment_pipeline='BWA', reference_genome='GRCh38', detection_method='not applicable', specificity='MALAT1', experiment_description='Just some experiment where we put mice subject to smoke.', donor_uid='7LjRnBdYD0BA', donor_id='hidden', sex='male', age=2, immunocompromised_class='immunodeficiency_transplant', immunocompromised_condition='kidney transplantation', immunocompromised_category=Comorbidity(uid='5ab5fyj4', ontology_id='Z94.0', name='Kidney transplant status', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC), year_last_transplant='2014', date_of_sample_collection=2025-01-03, timepoint=2, group='CON', pcr_test_result='negative', date_of_pcr_test=2024-02-01, days_post_symptom_onset=20, days_post_icu_admission=2, severity=1, maximal_severity=2, severity_criteria='NIH', disease_phase='convalescent', vaccination_status=1, date_of_vaccination=2024-02-06, status_on_release='recovered', branch_id=1, space_id=1, created_by_id=1, run_id=1, primary_diagnosis_id=1, organism_id=1, created_at=2025-06-20 11:35:38 UTC)
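
The repeated warnings in the log report that four marker names did not validate as written but matched a registered name through a synonym. The plain-Python sketch below only illustrates what such a synonym replacement amounts to. It does not call `.standardize("marker")`, the helper `standardize_names` is hypothetical, and the mapping is copied from the warning text:

```python
# Synonym pairs copied from the warning in the log above.
SYNONYMS = {"CD14": "Cd14", "CD19": "Cd19", "CD4": "Cd4", "HLA-DR": "HLADR"}


def standardize_names(names, synonyms):
    """Hypothetical helper: replace known synonyms, leave other names untouched."""
    return [synonyms.get(name, name) for name in names]


# A few marker names as they might appear in a file's `marker` column.
markers = ["CD3", "CD14", "HLA-DR", "CD45"]
standardized = standardize_names(markers, SYNONYMS)
print(standardized)
```

Names without a synonym entry pass through unchanged, so only the four reported markers would be rewritten.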

Congratulations, you have ingested your data into the database and you are done! 🎉

To share metadata with others via the remote immunohub instance (the data itself stays local), transfer the artifacts as follows:

# make sure you restart your notebook session
!lamin load covim/immunohub

import lamindb as ln
import lnschema_covim as cv

artifact = ln.Artifact.filter(...).one()
# or loop over artifacts associated with a biosample
# for artifact in biosample.artifacts.all():
cv.transfer_artifact_to_immunohub(artifact)

Confirm data is correctly ingested¶

You can now check whether the data was ingested correctly.

To learn more about interacting with this database, see the general LaminDB guide.

Check files linked to the biosample:

biosample.artifacts.df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
2	z6nhhqqsAGyJv7DW0000	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	w1tcBwOshHfaAbxq-_5iNA	None	52780	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.027000+00:00	1	None	1
3	8FnCsUY8yhEuxYWt0000	curated/Bonn_Mockfile_2.h5ad	None	.h5ad	dataset	AnnData	9705912	mj-XlUwEYYfMokMdhtWBdQ	None	74845	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.505000+00:00	1	None	1
4	olgXgqRpQLa3hhAx0000	curated/Bonn_Mockfile_3.h5ad	None	.h5ad	dataset	AnnData	2598016	a6iFbjqFSqvTYom5ZVX3MA	None	19338	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.927000+00:00	1	None	1

Check linked metadata records:

biosample.primary_diagnosis

Comorbidity(uid='3587tO5t', ontology_id='U07.2', name='COVID-19, virus not identified', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:33 UTC)

biosample.secondary_infections.list()

[Infection(uid='4xDuGFBE', ontology_id='A01.0', name='Typhoid fever', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:34 UTC),
 Infection(uid='2IY6EfBB', ontology_id='A02.1', name='Salmonella sepsis', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:34 UTC)]

biosample.comorbidities.list()

[Comorbidity(uid='7aBlR4dA', ontology_id='U07.1', name='COVID-19', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:34 UTC),
 Comorbidity(uid='3L1EqfoP', ontology_id='I51.9', name='Heart disease, unspecified', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:35 UTC),
 Comorbidity(uid='1ergEUAW', ontology_id='K83.9', name='Disease of biliary tract, unspecified', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:35 UTC)]

biosample.medications.list()

[Medication(uid='1mDLWRT4', ontology_id='A01AC02', name='dexamethasone', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:36 UTC),
 Medication(uid='6gEjoyp8', ontology_id='A01AC03', name='hydrocortisone', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)]

Query the database¶

Query biosamples¶

biosample = cv.Biosample.filter(name="Mock 1").one()
biosample

Biosample(uid='38gDDpfmHy9B', name='Mock 1', internal_id='1206', institute_abbreviation='Site_A', sample_id='hidden', sample_type='plasma', sample_type_comments='Has a bunch of weird tissue.', data_modality='single-cell sequencing', experiment_type='RNA sequencing', instrument_type='iSeq 100', protocol_details='Standard 10X', sequencing_platform='10X Chromium', alignment_pipeline='BWA', reference_genome='GRCh38', detection_method='not applicable', specificity='MALAT1', experiment_description='Just some experiment where we put mice subject to smoke.', donor_uid='7LjRnBdYD0BA', donor_id='hidden', sex='male', age=2, immunocompromised_class='immunodeficiency_transplant', immunocompromised_condition='kidney transplantation', immunocompromised_category=Comorbidity(uid='5ab5fyj4', ontology_id='Z94.0', name='Kidney transplant status', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC), year_last_transplant='2014', date_of_sample_collection=2025-01-03, timepoint=2, group='CON', pcr_test_result='negative', date_of_pcr_test=2024-02-01, days_post_symptom_onset=20, days_post_icu_admission=2, severity=1, maximal_severity=2, severity_criteria='NIH', disease_phase='convalescent', vaccination_status=1, date_of_vaccination=2024-02-06, status_on_release='recovered', branch_id=1, space_id=1, created_by_id=1, run_id=1, primary_diagnosis_id=1, organism_id=1, created_at=2025-06-20 11:35:38 UTC)

Query data objects¶

Let’s first query for all the ingested files that are curated:

ln.Artifact.filter(key__startswith="curated/").df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
2	z6nhhqqsAGyJv7DW0000	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	w1tcBwOshHfaAbxq-_5iNA	None	52780	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.027000+00:00	1	None	1
3	8FnCsUY8yhEuxYWt0000	curated/Bonn_Mockfile_2.h5ad	None	.h5ad	dataset	AnnData	9705912	mj-XlUwEYYfMokMdhtWBdQ	None	74845	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.505000+00:00	1	None	1
4	olgXgqRpQLa3hhAx0000	curated/Bonn_Mockfile_3.h5ad	None	.h5ad	dataset	AnnData	2598016	a6iFbjqFSqvTYom5ZVX3MA	None	19338	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:38.927000+00:00	1	None	1

Query for a single file by key:

artifact = ln.Artifact.filter(key="curated/Bonn_Mockfile_1.h5ad").one()
artifact

Artifact(uid='z6nhhqqsAGyJv7DW0000', is_latest=True, key='curated/Bonn_Mockfile_1.h5ad', suffix='.h5ad', kind='dataset', otype='AnnData', size=6878592, hash='w1tcBwOshHfaAbxq-_5iNA', n_observations=52780, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=2, created_by_id=1, created_at=2025-06-20 11:35:38 UTC)

artifact.describe()

Artifact .h5ad/AnnData
├── General
│   ├── .uid = 'z6nhhqqsAGyJv7DW0000'
│   ├── .key = 'curated/Bonn_Mockfile_1.h5ad'
│   ├── .size = 6878592
│   ├── .hash = 'w1tcBwOshHfaAbxq-_5iNA'
│   ├── .n_observations = 52780
│   ├── .path = 
│   │   /home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata/.lamindb/z6nhhqqsAGyJv7DW0000.h5ad
│   ├── .created_by = anonymous
│   ├── .created_at = 2025-06-20 11:35:38
│   └── .transform = 'Ingest'
├── Dataset features
│   └── var • 1                     [Feature]                                                           
│       marker                      cat[bionty.CellMarker]     CD11c, CD16, CD1c, CD203c, CD3, CD45, CD…
└── Labels
    └── .biosamples                 covim.Biosample            Mock 1                                   
        .cell_markers               bionty.CellMarker          Cd14, CD66b, Cd19, CD1c, CD203c, CD8, CD…

Access the biosample record linked to this file:

biosample = artifact.biosamples.first()

biosample

Biosample(uid='38gDDpfmHy9B', name='Mock 1', internal_id='1206', institute_abbreviation='Site_A', sample_id='hidden', sample_type='plasma', sample_type_comments='Has a bunch of weird tissue.', data_modality='single-cell sequencing', experiment_type='RNA sequencing', instrument_type='iSeq 100', protocol_details='Standard 10X', sequencing_platform='10X Chromium', alignment_pipeline='BWA', reference_genome='GRCh38', detection_method='not applicable', specificity='MALAT1', experiment_description='Just some experiment where we put mice subject to smoke.', donor_uid='7LjRnBdYD0BA', donor_id='hidden', sex='male', age=2, immunocompromised_class='immunodeficiency_transplant', immunocompromised_condition='kidney transplantation', immunocompromised_category=Comorbidity(uid='5ab5fyj4', ontology_id='Z94.0', name='Kidney transplant status', branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC), year_last_transplant='2014', date_of_sample_collection=2025-01-03, timepoint=2, group='CON', pcr_test_result='negative', date_of_pcr_test=2024-02-01, days_post_symptom_onset=20, days_post_icu_admission=2, severity=1, maximal_severity=2, severity_criteria='NIH', disease_phase='convalescent', vaccination_status=1, date_of_vaccination=2024-02-06, status_on_release='recovered', branch_id=1, space_id=1, created_by_id=1, run_id=1, primary_diagnosis_id=1, organism_id=1, created_at=2025-06-20 11:35:38 UTC)

Read in data¶

Load it into memory as an AnnData object:

Note

load uses readfcs.read under the hood.

adata = artifact.load()
adata

AnnData object with n_obs × n_vars = 52780 × 22
    var: 'n', 'channel', 'marker', 'PnB', 'PnR', 'PnG', 'PnE'
    uns: 'meta'

Query data based on cell markers¶

cell_markers = bt.CellMarker.lookup()

ln.Artifact.filter(feature_sets__cell_markers=cell_markers.cd8).list()

[]

Update existing biosample records¶

To update an existing biosample, rerun cv.ingest_markers with update=True:

# here we have new versions of the files in another directory
# you can also modify the files column to include the new keys without passing a new basedir: Bonn_Mockfile_v2/Bonn_Mockfile_1.fcs ...
basedir = "../../lnschema_covim/datasets/Bonn_Mockfile_v2"

biosample = cv.ingest_markers(df, basedir=basedir, update=True)

→ updating relationship records...

→ returning existing Study record with same name: 'A COVID-19 study on smoked mice.'

→ returning existing Reference record with same name: 'COVIM study by Einstein, A.; Theis J., F.'

→ updating curated files records...

→ returning existing Feature record with same name: 'marker'

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

→ returning existing schema with same hash: Schema(uid='xDhIaviqoU0AW9Fq', n=-1, is_type=False, itype='Composite', otype='AnnData', dtype='num', hash='W7nL3a2jzF7VnbHONKD5CQ', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ creating new artifact version for key='curated/Bonn_Mockfile_1.h5ad' (storage: '/home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata')

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ creating new artifact version for key='curated/Bonn_Mockfile_2.h5ad' (storage: '/home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata')

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

! using default organism = human

! using default organism = human

! using default organism = human

! 4 terms not validated in feature 'marker' in slot 'var': 'CD14', 'CD19', 'CD4', 'HLA-DR'
    4 synonyms found: "CD14" → "Cd14", "CD19" → "Cd19", "CD4" → "Cd4", "HLA-DR" → "HLADR"
    → curate synonyms via: .standardize("marker")

! using default organism = human

! using default organism = human

! using default organism = human

→ creating new artifact version for key='curated/Bonn_Mockfile_3.h5ad' (storage: '/home/runner/work/lnschema-covim/lnschema-covim/docs/guide/testdata')

→ returning existing schema with same hash: Schema(uid='7Be8O9pNRZNdgzsM', n=1, is_type=False, itype='Feature', hash='LaIQq6vJLW1jLqWNdjfMIw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-20 11:35:37 UTC)

→ updating biosample record...

Files with the same filenames are assigned a new version; the earlier versions remain in the registry with is_latest set to False:

biosample.artifacts.df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
2	z6nhhqqsAGyJv7DW0000	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	w1tcBwOshHfaAbxq-_5iNA	None	52780	md5	True	False	1	1	2	None	False	1	2025-06-20 11:35:38.027000+00:00	1	None	1
3	8FnCsUY8yhEuxYWt0000	curated/Bonn_Mockfile_2.h5ad	None	.h5ad	dataset	AnnData	9705912	mj-XlUwEYYfMokMdhtWBdQ	None	74845	md5	True	False	1	1	2	None	False	1	2025-06-20 11:35:38.505000+00:00	1	None	1
4	olgXgqRpQLa3hhAx0000	curated/Bonn_Mockfile_3.h5ad	None	.h5ad	dataset	AnnData	2598016	a6iFbjqFSqvTYom5ZVX3MA	None	19338	md5	True	False	1	1	2	None	False	1	2025-06-20 11:35:38.927000+00:00	1	None	1
5	z6nhhqqsAGyJv7DW0001	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	NxpH6lNn4oGyyTtoAIWBxw	None	52780	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:40.108000+00:00	1	None	1
6	8FnCsUY8yhEuxYWt0001	curated/Bonn_Mockfile_2.h5ad	None	.h5ad	dataset	AnnData	9705912	V0OTnlrrsy7uAAn70d999A	None	74845	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:40.596000+00:00	1	None	1
7	olgXgqRpQLa3hhAx0001	curated/Bonn_Mockfile_3.h5ad	None	.h5ad	dataset	AnnData	2598016	iRtwuHPK3rQ5O0XfcxAclw	None	19338	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:41.033000+00:00	1	None	1

Query both versions:

biosample.artifacts.filter(key__endswith="Bonn_Mockfile_1.h5ad").df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
2	z6nhhqqsAGyJv7DW0000	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	w1tcBwOshHfaAbxq-_5iNA	None	52780	md5	True	False	1	1	2	None	False	1	2025-06-20 11:35:38.027000+00:00	1	None	1
5	z6nhhqqsAGyJv7DW0001	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	NxpH6lNn4oGyyTtoAIWBxw	None	52780	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:40.108000+00:00	1	None	1

Get latest version of each artifact:

biosample.artifacts.all().latest_version().df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
5	z6nhhqqsAGyJv7DW0001	curated/Bonn_Mockfile_1.h5ad	None	.h5ad	dataset	AnnData	6878592	NxpH6lNn4oGyyTtoAIWBxw	None	52780	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:40.108000+00:00	1	None	1
6	8FnCsUY8yhEuxYWt0001	curated/Bonn_Mockfile_2.h5ad	None	.h5ad	dataset	AnnData	9705912	V0OTnlrrsy7uAAn70d999A	None	74845	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:40.596000+00:00	1	None	1
7	olgXgqRpQLa3hhAx0001	curated/Bonn_Mockfile_3.h5ad	None	.h5ad	dataset	AnnData	2598016	iRtwuHPK3rQ5O0XfcxAclw	None	19338	md5	True	False	1	1	2	None	True	1	2025-06-20 11:35:41.033000+00:00	1	None	1
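
The tables above suggest two ways to recognise the newest version of a file: the `is_latest` flag, and the `uid`, where both versions share the first 16 characters and differ only in the last four (`0000` versus `0001`). The sketch below replays that selection on rows copied from the output. It does not use LaminDB, and the reading of the `uid` layout is an assumption drawn from this output alone:

```python
# (uid, key, is_latest) triples copied from the biosample.artifacts.df() output above.
rows = [
    ("z6nhhqqsAGyJv7DW0000", "curated/Bonn_Mockfile_1.h5ad", False),
    ("8FnCsUY8yhEuxYWt0000", "curated/Bonn_Mockfile_2.h5ad", False),
    ("olgXgqRpQLa3hhAx0000", "curated/Bonn_Mockfile_3.h5ad", False),
    ("z6nhhqqsAGyJv7DW0001", "curated/Bonn_Mockfile_1.h5ad", True),
    ("8FnCsUY8yhEuxYWt0001", "curated/Bonn_Mockfile_2.h5ad", True),
    ("olgXgqRpQLa3hhAx0001", "curated/Bonn_Mockfile_3.h5ad", True),
]

# Option 1: rely on the is_latest flag.
latest_by_flag = {key: uid for uid, key, is_latest in rows if is_latest}

# Option 2 (assumption): group by the first 16 uid characters and keep
# the row whose 4-character suffix compares largest.
best_per_stem = {}
for uid, key, _ in rows:
    stem, suffix = uid[:16], uid[16:]
    current = best_per_stem.get(stem)
    if current is None or suffix > current[0][16:]:
        best_per_stem[stem] = (uid, key)
latest_by_suffix = {key: uid for uid, key in best_per_stem.values()}

for key, uid in sorted(latest_by_flag.items()):
    print(key, "->", uid)
```

For these rows both options select the same three records, which are the ones returned by `latest_version()` above.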

Get the latest version of an artifact:

artifact = biosample.artifacts.filter(key__endswith="Bonn_Mockfile_1.h5ad")
artifact.latest_version()

<ArtifactBasicQuerySet [Artifact(uid='z6nhhqqsAGyJv7DW0001', is_latest=True, key='curated/Bonn_Mockfile_1.h5ad', suffix='.h5ad', kind='dataset', otype='AnnData', size=6878592, hash='NxpH6lNn4oGyyTtoAIWBxw', n_observations=52780, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=2, created_by_id=1, created_at=2025-06-20 11:35:40 UTC)]>