# DigitalPathology

HTAN Digital Pathology Data Model Schema for Phase 2

📥 [Download attributes as CSV](csv/digitalpathology.csv)

If submitting Digital Pathology files, here is the list of attributes you need to fill out:

## DigitalPathologyData

**Container for digital pathology imaging data**

### Core File Attributes

These attributes are inherited from CoreFileAttributes and apply to all file-based data.

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `HTAN_DATA_FILE_ID` | string, pattern: ^(?=.{1,50}$)(HTA2[0-2][0-9])_(0000\|EXT[0-9]{1,18}\|[0-9]{1,21})_(D[0-9]{1,20})$ | Yes | HTAN Data File ID (Primary Key) |
| `HTAN_PARENT_ID` | string, pattern: ^(?=.{1,50}$)(HTA2[0-2][0-9])_(0000\|EXT[0-9]{1,18}\|[0-9]{1,21})_([BD][0-9]{1,20})$ | Yes | HTAN Parent ID - Foreign Key to parent entity (B for Biospecimen, D for data file). Must have B or D suffix. Supports HTA200-229 for phase 2. |

### Base Imaging Attributes

These attributes are inherited from BaseImagingAttributes.

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `CITATION_OR_DOI` | string, pattern: ^(?:(?:https?)://)(?:\S+(?::\S*)?@)?(?:(?!(?:10\|127)(?:\.\d{1,3}){3})(?!(?:169\.254\|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]\|2\d\|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?\|1\d\d\|2[01]\d\|22[0-3])(?:\.(?:1?\d{1,2}\|2[0-4]\d\|25[0-5])){2}(?:\.(?:[1-9]\d?\|1\d\d\|2[0-4]\d\|25[0-4]))\|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$ | Yes | Raw Data Protocol or Digital Object Identifier Text; Publication and/or digital object identifier of the publication for open access studies. Must be a valid URL (http or https). |
| `DE_IDENTIFICATION_METHOD_DESCRIPTION` | string | No | Description of the process of removing potentially identifying data or data elements to render data into a form that does not identify individuals and where identification is not likely to take place. |
| `DE_IDENTIFICATION_METHOD_TYPE` | [DeIdentificationMethodType](#deidentificationmethodtype) | Yes | De-identification Method Type |
| `DE_IDENTIFICATION_SOFTWARE` | string | No | Software that was used to de-identify the images (if used) |
| `DE_IDENTIFIED` | boolean | Yes | Confirm that any HIPAA identifiers are redacted, masked, or not present in the slide label and that any dates or strings present in internal metadata do not represent PHI |
| `EXPERIMENTAL_STRATEGY_AND_DATA_SUBTYPES` | [ExperimentalStrategyAndDataSubtypes](#experimentalstrategyanddatasubtypes) | Yes | What is the experimental strategy used for the study (or what type of data subtypes exist in the study)? Per RFC, the only valid value for imaging data types is "Pathological". |
| `HAS_SLIDE_LABEL` | boolean | Yes | Does the image contain a slide label |
| `IMAGE_MODALITY` | [ImageModality](#imagemodality) | Yes | The method by which the images are generated. |
| `IMAGING_EQUIPMENT_MANUFACTURER` | string | Yes | Producer of the imaging equipment that was used to generate the digital image |
| `IMAGING_EQUIPMENT_MODEL` | string | No | The words used to describe the specific model of the instrument used to carry out an imaging experiment |
| `IMAGING_PROTOCOL` | string, pattern: ^(?:(?:https?)://)(?:\S+(?::\S*)?@)?(?:(?!(?:10\|127)(?:\.\d{1,3}){3})(?!(?:169\.254\|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]\|2\d\|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?\|1\d\d\|2[01]\d\|22[0-3])(?:\.(?:1?\d{1,2}\|2[0-4]\d\|25[0-5])){2}(?:\.(?:[1-9]\d?\|1\d\d\|2[0-4]\d\|25[0-4]))\|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$ | No | A rule which guides how an activity should be performed. Protocols.io ID or DOI link to a free/open protocol resource describing in detail the assay protocol. Must be a valid URL (http or https). |
| `IMAGING_SOFTWARE` | string | No | The name of the software package that was used to capture, generate, and process the image |
| `IMMERSION` | [ImmersionMedium](#immersionmedium) | No | Immersion medium. Each objective is designed for a specific immersion medium, which is marked on the objective. The main types of immersion media are air, oil, and water. |
| `LENS_NUMERICAL_APERTURE` | float | No | The numerical aperture of the lens. Floating point value > 0. |
| `LICENSE` | [License](#license) | Yes | Official or legal permission to do or own a specified thing. Per RFC, the only valid value is "CC BY 4.0". |
| `NOMINAL_MAGNIFICATION` | integer | Yes | The magnification of the lens as specified by the manufacturer, e.g. '60' for a 60X lens. Integer value >= 0 (no units) |
| `OBJECTIVE` | string | Yes | The manufacturer and/or model number for the optical element that gathers light from an object being observed and focuses the light rays from it to produce a real image of the object |
| `PASSED_QC` | boolean | Yes | Confirm that the image has passed internal quality control checks |
| `QC_COMMENT` | string | Yes | Comments related to quality control checks |
| `SLIDE_LABEL_REDACTED` | boolean | No | Have identifiers, including dates, been masked in the label image |
| `SPECIES` | [Species](#species) | Yes | NCBI Taxonomy ID. Per RFC, the only valid value is "9606 (Homo sapiens)". |
| `STAINING_METHOD` | [StainingMethod](#stainingmethod) | Yes | Any of the various methods that use a dye, reagent, or other material for producing coloration in tissues or microorganisms for microscopic examination |

### Module-Specific Attributes

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `ANNOTATION_TYPE` | [AnnotationType](#annotationtype) | Required IF HAS_ANNOTATIONS = true | What types of annotation are contained in the image |
| `FILENAME` | string, pattern: ^.+\.(ome\.(tif\|tiff\|tf2\|tf8\|btf)\|tiff?\|qptiff\|svs)$ | Yes | Name of the file. Must end with an extension matching the FILE_FORMAT (.ome.tif, .ome.tiff, .ome.tf2, .ome.tf8, .ome.btf for ome-tiff; .tiff or .tif for tiff; .qptiff for qptiff; .svs for svs) |
| `FILE_FORMAT` | string, pattern: ^(ome-tiff\|tiff\|qptiff\|svs)$ | Yes | Format of the imaging file. Must be compatible with Bio-Formats or OpenSlide Python. OME-TIFF files use extensions .ome.tif, .ome.tiff, .ome.tf2, .ome.tf8, or .ome.btf |
| `HAS_ANNOTATIONS` | boolean | Yes | Does the image contain annotations |

## Enums

### AnnotationType

| Value | Description |
|-------|-------------|
| `Artifact` | Artifact annotation |
| `Cell` | Cell annotation |
| `Nucleus` | Nucleus annotation |
| `ROI` | Region of Interest annotation |
| `Tissue` | Tissue annotation |

### DeIdentificationMethodType

| Value | Description |
|-------|-------------|
| `Automatic` | Automatic de-identification method |
| `Manual` | Manual de-identification method |
| `Not Applicable` | De-identification not applicable |
| `Semiautomatic` | Semi-automatic de-identification method |

### ExperimentalStrategyAndDataSubtypes

| Value | Description |
|-------|-------------|
| `Pathological` | Pathological experimental strategy and data subtype |

### ImageModality

| Value | Description |
|-------|-------------|
| `SM` | Slide Microscopy |

### ImmersionMedium

| Value | Description |
|-------|-------------|
| `Air` | Air immersion medium |
| `Glycerol` | Glycerol immersion medium |
| `Oil` | Oil immersion medium |
| `Other` | Other immersion medium |
| `Water` | Water immersion medium |

### License

| Value | Description |
|-------|-------------|
| `CC BY 4.0` | Creative Commons Attribution 4.0 International License |

### Species

| Value | Description |
|-------|-------------|
| `9606 (Homo sapiens)` | NCBI Taxonomy ID for Homo sapiens |

### StainingMethod

| Value | Description |
|-------|-------------|
| `CODEX` | CODEX staining method |
| `CyCIF` | Cyclic Immunofluorescence staining method |
| `ExSeq` | Expansion Sequencing staining method |
| `GeoMX-DSP` | GeoMX Digital Spatial Profiling staining method |
| `H&E` | Hematoxylin and Eosin staining method |
| `IHC` | Immunohistochemistry staining method |
| `IMC` | Imaging Mass Cytometry staining method |
| `MERFISH` | Multiplexed Error-Robust Fluorescence In Situ Hybridization staining method |
| `MIBI` | Multiplexed Ion Beam Imaging staining method |
| `MxIF` | Multiplexed Immunofluorescence staining method |
| `Not Applicable` | Staining not applicable |
| `SABER` | Signal Amplification By Exchange Reaction staining method |
| `mIHC` | Multiplexed Immunohistochemistry staining method |
| `t-CyCIF` | Tissue Cyclic Immunofluorescence staining method |
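
## Example: Checking a Record

The ID, filename, and file-format patterns listed in the attribute tables can be checked before submission. The following is a minimal Python sketch, not the official HTAN validator: the function name `check_record`, the `EXTENSIONS` mapping, and the dict-based record layout are assumptions made for illustration. The regular expressions are copied from the tables with the Markdown pipe escapes (`\|`) removed. It covers only the two ID patterns, the `FILENAME`/`FILE_FORMAT` pairing, and the conditional `ANNOTATION_TYPE` requirement.

```python
import re

# Patterns copied from the attribute tables (Markdown "\|" escapes removed).
DATA_FILE_ID = re.compile(
    r"^(?=.{1,50}$)(HTA2[0-2][0-9])_(0000|EXT[0-9]{1,18}|[0-9]{1,21})_(D[0-9]{1,20})$"
)
PARENT_ID = re.compile(
    r"^(?=.{1,50}$)(HTA2[0-2][0-9])_(0000|EXT[0-9]{1,18}|[0-9]{1,21})_([BD][0-9]{1,20})$"
)
FILENAME = re.compile(r"^.+\.(ome\.(tif|tiff|tf2|tf8|btf)|tiff?|qptiff|svs)$")

# Hypothetical helper mapping, built from the FILENAME description:
# which extensions are expected for each FILE_FORMAT value.
EXTENSIONS = {
    "ome-tiff": (".ome.tif", ".ome.tiff", ".ome.tf2", ".ome.tf8", ".ome.btf"),
    "tiff": (".tif", ".tiff"),
    "qptiff": (".qptiff",),
    "svs": (".svs",),
}


def check_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []

    if not DATA_FILE_ID.match(record.get("HTAN_DATA_FILE_ID", "")):
        errors.append("HTAN_DATA_FILE_ID does not match the required pattern")
    if not PARENT_ID.match(record.get("HTAN_PARENT_ID", "")):
        errors.append("HTAN_PARENT_ID does not match the required pattern")

    file_format = record.get("FILE_FORMAT", "")
    filename = record.get("FILENAME", "")
    if file_format not in EXTENSIONS:
        errors.append("FILE_FORMAT must be one of: " + ", ".join(EXTENSIONS))
    elif not FILENAME.match(filename) or not filename.endswith(EXTENSIONS[file_format]):
        errors.append("FILENAME extension does not match FILE_FORMAT")

    # ANNOTATION_TYPE is required only when HAS_ANNOTATIONS is true.
    if record.get("HAS_ANNOTATIONS") is True and not record.get("ANNOTATION_TYPE"):
        errors.append("ANNOTATION_TYPE is required when HAS_ANNOTATIONS is true")

    return errors


# Illustrative values only; these IDs are made up.
example = {
    "HTAN_DATA_FILE_ID": "HTA200_1_D1",
    "HTAN_PARENT_ID": "HTA200_1_B5",
    "FILENAME": "slide.ome.tiff",
    "FILE_FORMAT": "ome-tiff",
    "HAS_ANNOTATIONS": False,
}
print(check_record(example))  # prints []
```

Note that a name such as `slide.ome.tif` also ends in `.tif`, so this sketch would accept it for `FILE_FORMAT` `tiff` as well; whether that pairing should be rejected is not stated in the tables above.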