Tags: code, python, medical, DICOM, data-analysis

RadFind - DICOM Metadata Search Tool

A Python-based tool for searching through medical imaging DICOM files to extract and analyze metadata across large datasets.

January 18, 2025

RadFind is a Python application I developed to address a common problem in radiological research: efficiently searching through large datasets of DICOM files to extract relevant metadata for analysis.

Project Overview

DICOM files carry extensive metadata (acquisition parameters, scanner details, study descriptions) that is valuable for research, quality improvement, and clinical operations. However, extracting it consistently across thousands of studies is time-consuming without dedicated tooling.

RadFind provides:

  • Fast, indexed searching across DICOM datasets
  • Flexible query building with Boolean operators
  • Export capabilities to CSV, JSON, and SQL formats
  • Integration with common statistical packages
  • Privacy controls to support HIPAA compliance
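
To make the Boolean query feature more concrete, here is a minimal sketch of how a query tree could be compiled into a parameterized SQL WHERE clause. It is an illustration only, not RadFind's actual query engine: the `Term`, `And`, `Or`, and `Not` classes, the `to_sql` function, and the `ALLOWED_FIELDS` whitelist are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    field: str
    value: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Not:
    operand: "Query"


Query = Union[Term, And, Or, Not]

# Hypothetical whitelist: column names cannot be bound as SQL parameters,
# so they are validated against a fixed set instead.
ALLOWED_FIELDS = {"Modality", "StudyDate", "Manufacturer"}


def to_sql(query):
    """Compile a query tree into (where_clause, params) with ? placeholders."""
    if isinstance(query, Term):
        if query.field not in ALLOWED_FIELDS:
            raise ValueError(f"Unsupported field: {query.field}")
        return f"{query.field} = ?", [query.value]
    if isinstance(query, Not):
        sql, params = to_sql(query.operand)
        return f"NOT ({sql})", params
    if isinstance(query, (And, Or)):
        operator = "AND" if isinstance(query, And) else "OR"
        left_sql, left_params = to_sql(query.left)
        right_sql, right_params = to_sql(query.right)
        return f"({left_sql}) {operator} ({right_sql})", left_params + right_params
    raise TypeError(f"Not a query node: {query!r}")


# Example: CT studies that were not acquired on a "VendorA" scanner
where, params = to_sql(And(Term("Modality", "CT"), Not(Term("Manufacturer", "VendorA"))))
```

Values travel as bound parameters rather than being interpolated into the SQL string, which keeps user-supplied search terms from altering the query.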

Core Technologies

  • Python 3.10
  • pydicom for DICOM parsing
  • FastAPI for the backend API
  • SQLite for the search index
  • React for the frontend interface
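
As a rough idea of what a SQLite search index can look like, the sketch below stores one row per file and adds an index on the columns used for filtering. The table name `dicom_index`, its columns, the `find_files` helper, and the sample rows are assumptions made for illustration and may differ from RadFind's real schema.

```python
import sqlite3

# In-memory database for the example; a real index would live in a file
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dicom_index (
        file_path    TEXT PRIMARY KEY,
        modality     TEXT,
        study_date   TEXT,  -- DICOM DA format, YYYYMMDD
        manufacturer TEXT
    )
    """
)
# Composite index so modality + date-range lookups avoid a full table scan
conn.execute("CREATE INDEX idx_modality_date ON dicom_index (modality, study_date)")

# Hypothetical rows, as they might be produced by a metadata extraction pass
rows = [
    ("/data/a.dcm", "CT", "20240105", "VendorA"),
    ("/data/b.dcm", "MR", "20240110", "VendorB"),
    ("/data/c.dcm", "CT", "20240220", "VendorB"),
    ("/data/d.dcm", "CT", "20240415", "VendorA"),
]
conn.executemany("INSERT INTO dicom_index VALUES (?, ?, ?, ?)", rows)


def find_files(conn, modality, start_date, end_date):
    """Return paths of files with the given modality in an inclusive date range."""
    cursor = conn.execute(
        "SELECT file_path FROM dicom_index "
        "WHERE modality = ? AND study_date BETWEEN ? AND ? "
        "ORDER BY study_date",
        (modality, start_date, end_date),
    )
    return [row[0] for row in cursor]


matches = find_files(conn, "CT", "20240101", "20240331")
```

Because DICOM dates are fixed-width YYYYMMDD strings, plain text comparison orders them chronologically, so the date range can be answered from the index without any date parsing.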

Code Sample

import logging

import pydicom

logger = logging.getLogger(__name__)

# SENSITIVE_FIELDS and _anonymize_if_needed are assumed to be defined
# elsewhere in the module


def extract_metadata(dicom_file, fields=None):
    """
    Extract specific metadata fields from a DICOM file

    Args:
        dicom_file (str): Path to DICOM file
        fields (list): DICOM keywords to extract, or None for a default
            set of commonly useful fields

    Returns:
        dict: Dictionary of extracted metadata (empty if the file could not be read)
    """
    try:
        # Only header elements are needed, so skip the (large) pixel data
        ds = pydicom.dcmread(dicom_file, stop_before_pixels=True)

        if fields is None:
            # Extract commonly useful fields by default;
            # missing elements are returned as empty strings
            result = {
                'PatientID': _anonymize_if_needed(ds.get('PatientID', '')),
                'StudyDate': ds.get('StudyDate', ''),
                'Modality': ds.get('Modality', ''),
                'StudyDescription': ds.get('StudyDescription', ''),
                'SeriesDescription': ds.get('SeriesDescription', ''),
                'SliceThickness': ds.get('SliceThickness', ''),
                'Manufacturer': ds.get('Manufacturer', ''),
                'InstitutionName': ds.get('InstitutionName', '')
            }
        else:
            # Extract only requested fields; fields absent from the file are omitted
            result = {}
            for field in fields:
                if field in ds:
                    value = ds.get(field, '')
                    # Apply privacy rules to sensitive fields
                    if field in SENSITIVE_FIELDS:
                        value = _anonymize_if_needed(value)
                    result[field] = value

        return result
    except Exception as e:
        # Log the failure and return an empty result rather than raising
        logger.error(f"Error extracting metadata from {dicom_file}: {e}")
        return {}

Impact

RadFind is currently used by three research groups at my institution, helping to accelerate studies in areas such as:

  • Quality assessment of CT protocols
  • Radiation dose monitoring and optimization
  • Correlation of imaging parameters with diagnostic accuracy

The tool has helped reduce the time needed for data collection in these studies by approximately 70%, allowing researchers to focus more on analysis and less on data extraction.

Future Development

I'm currently working on adding:

  • Natural language processing for report text analysis
  • Integration with PACS for direct queries
  • Machine learning components for automated image quality assessment

The source code is available on GitHub under the MIT license.