A Python-based tool for searching DICOM medical imaging files and extracting and analyzing their metadata across large datasets.
January 18, 2025
RadFind is a Python application I developed to address a common problem in radiological research: efficiently searching through large datasets of DICOM files to extract relevant metadata for analysis.
Medical imaging files contain extensive metadata that can be valuable for research, quality improvement, and clinical operations. However, that metadata lives in the header of each individual file, so extracting it across thousands of studies can be challenging.
RadFind provides:
import logging

import pydicom

logger = logging.getLogger(__name__)

# SENSITIVE_FIELDS and _anonymize_if_needed are not shown in this excerpt;
# they are assumed to be defined elsewhere in the module.


def extract_metadata(dicom_file, fields=None):
    """
    Extract specific metadata fields from a DICOM file

    Args:
        dicom_file (str): Path to DICOM file
        fields (list): List of DICOM tags to extract, or None for all

    Returns:
        dict: Dictionary of extracted metadata (empty if the file cannot be read)
    """
    try:
        ds = pydicom.dcmread(dicom_file)
        if fields is None:
            # Extract commonly useful fields by default
            result = {
                'PatientID': _anonymize_if_needed(ds.get('PatientID', '')),
                'StudyDate': ds.get('StudyDate', ''),
                'Modality': ds.get('Modality', ''),
                'StudyDescription': ds.get('StudyDescription', ''),
                'SeriesDescription': ds.get('SeriesDescription', ''),
                'SliceThickness': ds.get('SliceThickness', ''),
                'Manufacturer': ds.get('Manufacturer', ''),
                'InstitutionName': ds.get('InstitutionName', '')
            }
        else:
            # Extract only requested fields that are present in the file
            result = {}
            for field in fields:
                if field in ds:
                    value = ds.get(field, '')
                    # Apply privacy rules to sensitive fields
                    if field in SENSITIVE_FIELDS:
                        value = _anonymize_if_needed(value)
                    result[field] = value
        return result
    except Exception as e:
        # Log the failure and return an empty dict so a batch run can continue
        logger.error(f"Error extracting metadata from {dicom_file}: {e}")
        return {}
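To give a sense of how a per-file function like `extract_metadata` might be applied to a whole dataset, here is a minimal sketch that walks a directory tree and collects the results into a CSV file. This is not RadFind's actual batch code: `scan_directory`, `write_csv`, and the `*.dcm` filename pattern are all assumptions made for illustration (many DICOM files have no extension, so a real scan would probably need a different filter). The extractor is passed in as a callable, so the sketch itself relies only on the standard library.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List


def scan_directory(
    root: str,
    extract: Callable[[str], Dict[str, object]],
    pattern: str = "*.dcm",  # assumed naming convention, not a DICOM rule
) -> List[Dict[str, object]]:
    """Apply `extract` to every file under `root` whose name matches `pattern`."""
    rows = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        metadata = extract(str(path))
        # An empty dict signals a file that could not be read; skip it.
        if metadata:
            rows.append({"file": str(path), **metadata})
    return rows


def write_csv(rows: List[Dict[str, object]], out_path: str) -> None:
    """Write rows to a CSV file, using the union of all keys as the header."""
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

With the function above in scope, something like `write_csv(scan_directory("studies", extract_metadata), "metadata.csv")` would produce one row per readable file; the directory and output names here are placeholders.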
RadFind is currently used by three research groups at my institution, helping to accelerate studies in areas such as:
The tool has helped reduce the time needed for data collection in these studies by approximately 70%, allowing researchers to focus more on analysis and less on data extraction.
I'm currently working on adding:
The source code is available on GitHub under the MIT license.