Principal Clinical Data Science

Location: Boston, Massachusetts, USA | Job ID: R-228172 | Date posted: 06/03/2025

This is what you will do:

Alexion is looking for a highly motivated and skilled Principal Clinical Data Scientist to join our growing Data Science team. You will be central to applying leading-edge data science techniques to unlock significant insights from clinical trial data, with a dedicated focus on wearables, echocardiography imaging, real-world data, and omics data, leveraging resources such as the UK Biobank and our internal data lakes.

You will collaborate closely with clinical researchers, imaging specialists, and software engineers to drive innovation and accelerate the development of therapies for rare diseases. This is an exciting opportunity to make a significant impact on patients' lives by applying your expertise in machine learning and AI to complex and meaningful clinical data.

Key Responsibilities:

  • Design, develop, and implement machine learning and deep learning models to analyze data from wearable sensors (e.g., activity trackers, continuous glucose monitors) and echocardiography images.

  • Develop and validate algorithms for feature extraction, pattern recognition, and predictive modeling using wearable and imaging data.

  • Integrate and analyze diverse clinical datasets, including electronic health records (EHRs), genomic data, and patient-reported outcomes, alongside wearable and imaging data.

  • Collaborate with clinical teams to define research questions and identify opportunities for leveraging data science to improve clinical trial design, patient monitoring, and disease understanding in rare diseases.

  • Develop and maintain robust data pipelines for processing, cleaning, and analyzing large-scale datasets.

  • Contribute to the development of visualization tools and dashboards to effectively communicate findings to both technical and non-technical audiences.

  • Stay abreast of the latest advancements in machine learning, AI, and data science, particularly in the context of healthcare and digital biomarkers.

  • Document methodologies, results, and findings clearly and concisely.

  • Ensure compliance with relevant regulatory guidelines and data privacy standards.

Process Improvement: Actively contribute to the development of standard processes aimed at enhancing the quality, efficiency, and effectiveness of the Clinical Bioinformatics group.

You will need to have:

  • Ph.D. degree in Data Science, Biostatistics, Computer Science, Biomedical Engineering, or a related quantitative field.

  • 2-4 years of post-doctoral, hands-on experience applying machine learning and statistical modeling techniques to real-world datasets, preferably within a clinical research or healthcare setting.

  • Demonstrated experience working with data from wearable devices (e.g., accelerometers, heart rate monitors, sleep trackers) and a strong understanding of signal processing techniques.

  • Experience in analyzing medical imaging data, particularly echocardiography, including image processing, feature extraction, and the application of computer vision techniques.

  • Proficiency in programming languages such as Python and R, along with relevant libraries for data manipulation, statistical analysis, and machine learning (e.g., pandas, NumPy, scikit-learn, TensorFlow, PyTorch).

  • Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and big data technologies (e.g., Spark, Hadoop) is a plus.

  • Strong understanding of statistical inference, hypothesis testing, and model evaluation metrics.

  • Excellent problem-solving, critical thinking, and analytical skills.

  • Strong communication and collaboration skills, with the ability to effectively present complex technical information to diverse audiences.

  • A strong interest in rare diseases and a desire to contribute to the development of innovative therapies for patients with unmet medical needs.

Technical Skills & Familiarity:

The ideal candidate will be familiar with a combination of the following techniques and concepts:

Machine Learning & AI:

  • Supervised Learning: Regression (linear, polynomial, etc.), Classification (logistic regression, support vector machines, decision trees, random forests, gradient boosting machines like XGBoost, LightGBM, CatBoost).

  • Unsupervised Learning: Clustering (k-means, hierarchical clustering, DBSCAN), dimensionality reduction (PCA, t-SNE, UMAP), anomaly detection.

  • Deep Learning: Convolutional Neural Networks (CNNs) for image analysis, Recurrent Neural Networks (RNNs) and Transformers for time-series data (wearable data), autoencoders.

  • Model Evaluation & Selection: Cross-validation, hyperparameter tuning, performance metrics (e.g., AUC, F1-score, RMSE), bias-variance trade-off.

  • Explainable AI (XAI): Techniques for understanding and interpreting machine learning model predictions (e.g., SHAP, LIME).

  • Time Series Analysis: Feature engineering, forecasting models (e.g., ARIMA, Prophet), dynamic time warping.

Wearable Data Analysis:

  • Signal Processing: Filtering, noise reduction, feature extraction from raw sensor data (e.g., frequency domain analysis).

  • Activity Recognition: Developing models to classify different types of physical activity.

  • Sleep Analysis: Algorithms for sleep stage classification and sleep quality assessment.

  • Event Detection: Identifying specific events or patterns in wearable sensor data.

Echocardiography Image Analysis:

  • Image Preprocessing: Noise reduction, normalization, augmentation techniques.

  • Image Segmentation: Techniques for identifying and delineating cardiac structures.

  • Object Detection: Identifying key landmarks or regions of interest in echocardiograms.

  • Quantitative Image Analysis: Extracting clinically relevant measurements from images.

General Data Science & Programming:

  • Python: Pandas, NumPy, SciPy, scikit-learn, TensorFlow, Keras, PyTorch, Matplotlib, Seaborn.

  • R: Base R, tidyverse, caret.

  • SQL: For querying and manipulating databases.

  • Data Visualization: Creating informative and visually appealing plots and dashboards.

  • Data Wrangling & Cleaning: Handling missing data, outliers, and data inconsistencies.

Statistical Concepts:

  • Hypothesis testing, statistical power, confidence intervals.

  • Regression analysis, survival analysis.

  • Longitudinal data analysis.

AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorization and employment eligibility verification requirements.
