GENERAL SUMMARY/OVERVIEW STATEMENT: Summarize the nature and level of work performed.
The Massachusetts General Hospital (MGH) Corrigan Minehan Heart Center and Cardiovascular Research Center is seeking a highly motivated individual with expertise in Data Science and Machine Learning to work on multiple clinical and translational research projects focused on cardiovascular disease.
This Data Scientist position offers the opportunity to work directly with Senior Investigators in the Division of Cardiology at MGH and with a cross-institutional, collaborative, multidisciplinary team of clinicians, scientists, and trainees. The candidate will perform research and development of advanced analytics solutions in healthcare, lead machine-learning-based analyses for projects related to heart disease, and apply conventional and advanced approaches to analyze large, unique, high-dimensional, complex databases of human data.
The ideal candidate will have a demonstrated ability to manage relational datasets across different data types, along with experience in software development and engineering. The position will build upon concepts including feature engineering, statistics, data visualization, and optimization to lead machine-learning-based analyses of complex tabular, free-text, and imaging data.
PRINCIPAL DUTIES AND RESPONSIBILITIES: Indicate key areas of responsibility, major job duties, special projects and key objectives for this position. These items should be evaluated throughout the year and included in the written annual evaluation.
The Data Scientist will work on a team focused on machine learning, predictive analytics, and related projects, using machine learning frameworks in development, testing, and production environments to create and deploy new technologies. This role will perform research and development of advanced analytics solutions in healthcare; create machine learning algorithms that reduce computational complexity, increase model accuracy, and improve business metrics; and work with subject matter experts to uncover important insights, building on concepts in statistics, optimization, and machine learning to improve patient and provider experiences. The role is responsible for building reliable and scalable production machine learning services that will be integrated with electronic health record (EHR) systems. This includes feature engineering, statistical analysis, development of novel ML techniques, evaluation of classifier performance, and ensuring that models are fit for purpose, while working cross-functionally with diverse stakeholders, including developers, data engineers, EHR specialists, and physicians. This position may be assigned to work on special projects and other job duties, as needed.
Primary Responsibilities:
- Collaborates with business stakeholders to document business problems requiring advanced analytics; develops, evaluates, and implements machine learning models to improve business operations.
- Collaborates with the Software Engineering, Data Engineering, and clinical teams to structure and normalize data from electronic health record (EHR) systems and other data enrichment sources.
- Works on feature engineering, statistical analysis, development of novel ML techniques, evaluation of classifier performance, and ensuring that models are fit for purpose. Creates machine learning algorithms to optimize and deliver results, performs hyperparameter tuning, and evaluates model performance.
- Works with the EHR system's Cognitive Computing platform to improve predictions about patient outcomes, with a focus on configuring, operationalizing, and reporting on vendor-developed models.
- Conducts research on state-of-the-art deep learning models for EHR data and cardiac tools (ECG, imaging), and publishes novel findings.
- Visualizes and interprets data and model results.
- Uses machine learning frameworks in development, testing, and production environments.
- Participates in all parts of the machine learning project lifecycle, from dataset curation to model deployment.
- Coordinates and participates in collaborations with investigators within other Harvard departments and at other institutions across the U.S.
- Participates in the preparation and presentation of study results in manuscripts, conference abstracts, or other publication media.
- Participates in and reports on assigned project status at bi-weekly team meetings.
- Provides technical guidance to lab members on projects at the intersection of engineering and cardiology.
- Actively participates in continuing education and mentorship of other research team members.
- Proactively conducts quality control/quality assurance of programming and statistical models, automating processes whenever possible.
- Performs all other duties as needed to meet the needs of the department.
SKILLS/ABILITIES/COMPETENCIES REQUIRED: Must be realistic, objective, measurable and related to essential functions of this job.
Position Requirements:
- Bachelor’s degree in Computer Science, Data Science, or a related engineering discipline required. Master’s degree preferred.
- 1-3 years of related work experience in an engineering role required. Industry experience in software engineering and data science desirable.
- Understanding of data structures and algorithms. Experience designing and developing scalable distributed systems, with a deep understanding of architectural and design patterns, object-oriented design, and modern programming languages. Understanding of the foundations of Big Data systems and hands-on experience with distributed computational frameworks such as Apache Spark or Apache Beam and with NoSQL datastores.
- Experience working with large datasets, especially with Spark. Experience with Python and Java. Experience with cloud computing platforms such as AWS or GCP. Experience with Kubernetes.
- Knowledge of parallelism and concurrency to leverage modern hardware for serving ML models. Working knowledge of Unix/Linux systems. Ability to access, manage, transfer, integrate, and analyze complex datasets, especially using SQL. Familiarity with libraries such as pandas, TensorFlow, scikit-learn, and Keras.
- Advanced technical computer skills as required for technical support specific to functional area and related systems.
- Strong hands-on experience with deep learning tools (e.g., PyTorch or TensorFlow-Keras).
- Documented history of completed software projects (e.g., GitHub).
- Familiarity with modern ML/AI algorithms (e.g., Transformers, Diffusion Models).
- Proficiency in oral and written English communication is required.
Skills/Competencies that are highly desired:
- Knowledge and experience using basic and intermediate project management tools and techniques.
- Programming skills with fluency in at least one of Python, Java, Scala, or C/C++; ability to quickly prototype in Python highly desired. Familiarity with industry-standard software engineering practices and systems knowledge.
- Strong hands-on experience in computer vision, signal processing, and medical imaging.
- Proficiency with medical imaging libraries such as pydicom, ITK, and OpenCV, and with the HDF5 format.
- Experience and proficiency in prioritizing tasks and requesting support (when needed) while adhering to project deadlines.
- An ability to convey complex statistical concepts to a diverse team of clinical professionals with varying levels of expertise in statistics.
- Proven track record of producing quality deliverables on time, taking ownership and accountability of assignments, and demonstrating a strong work ethic.
LICENSES, CERTIFICATIONS, and/or REGISTRATIONS (if applicable): Specify minimum credentials and clearly indicate if preferred or required
EDUCATION: Specify minimum education and clearly indicate if preferred or required
- Minimum education: Bachelor’s degree in Computer Science, Data Science, or a related engineering discipline required. Master’s degree preferred.
EXPERIENCE: Specify minimum creditable years of experience and clearly indicate if preferred or required
- 1-3 years of related work experience required.
SUPERVISORY RESPONSIBILITY (if applicable): List the number of FTEs supervised.
FISCAL RESPONSIBILITY (if applicable): Indicate financial “scope” information, i.e.: size of budget, volume, revenue, etc.; Indicate total physician/non-physician FTE scope
WORKING CONDITIONS: Describe the conditions in which the work is performed.
- Duties will be performed remotely and in an office setting.
EEO Statement
Massachusetts General Hospital is an Affirmative Action Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. All qualified applicants will receive consideration for employment without regard to race, color, religious creed, national origin, sex, age, gender identity, disability, sexual orientation, military service, genetic information, and/or other status protected under law. We will ensure that all individuals with a disability are provided a reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.