ClosedLoop – Healthcare’s Data Science Platform

The platform is purpose-built to make healthcare-focused data scientists more effective at their jobs. By providing off-the-shelf models for many common healthcare use cases and automating many of the manual processes involved in traditional data science tasks, ClosedLoop allows data scientists to focus on the impactful problems that drive real value for organizations. Don’t have a data science team? Talk to a ClosedLoop specialist today to hear how our team of AI experts can augment your existing analytics team to drive new insights for your organization.
“Data scientists spend 80% of their time doing what they least like to do: collecting existing datasets and organizing data. That leaves less than 20 percent of their time for creative tasks like mining data for patterns that lead to new research discoveries.” (NIH Strategic Plan for Data Science)

Healthcare-Specific Features


Healthcare data is notoriously “messy.” ClosedLoop makes it simple to import raw healthcare data sets, such as medical claims, prescriptions, EMR, and custom data, without the need for tedious data normalization and cleansing. Data handling capabilities include:

  • HIPAA-compliant storage and data access
  • Support for fixed snapshots and streaming data
  • Automated data dictionary creation
  • Auto-detection of data types
  • Auto-clean support for common healthcare elements including diagnosis, procedure, and drug codes
  • Support for all major coding systems (ICD 9/10, CPT, HCPCS, NDC, NPI, and SNOMED)
  • Auto-generated summary statistics, e.g., per-member-per-month cost and age and gender summaries
  • Automated quality checks for imported data
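
As an illustration of one summary statistic named above, per-member-per-month (PMPM) cost is total paid claims divided by total member months. The sketch below is a minimal, standalone calculation and not ClosedLoop's implementation; the function name `pmpm_cost`, the tuple layouts, and the toy data are all assumptions made for the example.

```python
def pmpm_cost(claims, enrollment):
    """Per-member-per-month cost: total paid amount / total member months.

    claims:     iterable of (member_id, paid_amount)     -- hypothetical layout
    enrollment: iterable of (member_id, months_enrolled) -- hypothetical layout
    """
    total_paid = sum(amount for _member_id, amount in claims)
    member_months = sum(months for _member_id, months in enrollment)
    if member_months == 0:
        raise ValueError("no member months in enrollment data")
    return total_paid / member_months


# Toy data for illustration only.
claims = [("A", 1200.0), ("A", 300.0), ("B", 0.0), ("C", 900.0)]
enrollment = [("A", 12), ("B", 6), ("C", 12)]

pmpm = pmpm_cost(claims, enrollment)  # 2400.0 paid over 30 member months
print(f"PMPM cost: {pmpm:.2f}")
```

Members with no claims still contribute member months to the denominator, which is why enrollment is passed separately from claims.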


After data cleansing, feature engineering is one of the most expensive and time-consuming aspects of data science. ClosedLoop helps healthcare data scientists build models and features smarter and faster, freeing them to focus their time on discovering new insights. Automated feature engineering capabilities include:

  • Over 800 prebuilt healthcare-specific features
  • Automatic mappings to licensed ontologies (GPI, RxNorm, CCS, BETOS, UMLS, and FHIR)
  • Support for complex combinations of events, e.g. initiation of metformin within 60 days of an initial diabetes diagnosis
  • Built-in support for social factors including USDA Food Environment Atlas, CDC Behavioral Risk Factors, Area Deprivation Index, and County Health Rankings
  • Custom feature generators for incorporating novel and proprietary data sources
  • Fully automated model training and evaluation process
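
To make the "complex combinations of events" example concrete, the sketch below checks whether metformin was initiated within 60 days of an initial diabetes diagnosis. It is plain Python and does not use ClosedLoop's feature language; the function name `initiated_within`, the event codes `DX_DIABETES` and `RX_METFORMIN`, and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def initiated_within(events, index_code, followup_code, window_days=60):
    """Return True if the first `followup_code` event occurs on or after the
    first `index_code` event and no more than `window_days` days later.

    events: iterable of (code, event_date) pairs for one patient.
    """
    index_dates = [d for code, d in events if code == index_code]
    followup_dates = [d for code, d in events if code == followup_code]
    if not index_dates or not followup_dates:
        return False
    first_index = min(index_dates)        # initial diagnosis
    first_followup = min(followup_dates)  # initiation of therapy
    window_end = first_index + timedelta(days=window_days)
    return first_index <= first_followup <= window_end


# Hypothetical patient history for illustration only.
patient_events = [
    ("DX_DIABETES", date(2020, 1, 10)),
    ("RX_METFORMIN", date(2020, 2, 20)),  # 41 days after diagnosis
    ("RX_METFORMIN", date(2020, 5, 1)),   # refill, ignored for initiation
]

feature_value = initiated_within(patient_events, "DX_DIABETES", "RX_METFORMIN")
print(feature_value)
```

Using the earliest date for each code is what distinguishes an initial diagnosis and an initiation of therapy from later follow-up visits and refills.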


Artificial intelligence, machine learning, and predictive analytics are just buzzwords if they don’t change behaviors and workflows. To gain trust and drive adoption, models must be highly accurate and always improving. ClosedLoop provides data scientists with the tools they need to build highly accurate models and to continuously improve those models as new data and insights are surfaced. The following are just a few of ClosedLoop’s capabilities that directly drive accuracy in healthcare predictive models:

  • Baseline model creation using automated features for any prediction in less than 24 hours
  • Custom population and outcome definition to precisely tailor models
  • Natural language processing to extract SNOMED terms from free-text notes
  • Advanced machine learning algorithms utilizing neural network and tree-based ensemble methods
  • Automated model tuning utilizing hyperparameter optimization and cross-validation
  • Support for training on the local population, which increases accuracy compared with pre-trained models
  • Cross-training support leveraging licensed external data
  • Model versioning to enable testing of new features and accuracy comparison
  • Automated accuracy reporting including ROC and precision/recall curves, model calibration plots, and train and test set performance
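
For readers less familiar with the accuracy metrics listed above, the sketch below computes the area under the ROC curve and a single precision/recall point from scratch. It is an illustrative, dependency-free calculation under assumed inputs (binary labels and model scores), not the platform's reporting code; the function names and sample values are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels and scores for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
precision, recall = precision_recall(labels, scores, threshold=0.5)
print(auc, precision, recall)
```

Sweeping the threshold over all observed scores and collecting the resulting points yields the full precision/recall curve; the pairwise AUC shown here is quadratic in the number of cases, so it suits small examples rather than production reporting.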


Healthcare practitioners demand that predictive models be not only accurate but also explainable. ClosedLoop unpacks the “black box” of artificial intelligence, allowing data scientists and clinicians to understand why and how factors impact a model’s prediction, driving faster adoption and better clinical results. Capabilities include:

  • Auto-computed top factors show which variables matter most across an entire population
  • Weighted positive and negative factors for individual patients inform clinical and operational workflows
  • Prediction trends over time show changes in risk for any given outcome as new data is received
  • Factor visualizations help users understand the important factors underlying any prediction


Teamwork and collaboration lead to better outcomes. Yet, in traditional data science roles, data and code are often siloed to individual contributors. To create a community of excellence and foster innovation, data science should be a team sport. ClosedLoop allows data scientists and clinicians to create and iterate on predictive models together. Collaboration capabilities include:

  • Easy-to-use feature language understandable by data scientists and clinicians
  • Feature and model catalogs support reuse of best practices
  • Version-controlled repositories with data provenance tie predictions to the exact data and model that created them
  • Easily understood visualizations for model accuracy, feature importance, and prediction results
  • Straightforward model comparison supports iteration and testing of new features
  • Python and REST APIs for easy integration with Jupyter notebooks and other data science tools


Once a model is trained and reviewed by stakeholders, data scientists often have the challenging job of figuring out how to make the model work in production. With ClosedLoop’s end-to-end solutions, it is easy to operationalize a model and automatically update predictions as new data arrives. Deployment capabilities include:

  • Pushbutton deployment puts new models into production with a single click
  • Automatic update of predictions as new data streams in
  • Built-in error handling for schema changes and data anomalies
  • Automated quality checks on new predictions
  • REST API or push notifications to retrieve predictions
  • Model performance monitoring over time

Build a Predictive Model in 24 Hours