February 1st, 2022

The CytoReason Platform

by Roye Rozov, PhD

People often characterize startups with simple analogies such as ‘Tinder for dogs’ or ‘Airbnb for parking spots.’ At CytoReason, we aim to build the Waze for the immune system – the platform for disease models.

What exactly do we mean by a platform? Two things. The first is a set of tools on the cloud, usually with some data pre-loaded. An example of such a platform, and incidentally one we use internally, is CodeOcean. CodeOcean provides simple tools for quickly spinning up popular data science work environments (e.g., RStudio or JupyterHub) using AWS as a backend.

The second definition of a platform is more aspirational. Platforms can become de facto standard bearers, clearing houses, or arbiters of truth. This can be achieved in several ways. For example, the platform may inherit this status when it is constructed by a centralized authority (for instance, when a platform is built by a government for the public good). In other cases the status can be earned by scale and comprehensiveness, or by virtue of an assurance of quality.

A good example in the commercial domain is Google Maps. Google established Maps as the de facto standard by making the platform comprehensive, accurate, and up to date. In the public domain, the NCBI/EBI ecosystems of publicly available databases and tools come to mind. For decades, NCBI and EBI have been providing free access to massive troves of public genomic data and state-of-the-art open-source tools for analyzing it.

CytoReason’s goal is to create the world’s first platform for computational disease models, and to turn it into the industry standard.

We developed our disease models to provide a common baseline for all our analysis needs, and to anticipate those of our clients. This includes queryable access to the set of cells in the disease tissue, their abundance levels, and their gene expression differences across patients – all inferred from real clinical samples. We enrich these models further by characterizing the effects of available treatments on each of these features. Each of these analyses is carried out within each relevant dataset individually, and then in aggregate across datasets, to derive robust summaries of disease activity.

CytoReason considers these models the baseline because their purpose is to enable deeper investigation of the biology at hand. We aim to answer critical questions such as how to rank targets, which drug is most effective and why, or which molecular features distinguish drug responders from non-responders.

Generating this baseline is far from trivial. It involves choosing the right granularity to distinguish a cell type from a cell state, determining which observed differences are significant and how they vary over time, and deciding which insights can be transferred across datasets and which across different diseases. We apply state-of-the-art machine learning and computational methods, extensive benchmarking and quality control, and ongoing research and development to get it right.

Our goal is to build the most comprehensive and accurate models possible. We’re continuously updating these models and the methodology used to generate them. We’re expanding the scope of the models to incorporate new features, such as temporal dynamics or additional molecular measurements. Finally, we’re expanding the reach of these models to create a common baseline across the drug development industry. We believe it’s the only way to move past the antiquated “one investigator-one dataset” model of drug discovery, into a new, better model enriched by multiple datasets coming from multiple patients, diseases, treatments and data types.
