Big Data Research
Each research question we confront has specific data and analytics requirements. Our analytical experience and breadth of knowledge, refined over more than two decades, allow us to help our clients frame their research priorities and identify and access appropriate data sources. We have analytical expertise across numerous data sources, including administrative claims data, Phase III and IV clinical trial data, medical chart review data, and data from pricing and payer research.
Key components of the value we provide include:
- Knowledge of medical and economic data sources
- Access to proprietary data
- Experience managing large-scale data sets
- Insight into how best to overlay one dataset on another to yield new analytical opportunities
We have worked with and have access to a variety of US government data sources, including the Medicare 5% sample and Medicaid data from eight states. We also have access to, and have analyzed, large privately held administrative claims databases covering more than 60 million employer-covered lives. Our experience with electronic medical record and/or claims data spans many countries, including the United States, China, India, the United Arab Emirates, Israel, Korea, Taiwan, and Japan.
We have found that linking data from disparate sources is key to developing comprehensive analyses. Our teams combine information on changes over time in patient clinical status, resource utilization, cost, and quality of life. Analysis Group has also collected and drawn on data to design and implement market research studies to help clients prioritize development opportunities, inform marketing strategy, support new product launches, and assess causal relationships and "what-if" scenarios.
Analyzing and modeling health care data also poses growing practical challenges as datasets become larger and analyses more computationally intensive. Our data analysis workflow, from data extraction to final analyses, relies on our familiarity with relevant programming languages and on specialized tools we have developed to tackle time-consuming data tasks. For example, we developed our own statistical libraries in CUDA-C to use GPUs for parallel computing, allowing us to run millions of regressions in less than a minute.
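Our CUDA-C libraries themselves are proprietary, but the general technique can be sketched. The minimal example below is an illustrative assumption rather than our production code: it assigns one GPU thread to each independent regression and computes a closed-form ordinary least squares fit, so that roughly a million simple regressions run in parallel. The kernel name `batched_ols`, the data layout, and the problem sizes are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative sketch, not production code: one thread per regression.
// Each thread fits y = a + b*x by closed-form OLS over the n
// observations belonging to problem `tid`.
__global__ void batched_ols(const float* x, const float* y,
                            float* slope, float* intercept,
                            int n, int num_problems) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= num_problems) return;

    const float* xi = x + (size_t)tid * n;
    const float* yi = y + (size_t)tid * n;

    // Accumulate the sufficient statistics for simple OLS.
    float sx = 0.f, sy = 0.f, sxx = 0.f, sxy = 0.f;
    for (int i = 0; i < n; ++i) {
        sx  += xi[i];
        sy  += yi[i];
        sxx += xi[i] * xi[i];
        sxy += xi[i] * yi[i];
    }
    float denom = n * sxx - sx * sx;
    float b = (n * sxy - sx * sy) / denom;
    slope[tid] = b;
    intercept[tid] = (sy - b * sx) / n;
}

int main() {
    const int num_problems = 1 << 20;  // ~1 million independent regressions
    const int n = 32;                  // observations per regression
    size_t bytes = (size_t)num_problems * n * sizeof(float);

    float *x, *y, *slope, *intercept;
    cudaMallocManaged(&x, bytes);
    cudaMallocManaged(&y, bytes);
    cudaMallocManaged(&slope, num_problems * sizeof(float));
    cudaMallocManaged(&intercept, num_problems * sizeof(float));

    // Toy data: y = 2x + 1 for every problem, so each fit should
    // recover slope ~2 and intercept ~1.
    for (size_t p = 0; p < (size_t)num_problems; ++p)
        for (int i = 0; i < n; ++i) {
            x[p * n + i] = (float)i;
            y[p * n + i] = 2.f * i + 1.f;
        }

    int threads = 256;
    int blocks = (num_problems + threads - 1) / threads;
    batched_ols<<<blocks, threads>>>(x, y, slope, intercept, n, num_problems);
    cudaDeviceSynchronize();
    printf("problem 0: slope=%.3f intercept=%.3f\n", slope[0], intercept[0]);

    cudaFree(x); cudaFree(y); cudaFree(slope); cudaFree(intercept);
    return 0;
}
```

One fit per thread works well when each regression is small and independent, since it requires no inter-thread communication; larger design matrices would typically call for batched linear algebra routines instead.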