New Challenges and Initiatives in Big Data Processing
As the quantity of available data increases and traditional processing approaches become cumbersome and costly, Analysis Group teams have adopted a novel approach: using the graphics processing unit (GPU) cores in graphics cards for parallel processing. Managing Principal Lisa Pinheiro notes that this can “increase processing speeds by a factor of 20 to 100 in common, time-intensive applications.” She adds, “This not only cuts time and costs, it also offers new opportunities to explore more complex analytic methods and detailed interpretation of results.”
Methodologies: Predictive Modeling and Machine Learning Algorithms
Types of Models Used: Neural Networks, Random Forests, Clustering
Because: Predicting outcomes by training an algorithmic model on subgroups of observations typically outperforms standard regression approaches when many variables affect the outcome in varying and nonlinear fashion
Examples of Applications: Predicting drug adherence based on patient and drug characteristics; predicting ultimate drug success based on characteristics of the market and sponsoring company; identifying cancerous pixels in medical imaging; predicting resource utilization or side effects based on patient characteristics and history

Methodologies: Bootstrap
Types of Models Used: GLM Models
Because: Calculating confidence intervals for nonlinear models requires simulations
Examples of Applications: Calculating confidence intervals around estimates of cost savings or resource utilization associated with a treatment

Methodologies: Cross-Validation
Types of Models Used: All Models
Because: Cross-validation is used to select models with the best predictive power, i.e., those that perform best when applied to new data
Examples of Applications: Cross-validating a patient classification system to ensure appropriate accuracy before implementation

Methodologies: Simulation
Types of Models Used: All Models
Because: Simulations are useful for rare-event modeling
Examples of Applications: Simulating the future development of rare diseases or side effects
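
To illustrate the bootstrap entry above, here is a minimal sketch of a percentile bootstrap confidence interval around a cost-savings estimate. It is not Analysis Group's implementation: for brevity it swaps the GLM for a simple nonlinear statistic (one minus the ratio of mean costs), and the function name, the parameters, and the cost figures are all hypothetical, made up for illustration.

```python
import random
import statistics


def relative_savings(treated, control):
    """Nonlinear statistic: share of mean control cost saved under treatment."""
    return 1.0 - statistics.mean(treated) / statistics.mean(control)


def bootstrap_ci(treated, control, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for relative_savings (hypothetical helper).

    Each resample draws patients with replacement from each group,
    recomputes the statistic, and the interval is read off the sorted
    resampled estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        estimates.append(relative_savings(t, c))
    estimates.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up per-patient costs, for illustration only.
treated_costs = [820, 910, 760, 1005, 880, 940, 790, 860]
control_costs = [1200, 1350, 1100, 1280, 1420, 1190, 1310, 1250]

point = relative_savings(treated_costs, control_costs)
lo, hi = bootstrap_ci(treated_costs, control_costs)
print(f"estimated savings: {point:.3f}, 95% interval: ({lo:.3f}, {hi:.3f})")
```

Each resample is independent of the others, which is why this kind of workload is a natural candidate for the parallel processing described earlier.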