VUMC Center for Quantitative Sciences Summer Institute short courses for summer 2022: registration now open!
The Center for Quantitative Sciences is offering its Summer Institute this August, with early bird registration open through June 19 and regular registration open through July 25 for week 1 courses and August 1 for week 2 courses. Last year, the courses drew participants from around the world. Live participation is advised, but recordings are made available through the end of August. The courses are an opportunity to engage with top Vanderbilt University researchers and instructors.
This year, a short course on Cloud Computing and Case Studies in Biomedical Data Science is being offered for the first time. It is intended to help researchers understand how best to deploy, scale, and manage costs related to applications in the cloud – topics that are more relevant than ever in this age of big data and complex scientific challenges.
More information and registration: https://www.vumc.org/cqs/cqs-summer-institute
FEES (PER COURSE)
$150 for VU/VUMC students/trainees/postdocs
$300 for VU/VUMC faculty or staff
$500 for non-VU/VUMC faculty, staff, students, trainees, or postdocs
Registration for week 1 courses is open through July 25. Registration for week 2 courses is open through August 1. Individuals registering during the early bird period (May 2 – June 19) receive a 20% discount. Enroll in each course by clicking on its title.
Please note that academic credit is not available for these courses.
VUMC Department of Biostatistics faculty/staff: please register via dbConnect. Students/trainees, consult your advisor on how to proceed.
Big Data in Biomedical Research · August 1–5 from 9 a.m. to noon (CT)
Instructors:
Yu Shyr, PhD, professor of biostatistics, biomedical informatics, and health policy
Qi Liu, PhD, associate professor of biostatistics and biomedical informatics
This course will explore statistical, bioinformatic, and computational methods and tools for analyzing big omics data in biomedical research, including experimental design for omics research, RNA-sequencing, single-cell RNA-sequencing, and statistical and bioinformatic methods in high-dimensional data. Students will gain practical experience with RNA-seq and single-cell RNA-seq analysis, including read mapping, quantification, differential expression, cell clustering, and marker gene identification, as well as performing functional interpretation of results.
Prerequisites: some prior experience with coding (e.g., R, Python, or Linux) will help you get the most out of this short course. Participants who would like a primer on biomedical terminology are encouraged to consult the Beginner’s Guide to RNA-seq or similar resources.
Introduction to Causal Inference · August 1–5 from 1 p.m. to 4 p.m. (CT)
Instructor: Andrew Spieker, PhD, assistant professor of biostatistics
Many are familiar with the phrase “correlation does not imply causation,” but that raises the question: what exactly is causation in the first place? In this five-day short course, we will introduce the fundamentals of causal inference. The first three days will provide an overview of commonly implemented causal inference methods, including standardization, matching, inverse weighting, and instrumental variables. The fourth day will focus on methods for longitudinal data, and the fifth day will address miscellaneous topics, including sensitivity analyses and causal inference with survival outcomes. Throughout the course, emphasis will be placed on software implementation and on graphical representation of variables through directed acyclic graphs (DAGs).
Prerequisites: some familiarity with basic statistics and/or interest in designing and evaluating clinical trials.
Regression and Modeling in R · August 8–12 from 9 a.m. to noon (CT)
Instructor: Fei Ye, PhD, associate professor of biostatistics and medicine
This course will cover advanced statistical topics frequently used in biological and medical research. Emphasis will be placed on practical applications of statistical methods and interpretation of the results. During this week, you will expand your understanding of the advantages and limitations of various methods, choose appropriate analytic approaches based on type of outcome variable and data structure, develop advanced statistical models in R, perform model diagnostic analyses, and interpret R output and analysis results.
Prerequisites: Biostatistics I or equivalent course(s). Students should be familiar with the basic concepts of linear algebra and statistical modeling, types of variables (continuous, categorical, ordinal, etc.), common probability distributions (such as normal and binomial), and descriptive statistics, including summary statistics (mean, median, standard deviation, variance, etc.) and simple tests (t-test, Wilcoxon rank-sum test, chi-square test, etc.). Before the course starts, students not already familiar with R should practice importing data and simple programming with R and RStudio (both available as free downloads). An introduction to R that many previous students found helpful is available at https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
Cloud Computing and Case Studies in Biomedical Data Science · August 8–12 from 1 p.m. to 4 p.m. (CT)
Instructors:
Yaomin Xu, PhD, assistant professor of biostatistics and bioinformatics; principal investigator, Translational Bioinformatics & Biostatistics Lab
Quanhu (Tiger) Sheng, PhD, assistant professor of biostatistics
Shilin Zhao, PhD, assistant professor of biostatistics
Brian Sharber, BS, lead cloud application developer (Bick Lab)
Alex Bick, MD, PhD, assistant professor of medicine; principal investigator, Bick Lab
With the unprecedented availability of big data, biomedical data scientists increasingly face novel challenges in handling large-scale, high-volume data properly and efficiently to solve complex scientific problems.
Cloud computing is a new generation of technologies and architectures designed to deliver computing resources over the internet. It aims to make the analysis of very large volumes of widely varied data economical by rapidly provisioning a large pool of shared computational tools on demand.
In this course, we will introduce cloud computing techniques and utilities in the context of biomedical research and help participants understand the deployment, scalability, and cost-efficiency of applications in the cloud. Topics will include: real-world case studies on implementing flexible and scalable data and project management; data structures for big data; high-throughput data analyses in the cloud using R; whole genome sequencing data analysis using WDL and Cromwell; machine learning in the cloud; the UK Biobank (UKB) and the UK Biobank Research Analysis Platform (RAP); analyzing and exploring UKB data using customized pipelines in RAP with WDL and CLI; and cloud computing in All of Us and Terra.
This course is for beginner and intermediate students interested in applying cloud computing and big data systems to data science, machine learning, and data engineering. Students are expected to have beginner-level Linux, basic computing, and statistical data analysis skills in the field of biomedical data research.
QUESTIONS? Contact Jenny Jones. You may reach out to the instructors as well if you have specific course-related questions.