Data Science Essentials Module Wraps Up a Successful First Year
This article was originally published in the 2018-2019 Annual Report.
The Vanderbilt ASPIRE Program piloted a new initiative in partnership with the Nashville Software School to enhance the readiness and confidence of biomedical trainees to pursue data science careers. Data Science Essentials combined project-based didactic exposure to data science with communication training and case sessions presented by data science professionals. Program participants developed a deeper understanding of data science career paths, increased their knowledge of data science tools and approaches, and enhanced their networking and communication skills. We anticipate that the resulting better-trained local workforce will attract partnerships with local industries and enhance career opportunities for our trainees in Nashville or elsewhere. This program was supported by a 2018 Career Guidance for Trainees award from the Burroughs Wellcome Fund.
By Iliza Butera and Mabel Seto, Graduate Students
Last August, over 40 graduate students and postdoctoral fellows at Vanderbilt answered the call for applications to a new ASPIRE module for biomedical trainees considering careers in the relatively new and booming field of data science. For the next two semesters, an initial cohort of 18 of these trainees (including us) took part in the three-section program.
The first section consisted of hands-on technical training in Python, a programming language widely used by data scientists. Nashville Software School instructor Mary van Valkenburg taught us the basics, beginning with setting up Jupyter Notebooks as a kind of virtual sandbox for testing out our newly acquired Python skills.
By importing several public health data sets and national census data, we began to learn the important step of corralling data into the right structure, a skill aptly named data wrangling. We then used many of the powerful visualization tools in Python libraries, and even machine learning, for final projects that asked questions such as ‘How well can we predict regional differences in cancer-related mortality using publicly available data sets?’
According to van Valkenburg, “The only way to really become fluent in coding for data science is to roll up your sleeves and try it.” When asked how she recommends that students continue to gain experience, she said, “There are huge amounts of online resources for open-source languages like R and Python, and there’s also a great community of data scientists here in Nashville. I always encourage my students to start side projects and join local meet-ups for collaborating and networking in the field.”
After we learned these essential technical skills, the next section of the module was led by Ashley Brady, a co-author on the Burroughs Wellcome Fund grant that supported this pilot program and part of the leadership team in the BRET Office of Career Development. In planning the module, she says, “local data scientists emphasized to us that networking, communication, and storytelling were critical and often underappreciated components of a successful career in data science.” As a result, this second section of the module focused on sharpening these infrequently taught but very important “soft skills,” such as professional communication online, in casual settings like networking events or informational interviews, and during formal job interviews and business presentations.
Finally, this spring we all piled into a chartered mini-bus and took field trips across town in a series of case sessions with local businesses. Because it let us see data science in action, this final section of the course was an excellent culmination of everything we had learned. From identifying patients with cancer to helping solve the opioid epidemic and streamlining Medicaid applications, these case sessions gave us direct insight into what a data science career actually looks like.
For instance, data scientist and Vanderbilt alum Christi French gave us a unique perspective on her path from bench science to data science, remarking that “a training module like this absolutely would have accelerated my transition to data science after I finished my Ph.D.”
Brady echoes this, noting that “finding patterns, building hypotheses, and solving problems are key components of both lab-based bench science and data science, and we want to provide our trainees with all the opportunities for a smooth transition between the two.”
Interestingly, of the seven companies we visited, six were located less than 20 miles from Vanderbilt’s campus. As the field continues to grow in Nashville and worldwide, it’s likely that we’ll be seeing many more graduates swapping out bench science for data science. For any others interested in these training opportunities, stay tuned for more info from the ASPIRE Program this fall, as plans are underway for a second cohort of future biomedical data scientists.