Teaching basic lab skills
for research computing

CAB-Alliance Bioinformatics Workshop in Franceville, Gabon

Over the past five years, researchers from the University of New Orleans (UNO) in the USA, the Université des Sciences et Techniques de Masuku (USTM) in Franceville, Gabon, and the University of Buea (UB) in Cameroon have been working closely together as part of a larger collaborative initiative to map patterns of genomic and phenotypic variation in rainforest species across central Africa. This international partnership, known as the Central African Biodiversity (CAB) Alliance, is made up of a number of institutions in the US, Africa and Europe and is primarily funded through the National Science Foundation’s Partnerships for International Research and Education program.

The project’s main goal is to identify areas across central Africa where turnover in genomic and phenotypic diversity in rainforest species is greatest, since these are the areas where we expect rainforest species to have the greatest capacity to adapt to climate change.

Working with project partners at the University of California Los Angeles and Drexel University, the group began using next-generation sequencing (NGS) to understand some of the environmental drivers of genomic variation within species and how these relationships might change under future climate projections. Once the data analysis workflow was finalized, researchers wanted to find a way for all project partner organizations to engage in the processing and analysis of the genomic data together. The natural next step was to run a bioinformatics workshop introducing NGS technology and data analysis to researchers and students at USTM as well as to scientists at the Research Institute for Tropical Ecology (IRET) and the International Centre for Medical Research in Franceville (CIRMF). The datasets used came from two of the three focal species on which the team had based their work: the African puddle frog Phrynobatrachus auritus and the soft-furred mouse Praomys misonnei.

Organisers and instructors of the Gabon workshop

The UNO team (Nicola Anthony, Katy Morgan, Courtney Miller), along with colleagues at USTM (Patrick Mickala, Stephan Ntie, Jean-Francois Mboumba) and UB (Eric Fokam, Geraud Tasse), set out to develop a week-long course that would introduce students to working in a Linux environment and help build regional capacity in bioinformatics.

In March 2017, Jason Williams introduced Nicola Anthony to the South African Carpentry community to share some lessons learned about running computing workshops in Africa. During preliminary conversations we introduced the UNO/USTM teams to principles of the Software and Data Carpentry workshops, such as live coding and sticky notes for getting help or indicating progress, along with a variety of other practical teaching techniques.

The workshop ran from 3 to 8 July 2017, and it was a great privilege that one of our South African instructors, Samar Elsheikh, was able to join the team in Franceville for the entire event. Topics covered included data organization and spreadsheets, working in the command line, data visualization in R, next-generation sequencing methods, processing restriction site-associated DNA (RAD) sequencing data, detecting loci under selection, and geospatial modeling of genomic variation. A more detailed copy of the program is available here: DOI.

Gabon workshop

There were approximately 25 participants including both instructors and students. Participants were a mixture of faculty, research scientists and graduate students from several institutions in Gabon (USTM, Centre National de Recherche Scientifique et Technologique and the Centre International de Recherche Médicale de Franceville), Cameroon (UB), South Africa (University of Cape Town) and the US (UNO). A survey was distributed to all participants after the workshop ended to get feedback on the structure and content of the workshop itself.

Most of the participants used their own laptops; however, CAB-Alliance provided ten laptops with the required software installed within VirtualBox. We also took a tour of the CyVerse platform, including genomic data processing and cloud computing in the Discovery Environment and through ATMOSPHERE. Unfortunately, USTM does not currently have the infrastructure for Internet access, so four 4G mobile Wi-Fi hotspots were provided. Connectivity was still a challenge, but students were able to access cloud computing sites and other online resources.

One or two things that worked really well

  • VirtualBox application: a virtual machine containing a Linux operating system, all of the required software, and all of the data and results files was used to streamline the workshop.
  • Feedback from our participants indicated that, in addition to learning new techniques, taking time for interpretation of data and results was very helpful.

One or two things that you would do differently

  • Reducing the amount of material or the number of sections in the workshop would have been beneficial.
  • It is always a challenge to balance presentations and explanations of concepts with hands-on practical exercises, so in future it might be better to cut any unnecessary presentations. Feedback from our participants indicated that they would have preferred less lecture-based information and more time to practice the new methods they had just learned.
  • Many participants had difficulty creating an account on CyVerse and registering for access to ATMOSPHERE. Finding a way to greatly simplify the registration process would be very beneficial.

What happens next in Gabon?

One way to move forward would be to create an online learning community in which participants can continue working together and developing their UNIX skills. The Moodle site could be used for this, but perhaps there are better ways to facilitate continued exchange between instructors and participants.

