This class focuses on the skills bioinformaticians use to explore and extract information from large, complex datasets. These data skills give you freedom: you’ll be able to look at any bioinformatics data, in any format and of any size, and begin exploring it to extract biological meaning.
Throughout the class, I will emphasize working in a robust and reproducible manner. Reproducibility means that other researchers can repeat your work and arrive at the same results. For this to be possible, your work must be well documented, and your methods, code, and data must all be available so that others have the materials to reproduce everything. Robustness means that a workflow gives the same outcome wherever it is run: if running it on a different machine yields a different result, it is neither robust nor fully reproducible. These themes reappear throughout the class.
This class focuses primarily on handling tabular plain-text data formats. Tabular data is terrific for honing your data skills: even if your goal is to analyze other types of data in the future, it makes great example data to learn with, and the text-processing skills you develop while working with it carry over to many other data types. The computational tools and data skills taught in this class will therefore be helpful in your research.
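To give a sense of what handling tabular plain-text data can look like, here is a minimal sketch in Python. It is only an illustration, not a method prescribed by the class: the column names, the rows in `EXAMPLE_TSV`, and the helper `total_length_by_chrom` are all hypothetical and made up for this example.

```python
import csv
import io

# Hypothetical tab-delimited data, invented for illustration only:
# one feature per row, with a chromosome, a start, an end, and a name.
EXAMPLE_TSV = (
    "chrom\tstart\tend\tname\n"
    "chr1\t100\t250\tfeatureA\n"
    "chr1\t300\t900\tfeatureB\n"
    "chr2\t50\t75\tfeatureC\n"
)


def total_length_by_chrom(handle):
    """Sum (end - start) per chromosome from a tab-delimited file handle."""
    totals = {}
    # DictReader uses the first line as column names.
    for row in csv.DictReader(handle, delimiter="\t"):
        length = int(row["end"]) - int(row["start"])
        totals[row["chrom"]] = totals.get(row["chrom"], 0) + length
    return totals


# io.StringIO lets the string stand in for an open file.
totals = total_length_by_chrom(io.StringIO(EXAMPLE_TSV))
print(totals)
```

The same function would work on a real file by passing an open file handle in place of the `io.StringIO` object, which is the sense in which skills learned on small tabular examples transfer to larger data.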
Researchers from all disciplines will use Bioinformatics Data Skills to tackle problems in their own fields (e.g., biology, language, music, or the socio-economic factors contributing to the COVID-19 pandemic).