Data Management

How Lego can improve your research, or what everyone should understand about data documentation, metadata and making research reproducible.

The workshop will provide insights into the importance of data documentation. You will learn more about metadata and how best to describe scientific data in terms of type, topic and content. You will have the opportunity to discuss aspects of data management such as searching for and reusing data, viable file formats, documentation of code and the FAIR data sharing principles (findable, accessible, interoperable and reusable).
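
As a rough illustration of what describing a dataset with metadata can look like, the sketch below builds a small machine-readable record and checks it for missing fields. The field names, the required-field list and all values are assumptions made up for this example; they are not taken from the workshop material or from any particular metadata standard. Loosely, the identifier and keywords help make data findable, the access URL accessible, the file format interoperable, and the license reusable.

```python
import json

# Hypothetical minimal field set; real metadata standards define their own.
REQUIRED_FIELDS = ("identifier", "title", "description", "keywords",
                   "file_format", "license", "access_url")


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record; every value below is a placeholder.
record = {
    "identifier": "doi:10.0000/placeholder",
    "title": "Example measurement series",
    "description": "What was measured, how, and in which units.",
    "keywords": ["example", "time series"],
    "file_format": "text/csv",
    "license": "CC-BY-4.0",
    "access_url": "https://example.org/datasets/placeholder",
}

print(missing_fields(record))        # empty list: nothing is missing
print(json.dumps(record, indent=2))  # machine-readable form to store next to the data
```

A record like this is typically saved as a small file alongside the data it describes, so that both people and search tools can read it.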

Introduction to Nextflow

Nextflow is an open-source workflow manager that helps to analyze large datasets in a portable and reproducible manner. It is an easy way to parallelize and connect commonly available bioinformatics tools to build reusable pipelines. This workshop is for those who are completely new to Nextflow and follows the “Hello Nextflow” training designed by Seqera. Basic familiarity with the command line and common file formats is assumed.

Objectives

In this workshop, you will learn foundational concepts for building pipelines using Nextflow. By the end of this workshop you will be able to:

Describe and utilize core Nextflow components sufficient to build a simple multi-step workflow
Describe concepts such as operators and channel factories
Launch a Nextflow workflow locally
Find and interpret outputs (results) and log files generated by Nextflow
Troubleshoot basic issues

Prerequisites

A GitHub account
Experience with command line

Improve your skills in how to Visualize your Science

This workshop will be given by Andreas Dahlin, founder of Visualize Your Science.

"We all do excellent research that deserves to be presented in the best possible way. We are all good at writing and discussing our findings, but what about making diagrams, illustrations, and result charts? Using clear, crisp, and pedagogic images is one of the most powerful methods of explaining your research."

In this lecture, Andreas will share his knowledge of making scientific drawings and answer the following questions: 1. How do I draw? 2. What should I draw? 3. How can I make my images scientifically credible? 4. How do I ensure my images communicate what I want them to communicate?

Packaging and sharing data science applications as Docker container images

This workshop is aimed at researchers who want to share data science applications, such as apps built with R Shiny, Plotly Dash, Gradio or Streamlit, publicly or with colleagues. Docker container images, widely used in both industry and academia, are a powerful tool for packaging and sharing applications and analyses. During the workshop we will first present the basics of Docker and how to build Docker container images. Building on this knowledge, participants will then carry out hands-on exercises on their own laptops while we are available to help. Finally, we will demonstrate how to publish applications packaged as Docker images on SciLifeLab Serve (https://serve.scilifelab.se) and make them available on the web with a URL (for example, to be used in research papers or other output). SciLifeLab Serve is a service available free of charge to all life science researchers in Sweden. Participants are welcome to bring their own data science applications, and we can help package them on the spot.
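
As a rough illustration of the kind of small web application that could be packaged as a Docker container image, the sketch below uses only the Python standard library. It is a stand-in for a real R Shiny, Dash, Gradio or Streamlit app, not workshop material: the function names, the demo values and the port number are all assumptions chosen for this example.

```python
import json
from wsgiref.simple_server import make_server

# Placeholder data; a real app would load or compute its own results.
DEMO_VALUES = [1.0, 2.0, 3.0, 6.0]


def summarize(values):
    """Return count, mean, minimum and maximum of a list of numbers."""
    return {
        "n": len(values),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


def app(environ, start_response):
    """WSGI application that answers every request with a JSON summary."""
    body = json.dumps(summarize(DEMO_VALUES)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="0.0.0.0", port=8000):
    """Serve the app until interrupted; the port is an arbitrary choice."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

Calling serve() starts the app. Binding to 0.0.0.0 rather than localhost is what allows the app to be reached from outside a container once the port is published.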