It is easy for wet lab scientists to keep a (real) lab notebook for their experiments. What are the recommended ways for bioinformaticians to keep track of their analyses?
Well, I describe in my lab notebook the analysis I used and what I wanted to obtain with it. When I use the result of one analysis as the basis for all the others, I also put this base result in the notebook.
My philosophy is that nothing speaks more than code, so for each project I work on, I keep a readme file where I describe basically all the commands and notes linked to that project’s analysis. Since the commands are in this readme file, when it’s time to publish and write the materials and methods, it is easy to know which software version was used and what was done for that particular analysis.
Maybe not optimal, but it works fine for me.
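For anyone who wants to automate the readme approach described above, here is a minimal sketch in Python. It is only one possible way to do it, not the poster's actual setup: the function name `run_and_log`, the default file name `README.txt`, and the entry layout are all assumptions made for illustration.

```python
import datetime
import shlex
import subprocess
from pathlib import Path


def run_and_log(command, readme="README.txt", note=""):
    """Run a command and append it to a readme file with a timestamp.

    `command` is a list such as ["mytool", "--version"] (hypothetical tool).
    The readme name and entry layout are illustrative assumptions.
    """
    result = subprocess.run(command, capture_output=True, text=True)

    # Keep only the first line of output, which is usually enough
    # to capture something like a version string.
    lines = (result.stdout or result.stderr).strip().splitlines()
    first_line = lines[0] if lines else ""

    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    entry = [f"[{stamp}] {note}".rstrip()]
    entry.append("    $ " + shlex.join(command))
    entry.append(f"    exit code: {result.returncode}")
    if first_line:
        entry.append(f"    output: {first_line}")

    # Append, never overwrite, so the file grows into a running log.
    with Path(readme).open("a", encoding="utf-8") as handle:
        handle.write("\n".join(entry) + "\n\n")
    return result
```

Calling something like `run_and_log(["mytool", "--version"], note="aligner version")` before each analysis step would leave a dated record of the exact command and the version it reported, which is the information needed later for the methods section.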
This is a good question. I would like to read what others do.
Our group uses a wiki (MediaWiki) to track our projects, pipelines, …. For code and scripts, we use git.
I would suggest a couple of things to look into…
1/ Jupyter Notebooks: If you’re primarily dealing with Python code, then this is a great option. The notebooks are accessible from your web browser and support Markdown (for nice-looking documentation). Each notebook is also connected to a Python interpreter, so the code can be run inline with the supporting docs. I actually use this (as do many others) when I’m teaching Python courses. You can see an example of what this looks like if you check this out.
You can either install this locally as part of the Anaconda distribution, or if you are working on a Compute Canada system, it’s already set up. Just have a look at the CC documentation page.
2/ If you’ve got some experience using git, then this is also a good option, especially if the project is a collaborative one (i.e. multiple code contributors). Git additionally provides versioning and merging of code and documentation. There’s a bit of a learning curve, but it’s totally worth it! There are two general options for hosting: GitHub, which offers free (public-only) repositories, or Bitbucket, which offers free private repositories but limits the number of users.
Both of these options are great for creating reproducible and well-documented code. I would encourage you to check them out and see which might work for you.
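One way to tie the git option to a notebook or readme is to record which commit of the code produced a given result. The sketch below is an illustration under stated assumptions, not a recommended standard: the helper name `current_commit` is hypothetical, and it simply calls `git rev-parse HEAD` through Python's `subprocess` module.

```python
import subprocess


def current_commit(repo_dir="."):
    """Return the commit hash checked out in repo_dir, or None.

    None is returned when git is not installed, when repo_dir is not
    inside a git repository, or when the repository has no commits yet.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The git executable is not available on this machine.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

Writing the returned hash next to each result (in a notebook cell, a readme entry, or an output file header) makes it possible to check out the exact version of the scripts that generated it.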
I strongly recommend Benchling (https://benchling.com/). You can upload files, create entries for each day, enter code, etc. Check it out!
For code and scripts, I also strongly advise using GitHub or Bitbucket, as noted above.
A combination of R Markdown and Git works well for me, as many of my analyses are in R already, and there are options for code chunks in other languages as well.