Where things can go wrong
Solutions and mitigators
Reproducibility crisis. The wiki page gives a decent overview of the problem and surveys the major angles for addressing it.
The fundamental observation is that many studies in the literature are not reproducible by third parties.
There is some disagreement about the severity of the problem and about how to deal with it.
What can we do as quantitative life scientists (data scientists, bioinformaticians, computational biologists)?
There are examples of clear dishonesty in life science research.
However, issues of reproducibility (or the lack thereof) are more insidious.
Simple clerical mistakes
Poor or incomplete description of the result
Poor or incomplete description of method
Problems (e.g. technical or selection bias) in profiling
Inaccessible supporting data, including training sets
Improper use of training/validation dataset
Bugs in computer code
Lack of statistical power in the study
Improper or naive use of statistics (e.g. p-values); see Ioannidis, "Why Most Published Research Findings Are False"
Society and human nature: competition, time constraints, poverty, acknowledgement
Other?
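Two of the items above, low statistical power and a naive reading of p-values, interact. The following is a minimal sketch, assuming the simplest form of the argument in the Ioannidis paper cited above (a single study, no bias term): the positive predictive value (PPV) of a "significant" finding is (1 - β)R / ((1 - β)R + α), where 1 - β is the power, α is the significance threshold, and R is the pre-study odds that a tested relationship is real. The function name `ppv` and the example values of power and R are illustrative choices, not figures from any particular study.

```python
def ppv(power: float, alpha: float, prior_odds: float) -> float:
    """Probability that a 'significant' finding reflects a true relationship.

    power      -- 1 - beta, the chance of detecting a true relationship
    alpha      -- significance threshold (type I error rate)
    prior_odds -- R, the ratio of true to null relationships among those tested
    Assumes a single study and no bias (illustrative simplification).
    """
    if not (0.0 < alpha < 1.0 and 0.0 <= power <= 1.0 and prior_odds >= 0.0):
        raise ValueError("expected 0 < alpha < 1, 0 <= power <= 1, prior_odds >= 0")
    true_positives = power * prior_odds   # proportional to true findings declared significant
    false_positives = alpha               # proportional to null findings declared significant
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios: well-powered vs. underpowered studies,
# in a field where 1 in 2 vs. 1 in 11 tested hypotheses is true.
for power in (0.8, 0.5, 0.2):
    for prior_odds in (1.0, 0.1):
        value = ppv(power, alpha=0.05, prior_odds=prior_odds)
        print(f"power={power:.1f}  R={prior_odds:.1f}  PPV={value:.2f}")
```

Under these assumptions, the PPV falls as power or the pre-study odds fall, even though the threshold α = 0.05 stays the same. This is one reason a p-value below 0.05 on its own says little about whether a finding will reproduce.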
Bioinformatics’ mandates include development of ethical guidelines, standards and education.
Data science contributes methods for better expressing our results.
Computational biology continues to improve methodology and integration with solid statistical foundations.
For example,
Problem: Poor or incomplete description of the result
Problem: Poor or incomplete description of method
Mitigation: Distill (a new way of publishing?)
Mitigation: Distill for R
© M Hallett, 2020 Concordia University