David | Bioinfo
Why ‘rm -rf’ is scarier than a pipette tip, and other truths of digital biology.

Introduction: Hello, World (of Omics)
You can have the cleanest pipeline, the most parallelized code, and a server with 1TB of RAM. But if you don’t understand the biological question, you’re just moving bytes around.
Hi! I’m David. Ask me what I do, and you’ll get a different answer depending on the day.
As David the bioinformatician, my real value isn’t typing fast. It’s knowing when a result is biologically plausible and when it is computationally correct but nonsense.
Sometimes, I’m a plumber (unclogging data pipelines). Sometimes, a detective (finding a single SNP in 3 billion base pairs). And once a month, I’m a philosopher (arguing whether a p-value of 0.051 is really non-significant).
I found 10,000 variants. The lab expected 5. Did I mis-call indels? Is there a batch effect? Did someone accidentally use the mouse reference genome again? (It happened once. Once.)
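When the count looks that far off, my first move is a quick tally of what kinds of variants are actually in the file. Here is a minimal sketch, assuming an uncompressed VCF like the variants.vcf that freebayes writes. The function name count_variant_types is my own invention, and the classification is deliberately crude: it only compares REF and ALT allele lengths, so symbolic alleles and complex events are not handled properly.

```shell
# Hypothetical helper: tally SNPs vs. indels in an uncompressed VCF.
# Assumption: variant type is inferred from REF/ALT allele lengths only.
count_variant_types() {
    awk -F '\t' '
        /^#/ { next }                        # skip meta and header lines
        {
            n = split($5, alts, ",")         # ALT may list several alleles
            for (i = 1; i <= n; i++) {
                if (alts[i] == "." || alts[i] == "*") continue   # no real allele
                if (length($4) == 1 && length(alts[i]) == 1)
                    snp++                    # single-base substitution
                else if (length($4) != length(alts[i]))
                    indel++                  # length change: insertion or deletion
                else
                    other++                  # same length, multi-base (MNP or complex)
            }
        }
        END { printf "snp=%d indel=%d other=%d\n", snp, indel, other }
    ' "$1"
}
```

Run it as `count_variant_types variants.vcf`. If indels dominate a call set where you expected a handful of SNPs, that points toward alignment or calling artifacts rather than biology, though it is only a first hint and not a diagnosis.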
So to my fellow Davids: keep one foot in the terminal and one foot in the literature. Validate your outliers. And for the love of all that is holy—.

P.S. If you see me staring blankly at a scatter plot at 4 PM, I’m not stuck. I’m just visualizing principal components and questioning my career choices. 😉
    bwa mem genome.fa sample_R1.fastq sample_R2.fastq > aligned.sam
    samtools sort -@8 aligned.sam -o sorted.bam
    freebayes -f genome.fa sorted.bam > variants.vcf

Then I wait. This is when I practice patience. And refresh my email 47 times.