Use a simple AWK command to locate the places in a list where the value of a data item suddenly changes.
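As a rough illustration of the idea (not necessarily the command from the post), the sketch below compares each line with the one before it and reports the line number whenever the value changes. The function name `find_changes` and the sample values are made up for this example.

```shell
# find_changes: hypothetical helper name; reads one value per line on stdin.
# For every line after the first, print the line number and the old and new
# values whenever the current line differs from the previous one.
find_changes() {
    awk 'NR > 1 && $0 != prev { print NR ": " prev " -> " $0 } { prev = $0 }'
}

# Invented sample list
printf '%s\n' A A B B B C A | find_changes
```

With the sample list this prints `3: A -> B`, `6: B -> C` and `7: C -> A`.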
Articles by Bob Mesibov
FASTA is a plain-text file format for DNA sequences, in which the sequences are often wrapped to a fixed line length. This post explains three Linux command-line methods for joining the sequence lines end-to-end.
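One possible AWK approach is sketched here; it may or may not be one of the methods in the post. It assumes header lines start with `>` and that every other line is sequence data. The function name `unwrap_fasta` and the sample records are invented.

```shell
# unwrap_fasta: hypothetical helper name; reads FASTA on stdin.
# Header lines (starting with ">") are printed as they are; the sequence
# lines that follow are concatenated and printed as one line.
unwrap_fasta() {
    awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print seq }'
}

# Invented sample: two records wrapped at 4 characters
printf '%s\n' '>seq1' ACGT ACGT AC '>seq2' GGTT GG | unwrap_fasta
```

The sample comes out as four lines: `>seq1`, `ACGTACGTAC`, `>seq2`, `GGTTGG`.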
The trick is to make the documentation available on the CLI. Also, how to get a “yes” or “no” answer from grep.
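For the grep part, a minimal sketch (assuming the goal is just to know whether a pattern occurs at all) is to rely on grep's exit status with the `-q` option, which suppresses output. The wrapper name `has_match` is made up for this example.

```shell
# has_match: hypothetical helper name; reads text on stdin.
# grep -q prints nothing and exits with status 0 if the pattern is found,
# so the exit status can be turned into a "yes" or "no".
has_match() {
    if grep -q -e "$1"; then
        echo yes
    else
        echo no
    fi
}

printf '%s\n' apple banana | has_match banana
printf '%s\n' apple banana | has_match cherry
```

The first call prints `yes` and the second prints `no`.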
How to use gnuplot to put data points on a basemap.
By mistake, I put two identical data items in the same field as a tandem repeat. Here’s how either sed or AWK could be used to split the repeat and put its parts into two different fields.
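A hedged AWK sketch of the splitting step follows. It assumes a tab-separated table, assumes the repeat is an exact doubling (like `HobartHobart`), and puts the second copy in a new field right after the first; the post's actual data layout may differ. The function name `split_repeat` and the sample rows are invented. Note that any value that happens to be an exact doubling, such as `1212`, would also be split.

```shell
# split_repeat: hypothetical helper name; reads a tab-separated table on stdin.
# $1 is the number of the field to check. If that field consists of two
# identical halves, it is replaced by the two halves separated by a tab.
split_repeat() {
    awk -F'\t' -v OFS='\t' -v col="$1" '{
        n = length($col)
        if (n > 0 && n % 2 == 0) {
            half = substr($col, 1, n / 2)
            if (half == substr($col, n / 2 + 1))
                $col = half OFS half
        }
        print
    }'
}

# Invented sample rows
printf 'id1\tHobartHobart\tTas\nid2\tLaunceston\tTas\n' | split_repeat 2
```

In the sample, the first row gains a field (`id1`, `Hobart`, `Hobart`, `Tas`) and the second row is unchanged.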
A “fileA-in-fileB” search is a search of fileB for lines that match any of the lines in fileA. I thought I’d post what happens when you systematically vary a fileA-in-fileB search. TL;DR: AWK wins.
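For readers who haven't met this kind of search, here is a minimal AWK sketch of one variant: whole-line, exact matching. It is an illustration only, not the set of commands compared in the post. The function name `a_in_b` and the file contents are invented, and the idiom assumes fileA is not empty.

```shell
# a_in_b: hypothetical helper name; $1 is fileA, $2 is fileB.
# While reading the first file (NR == FNR), store each line as an array key.
# While reading the second file, print lines that are among the stored keys.
a_in_b() {
    awk 'NR == FNR { wanted[$0]; next } $0 in wanted' "$1" "$2"
}

# Invented sample files in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' cat dog > "$tmp/fileA"
printf '%s\n' bird dog cat fish dog > "$tmp/fileB"
a_in_b "$tmp/fileA" "$tmp/fileB"
rm -r "$tmp"
```

The sample prints `dog`, `cat` and `dog`, in fileB order.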
I call them NITS, which is short for Nothing Interesting To Say. They’re the filler items that appear in spreadsheets and databases when the person entering the data has no information for a particular field.
How to check for format and content errors in YYYY-MM-DD fields with AWK.
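The kind of check involved can be sketched as below. This is an assumption about what "format and content errors" covers: the format test looks for exactly four digits, a hyphen, two digits, a hyphen and two digits, and the content test looks for impossible months and days, with leap years taken into account. The function name `check_dates` and the sample dates are invented.

```shell
# check_dates: hypothetical helper name; reads one date per line on stdin
# and reports line numbers with a format error or an impossible date.
check_dates() {
    awk '{
        if ($0 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) {
            print NR ": bad format: " $0
            next
        }
        split($0, p, "-")
        y = p[1] + 0; m = p[2] + 0; d = p[3] + 0
        # Days in month: February depends on the leap-year rule
        if (m == 2)
            dim = ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0) ? 29 : 28
        else if (m == 4 || m == 6 || m == 9 || m == 11)
            dim = 30
        else
            dim = 31
        if (m < 1 || m > 12 || d < 1 || d > dim)
            print NR ": impossible date: " $0
    }'
}

# Invented sample dates
printf '%s\n' 2023-02-28 2023-02-29 2024-02-29 2023-13-01 2023-4-05 | check_dates
```

For the sample this reports `2: impossible date: 2023-02-29`, `4: impossible date: 2023-13-01` and `5: bad format: 2023-4-05`.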
Disagreements between fields in a database can be tricky to diagnose and even harder to detect.
Approximate or “fuzzy” matching on the command line is easily done with tre-agrep. Here’s a practical example.