May 20, 2016
Adding support for bookmark files has prompted me to revisit how Blot generates title metadata from a file name.
Why does Blot try to do this? My guiding principle when designing Blot was this: Blot should be able to turn a single directory containing a single text file containing a single character into a blog. To meet this goal, Blot must generate the metadata that other blogging platforms require you to specify (e.g. the publish date, the URL).
Blot first tries to generate a title from the file’s contents. Ideally, the file contains an <h1> tag on its first line. However, this is not always the case, so Blot uses a variety of approaches to extract a title. Those methods are for another blog post.
If Blot fails to find a good title inside the file, it falls back to the file’s name. Here’s a list of unit tests from Blot’s source which demonstrate how the title generator works:
// Preserve case
is('/fOo.txt', 'fOo');
// Strip dashes and underscores from the start and end,
// and replace the remaining underscores with spaces
is('/-f_o_o-.txt', 'f o o');
// Only replace dashes with spaces
// when file name has no spaces.
is('/2-1 Match report.txt', '2-1 Match report');
is('/2-1_Match_report.txt', '2-1 Match report');
// Work without path
is('test.md', 'test');
// Work with multiple dots
is('preview.html.txt', 'preview.html');
// Extract date
is('2016/1/2 Bar.txt', 'Bar');
is('2016-1/2 Bar.txt', 'Bar');
is('/2016-1 2 Bar.txt', 'Bar');
// Ignore bad date
is('2-12-2000 Bar.txt', '2-12-2000 Bar');
is('/2000/34-23 Bar.txt', '34-23 Bar');
is('/11-1_Bar.txt', '11-1 Bar');
// Ensure title exists
is('___.jpg', '___');
is('---.jpg', '---');
is('-_-.jpg', '-_-');
is('.jpg', '.jpg');
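
To make the rules above concrete, here is a minimal sketch of a generator that satisfies the tests listed. It is not Blot’s actual implementation: the names titleFromPath and is, the order of the steps, and the date pattern (a four-digit year, then a month from 1 to 12 and a day from 1 to 31, separated by a slash, dash, underscore or space) are all assumptions reconstructed from the test cases alone.

```javascript
// Hypothetical sketch, not Blot's source: derive a title from a file path.
function titleFromPath(path) {
  // Fallback: the file name without its extension, or the whole
  // file name if nothing else is left (e.g. '.jpg').
  var base = path.slice(path.lastIndexOf('/') + 1);
  var stem = base.replace(/\.[^.]+$/, '');
  var fallback = stem || base;

  // Drop the final extension only, so 'preview.html.txt' keeps '.html'.
  var rest = path.replace(/\.[^./]+$/, '');

  // Assumed date pattern: YYYY, M, D at the start of the path,
  // each followed by a slash, dash, underscore or space.
  var date = rest.match(/^\/?(\d{4})[\/\-_ ](\d{1,2})[\/\-_ ](\d{1,2})[\/\-_ ]+/);

  if (date) {
    var month = Number(date[2]);
    var day = Number(date[3]);
    // Ignore implausible dates such as '2000/34-23'.
    if (month >= 1 && month <= 12 && day >= 1 && day <= 31) {
      rest = rest.slice(date[0].length);
    }
  }

  // Keep only the last path segment.
  var name = rest.slice(rest.lastIndexOf('/') + 1);

  // Strip dashes and underscores from both ends.
  name = name.replace(/^[-_]+|[-_]+$/g, '');

  // Underscores always become spaces.
  name = name.replace(/_/g, ' ');

  // Dashes become spaces only if the name still has no spaces.
  if (name.indexOf(' ') === -1) {
    name = name.replace(/-/g, ' ');
  }

  name = name.trim();

  // Ensure a title always exists.
  return name || fallback;
}

// Hypothetical test helper mirroring the is(path, expected) calls above.
function is(path, expected) {
  var actual = titleFromPath(path);
  if (actual !== expected) {
    throw new Error(path + ': expected "' + expected + '" but got "' + actual + '"');
  }
}

is('/2-1_Match_report.txt', '2-1 Match report');
is('2016/1/2 Bar.txt', 'Bar');
is('-_-.jpg', '-_-');
```

The fallback is computed before any cleaning happens, so names made entirely of dashes and underscores survive intact instead of producing an empty title.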