G+ will disappear at the beginning of April. (Much like the British economy.) Despite its impending doom, I’ve been posting images and links, because it’s easy. As stated previously, I have an archive of (most of) my G+ posts but reposting them all will be a non-trivial amount of work. Some of these one-offs don’t merit oblivion, though. So this is the first in an irregular series of repost … posts.
First up, some computer-related bits. Most are a little technical, so be warned.
What causes Ruby memory bloat?
https://www.joyfulbikeshedding.com/blog/2019-03-14-what-causes-ruby-memory-bloat.html
The author examines why his Ruby application consumes ridiculous amounts of memory.
The full article is worth reading, but to summarize:
The reason he uncovers isn’t a problem with the Ruby runtime but with glibc’s `malloc()`/`free()` implementation and Linux heap management.
Most people “solve” the problem with jemalloc or the “magic” `MALLOC_ARENA_MAX=2` environment setting.
The author’s painstaking digging uncovered a better solution: the obscure and badly documented function `malloc_trim()`.
Apparently calling `malloc_trim()` during Ruby’s garbage collection routines magically frees memory back to Linux.
It reduces memory consumption nearly as much as `MALLOC_ARENA_MAX`, and according to one set of performance tests is 10% faster.
(My hypothesis: the default setting of `MALLOC_ARENA_MAX` leads to better performance despite consuming more memory. Using `malloc_trim()` retains the performance boost but conserves memory elsewhere. The article notes that `malloc_trim()` appears to work in linear time, but calling it frequently spreads that time out, hiding it among I/O waits and other delays.)
This discovery has implications for all Linux software that uses a lot of heap memory. (And possibly for applications on other systems that use the GNU C runtime library.)
Good News! You Will Be Paying Money Forever And Never Own Anything
https://www.pointandclickbait.com/2019/03/except-noobs-online-amirite/
Ever notice how companies want to sell us digital “content” and online subscription services? comiXology, Google Docs, Hulu, Netflix streaming, etc.?
It’s called Application As A Service (AaaS), and it’s the latest ploy of Big Tech and Big Media to make us “rent” everything. They tried it with hardware licensing and non-standard parts (hello Apple). They tried it with physical media copy protection schemes (e.g. CSS for DVDs). They tried it with software that depends on continuous Internet access to verify your right to use it. Now, they’re putting everything on “The Cloud” and letting you buy or rent things you can’t back up. So if the vendor decides you shouldn’t have something you bought (hello Amazon), hikes prices (hello Netflix), or goes bust (I’m looking at you, Nook), you’re screwed.
Thus endeth the rant.
Extended Lua Table Notation
Half a year ago I posted this:
While getting up to speed on Hugo I discovered TOML (Tom’s Obvious, Minimal Language), which is apparently a response to configuration files in YAML (originally “Yet Another Markup Language”, now “YAML Ain’t Markup Language”) and JSON (JavaScript Object Notation). Is it me, or is TOML neither obvious nor minimal? It sorta resembles an INI file, except when it doesn’t. It supports both INI-like syntax and nested maps (`{ key = value, … }`); only one is necessary, and the latter is clearer. The GitHub project includes the language spec and a few test examples, including one purposefully tortuous one. Why open yourself up to a parsing nightmare?

In response, I’ll present an idea I’ve long thought about, which I’ll now call LTON (Lua Table Object Notation), which like JSON is a strict subset of an embedded language. Unlike JSON there aren’t double quotes everywhere. Observe:
```lua
foo = "bar"; baz = "blah"; list = { foo="foobar", 23, m=0, false }
```
I’m making the `{}` around the whole document optional since nearly everything is a table in Lua. Literally: global variables are just entries in a table called `_G`, and the key `_G` points back to that table.

Lua represents lists/arrays/whatever as tables with the keys 1, 2, 3, etc. (Lua’s table implementation stores these in an internal array.) The constructor syntax interprets a value not preceded by “`key =`” as the next value in the sequence.

If you want a non-identifier or non-string as a key, you’d use the syntax “`['not an identifier'] = 0.0`”.

For backward compatibility, “`,`” and “`;`” are interchangeable as separators between table elements. (Previous versions required a “`;`” to separate map-like elements from list-like elements.)

I’m debating whether this notional notation should also support Lua’s “long strings”, in which all text between `[[` and `]]` becomes a string.

If I do this (and I likely won’t, because I’m me) I’ll definitely support Lua-style comments (from `--` to the end of the line) and long comments (`--[[` … `]]`).

Yes, irritation with one config file format spawns another one, which someone else will take issue with.
It’s the Ciiiiiiircle of Li-i-ife.
(Slight reformatting, editing, and link inlining.)
BTW, I just checked into the “nested tables” thing and I was wrong. An “inline table” must end at the end of the line, and the author recommends using the INI style for more than a few items.
Now I’m puttering around with LTON — er, ELTN (Extended Lua Table Notation) — again.
Here’s a (revised and corrected) syntax, based on Lua 5.3 syntax.
```
document         ::= (stat)*
stat             ::= ';' | var '=' exp | functioncall
var              ::= NAME
exp              ::= constant | functioncall | tableconstructor
constant         ::= 'nil' | 'false' | 'true' | NUMERAL | STRING_LITERAL
functioncall     ::= funcname arg
funcname         ::= NAME
arg              ::= tableconstructor | STRING_LITERAL
tableconstructor ::= '{' ( fieldlist )? '}'
fieldlist        ::= field ( fieldsep field )* ( fieldsep )?
field            ::= key '=' exp | exp
key              ::= '[' constant ']' | NAME
fieldsep         ::= ',' | ';'
```
For reference, here’s the notation I’m using:

- `lowercase` - A parser rule, defined elsewhere in the grammar. `document` is the top-level rule.
- `NAME`, `NUMERAL`, and `STRING_LITERAL` - Lexical symbols described elsewhere in the Lua spec; pretty much what they say.
- `'…'` - A lexical symbol defined as the exact sequence between the single quotes.
- `… | …` - A separator between alternatives, e.g. `key ::= '[' constant ']' | NAME` means that a key may either be a constant between square brackets or a NAME (identifier).
- `( … )?` - Zero or one occurrence.
- `( … )*` - Zero or more occurrences.
The syntax also has the constraints that each `var` can only appear once, and each `key` can only appear once in a `fieldlist`.
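For illustration, here’s a small document that exercises most of the grammar above (the names are made up, and the comments assume the Lua-style comment support discussed earlier):

```lua
-- ELTN sketch: top-level var '=' exp statements, no surrounding {}
title = "example"
count = 23
debug = false
servers = {
  { host = "alpha", port = 8080 },   -- list-like entries get keys 1, 2, ...
  { host = "beta",  port = 8081 };   -- ',' and ';' are interchangeable
  ["not an identifier"] = true,      -- bracketed constant as a key
}
created = date "2019-03-20"          -- functioncall: funcname STRING_LITERAL
```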
Also, `functioncall` wasn’t in the original proposal, and I’m still debating it. The host program defines all functions and provides them to the parser. It’s intended as an extension mechanism to allow the following:
- Converting a string or table to a more convenient internal representation, e.g. date/time strings to datetime objects.
- Implementing references, especially cyclic and forward references. (This might require extra info in the function prototype.)
- Configuring multiple instances of the same or similar things, like an old-school Spring XML file but without the XML.
- Interpreting the same data file in different ways, depending on the application. An example in an early version of Programming in Lua used a series of `entry { … }` calls as a data file which, depending on the definition of `entry` in the program loading the file, could upload the records to a database, index the data, count the occurrences of specific values, or something else, maybe all at once. The file of `entry` calls was a valid Lua program. As we’ve seen with AJAX, though, executing arbitrary JavaScript is a huge security hole, which is why we have JSON.
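The entry-as-function pattern is easy to sketch in Lua itself. (The field names below are made up for illustration; the actual Programming in Lua example differs.)

```lua
-- A hypothetical data file (data.lua) would contain nothing but calls:
--   entry{ name = "Ada Lovelace", year = 1843 }
--   entry{ name = "Alan Turing",  year = 1936 }
-- The loading program decides what `entry` means before running it.

local count = 0
function entry(t)          -- this program's choice: count and print records
  count = count + 1
  print(t.name, t.year)
end

-- Simulate loading the data file; a real program would call dofile("data.lua").
entry{ name = "Ada Lovelace", year = 1843 }
entry{ name = "Alan Turing",  year = 1936 }
print("entries:", count)
```

Swap in a different `entry` function — one that inserts into a database, say — and the same data file serves a completely different purpose.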
Complaints Department
Otherwise the G+ articles in the Software and Such collection seem to be mainly complaints:
lua-cjson
Slightly annoyed that lua-cjson breaks under Lua 5.3 because the C API renamed a function. Whether I’m annoyed with the module maintainer who didn’t catch the change, the Lua developers for not adding backward compatibility, or myself for not seeing some obvious fix apart from editing the code by hand, I’m not sure.
Oh, well, not that I’m coding anything important.
P.S. Is it just me or does Lua need an equivalent to Ruby’s `bundler`? I.e. a way to pull all of an application’s dependencies into a local directory, which in Lua’s case probably means compiling and maybe statically linking a fair amount of C code. luarocks already provides an equivalent to Ruby gems …
(links added)
Clear All Cookies
It amazes me how many Web site problems are actually solved by “clear all your cookies”.
For example, I meant to change my e-mail address on a site but through a complex sequence of events created a whole second account. The support staff kindly merged the two, with my new address. Yet when I looked at past orders I didn’t see the ones from the old account. The solution? “Clear your cookies”. Log back in, and there they were.
Jumping Orlanth on a pogo stick. What’s in these cookies? Do they just cache whole freaking database query results on my machine?
How To Write Code That Will Last Forever
https://levelup.gitconnected.com/how-to-write-code-that-will-last-forever-f8c4b1c0c867
The title grabbed my attention because after eighteen years writing code I realized it never lasts forever. Unless, as the tag line says, “everyone is afraid to touch it” … in which case someone is (or should be) planning to replace it completely.
Like many Medium articles, it’s a thinly disguised advert for the author’s product and/or obsession. But the “general framework” strategy (a.k.a. “mechanism not policy”) has one fatal flaw. To quote a man older and wiser than me whose name I don’t remember, “for something to be reusable it must first be usable”. Too many projects, including some I’ve been involved with, have focused so much on being “general” and “flexible” they fail to solve the original problem, or else do it badly. And then they fail, and this Grand Architecture For The Twenty-Second Century goes into the bit bucket. (Right next to the Big Ball of Mud that actually did something for a while.)
HTML, the author’s example, began as a way to solve a specific problem – distribute tech papers around CERN – and evolved greatly over time. Parsers for Berners-Lee’s HTML would crash on today’s HTML. Hell, for most of Web history two different browsers written at the same time would render the same HTML differently. (And let’s not get started on the JavaScript DOM.) HTML evolved through a complex and sometimes contentious standards process. It wasn’t “one and done”. People remember Tim Berners-Lee as the creator of the Web (except on Twitter), but his code is only a historical curiosity. What he built and what we now have are worlds apart.
Data lasts forever, or at least until people need it and can use it. Programs are Potemkin villages, erected just long enough to serve the needs of their patrons and then torn down to make way for something grander. (Not necessarily better.)
Coding For A Serverless Future
https://headmelted.com/coding-for-a-serverless-future-f34ae86c6c2
Call me an old curmudgeon, but as interesting as “serverless architectures” are, they have the same flaws that make me leery of “The Cloud”:
- They’re NOT serverless. The servers exist somewhere on the Internet, managed by somebody else. Economies of scale make this cost-effective for all but the largest of organizations. But somebody else – who regards you as just another customer – is managing a core piece of your business. Remember that.
- Your business depends moment-to-moment on somebody else. Heroku, AWS, and the rest may be cheap and reliable now, but they or the big media companies that buy them in the future may decide to jack up prices or blow off smaller customers in favor of big spenders.
- As architecturally advantageous as it may be to decompose operations among server processes, if any of these application servers fail your Web application is crippled. Best case: the Web application fails gracefully and only one or two functions become temporarily unavailable. Worst case (when foresight and error handling fail): your entire application not only breaks but loses consumer information … and trust.
- As with ASPs, JSPs, and similar mixes of HTML and code, teams under time pressure may put “business logic” in with presentation and error-handling code. If your Web application is more than just one page, you’ll have to factor all that into a single JavaScript library that your developers use religiously … or content yourself with cut-and-paste and inconsistencies among pages.
The first law of distributed computing is DON’T. In our highly connected world that’s virtually (hah!) impossible to do. In a time where everybody must cut costs to the bone just to survive, Serverless Architecture provides a way to deliver distributed services without huge teams and a huge investment in hardware, networks, and system administration. But, like all advances in computer science, it has trade-offs that breathless articles like this one gloss over.
Serverless architectures still require design, testing, and clean coding practices. In fact, as potential failure modes increase the need is even greater.
All AaaS (Application as a Service) architectures make patching bugs in the field as simple as uploading the patched code onto the server(s). But that means maintaining discipline in fixing and testing bugs and planning the roll-out so users don’t encounter a random hybrid of old and new elements. With servers spread across multiple service providers the risk becomes even greater.
Nothing removes the essential complexity of a problem, by definition. Even if serverless architectures remove the problems of managing the required infrastructure, they introduce additional, accidental complexity of their own.
The older I get, the more I realize that computer science doesn’t progress in a straight line but as a spiral: the same ideas over and over with only incremental advances in capability. The hype around serverless architectures resembles the excitement over distributed client-server architectures in the 1980s and early 1990s and then the proliferation of Web applications in the dot-com era. First we had CORBA, then gradually re-invented it with angle brackets as SOAP. I don’t know what new standards and practices serverless architectures will require to improve stability, availability, security, interoperability, versioning, and so forth … but those problems and others will emerge in a few years. And a decade or two after that, the industry will turn its back on serverless architectures for the next bright idea.
Serverless architectures aren’t the mythical silver bullet, just another tool. Always use the right tool for the job. And never let vendors sell you a big shiny clumsy tool when a humbler hand-made tool will do.
Postscript: I don’t think I really understood how Serverless Architectures worked when I wrote this. Now it reads like a mashup of “Old Man Yells At Cloud” and Roseanne Rosannadanna. But I still stand by my more general comments about software development.
… and the rest
I’m skipping some other rants about WebP files, Chrome memory usage, the Intel Management Engine, people who don’t unit test their software, people who only unit test their software, Agile Software Development being “sold” as a silver bullet, not using Agile techniques sensibly, the Ruby Markdown parser kramdown (which is actually OK), and the SEI CERT C Coding Standard (because it’s hella long).
You can thank me later.