Biggest shell programs
https://github.com/oils-for-unix/oils/wiki/The-Biggest-Shell-Programs-in-the-World

The first really big one I wrote was the ~7000-line installer for the Entrust CA and directory, which ran on, well, all Unixes at that time. It didn't initially, of course, but it grew with customer demand.
The installation itself wasn't especially complicated, but upgrades were, a little, and this was back when every utility on every Unix had slight variations.
Much of the script was figuring out and managing those differences, much was error detection and recovery and rollback, some was a very primitive form of package and dependency management....
DEC's Unix (the other one, not Ultrix) was the most baffling. It took me days to realize that all command line utilities truncated their output at column width. Every single one. Over 30 years later and that one still stands out.
Every release of HP-UX had breaking changes, and we covered 6.5 to 11, IIRC. I barely remember Ultrix or the Novell one or NeXT, or Sequent. I do remember AIX as being weird, but I don't remember why. And of course even Sun's three/four OSes had their differences (SunOS pre-4.1.3, 4.1.3, Solaris pre-2, and 2+), but they had great FMs. The best.
It's 6224 lines, so far.
There is a top-level binary with sub-functions, sort of like how

git [git options] <git action> [action options]
or systemctl [etc.]

work. There is a subcommand to add a new subcommand, which creates the necessary libraries and pre-populates function definitions from a template; the template includes short and long usage functions, so that
cbap -h
or cbap pipeline -h
give useful and reasonable advice.

There are subcommands for manipulating base images, components (which are images with specific properties for use as containers in the pipelines), and pipelines themselves. A LOT of code is for testing, to make sure that the component and pipeline definitions are correctly formatted. (Pipelines are specified in something-almost-TOML, so there is code to parse TOML, convert sections to arrays, etc., while components are specified as simple key=value files, so there is code to parse those, extract LHS and RHS, perform schema validation, etc.)
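As a very rough sketch of that shape (nothing here is the real cbap code; the subcommand and function names are invented), a dispatcher with short/long usage and a per-subcommand -h might look like:

    #!/usr/bin/env bash
    # Sketch of a subcommand dispatcher with short and long usage functions;
    # 'cbap' and every name below are illustrative, not the actual tool.
    set -euo pipefail

    usage_short() { echo "usage: cbap <subcommand> [options]"; }
    usage_long() {
        usage_short
        echo "subcommands: image component pipeline newcmd"
        echo "run 'cbap <subcommand> -h' for details"
    }

    cmd_pipeline() {
        if [[ "${1:-}" == "-h" ]]; then
            echo "usage: cbap pipeline <create|check|delete> [options]"
            return
        fi
        echo "pipeline: $*"            # real work would go here
    }

    main() {
        case "${1:-}" in
            ""|-h)    usage_short ;;
            --help)   usage_long ;;
            pipeline) shift; cmd_pipeline "$@" ;;
            *)        echo "unknown subcommand: $1" >&2; usage_long >&2; exit 1 ;;
        esac
    }

    main "$@"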
Since pipeline components can share properties, there is code to find common properties in var and etc files, specify component properties, etc.
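For the simple key=value component files mentioned above, a minimal sketch of the parse-and-validate step could look like this (the schema table, keys, and file layout are assumptions, not the real format):

    # Sketch: parse a key=value component file and check it against a
    # simple required/optional schema. Keys and layout are invented.
    declare -A schema=( [image]=required [cpu]=optional [memory]=optional )
    declare -A props=()

    parse_component() {
        local file=$1 line key value
        while IFS= read -r line; do
            [[ -z "$line" || "$line" == "#"* ]] && continue   # skip blanks/comments
            key=${line%%=*}                                    # LHS
            value=${line#*=}                                   # RHS
            [[ -n "${schema[$key]:-}" ]] || { echo "unknown key: $key" >&2; return 1; }
            props[$key]=$value
        done < "$file"
        for key in "${!schema[@]}"; do                         # required keys present?
            if [[ "${schema[$key]}" == required && -z "${props[$key]:-}" ]]; then
                echo "missing required key: $key" >&2
                return 1
            fi
        done
        return 0
    }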
There are a lot of user and group and directory and FIFO manipulation functions tailored to the security requirements: when a pipeline is set up, users and groups and SELinux types and MCS categories are generated and applied, then mapped into the service files that start the components (so there is a lot of systemd manipulation as well).
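A minimal sketch of the kind of plumbing that implies, with the unit layout, paths, and contexts invented (SELinuxContext= is the systemd.exec directive, and chcon -l sets the MCS range):

    # Sketch only: write a per-component unit with user, group, and
    # SELinux/MCS context, plus a FIFO for inter-component IPC. The
    # container_t context and all paths are assumptions.
    write_component_unit() {
        local comp=$1 user=$2 group=$3 mcs=$4    # e.g. mcs="s0:c101,c102"
        {
            printf '[Unit]\nDescription=Pipeline component %s\n\n' "$comp"
            printf '[Service]\nUser=%s\nGroup=%s\n' "$user" "$group"
            printf 'SELinuxContext=system_u:system_r:container_t:%s\n' "$mcs"
            printf 'ExecStart=/usr/local/bin/run-%s\n\n' "$comp"
            printf '[Install]\nWantedBy=multi-user.target\n'
        } > "/etc/systemd/system/${comp}.service"
        systemctl daemon-reload

        mkfifo -m 0660 "/run/pipeline/${comp}.fifo"            # DAC
        chown "${user}:${group}" "/run/pipeline/${comp}.fifo"
        chcon -l "$mcs" "/run/pipeline/${comp}.fifo"           # MAC (MCS range)
    }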
Probably the single biggest set of calls is the functions that get/set component properties (which are really container properties) and allow us to use data-driven container definitions, with each property having a get function, a validation function, and an inline (in a pipeline) version, for maximum flexibility.
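A minimal sketch of that data-driven shape, with the property name, default, and validation rule invented for illustration:

    # Sketch: each property gets a getter and a validator, dispatched by
    # name. 'memory' and its rules are invented, not from the real tool.
    declare -A props=( [memory]=1G )

    get_prop_memory()      { echo "${props[memory]:-512M}"; }
    validate_prop_memory() { [[ $1 =~ ^[0-9]+[MG]$ ]]; }

    get_prop()      { "get_prop_$1"; }
    validate_prop() { "validate_prop_$1" "$2"; }

    mem=$(get_prop memory)
    validate_prop memory "$mem" || echo "bad memory value: $mem" >&2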
Finally, there is code that uses a lot of bash references to set variables either from files, the environment, or the command line, so that we can test rapidly.
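A minimal sketch of that pattern using a nameref (declare -n / local -n); the precedence order (command line, then environment, then file) and the file format are my assumptions:

    # Sketch: fill a variable from the command line, the environment, or a
    # key=value defaults file, via a nameref.
    set_from_sources() {
        local -n target=$1                       # nameref to the variable to set
        local name=$2 cli_value=${3:-} file=${4:-defaults.conf}
        if [[ -n $cli_value ]]; then
            target=$cli_value
        elif [[ -n ${!name:-} ]]; then
            target=${!name}                      # indirect: the environment variable
        elif [[ -r $file ]]; then
            target=$(awk -F= -v k="$name" '$1 == k { print $2; exit }' "$file")
        fi
    }

    # e.g.: set_from_sources image_tag IMAGE_TAG "${1:-}" ./defaults.conf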
It also supports four levels of user: maintainers (people who work on the code itself), developers (people who develop component definitions), integrators (people who build pipelines from components), and operators (people who install pipelines), with the ability to copy and package itself for export to users at any of those levels (there is a lot of data-driven, limited recursive stuff happening therein).
Since target systems can be any Linux, it uses makeself to package and extract itself.
For example, an integrator can create a pipeline definition, which will produce a makeself file that, when run on the target system, will create all users, groups, directories, FIFOs (the inter-component IPC), apply DAC and MAC, create systemd files, copy images to each user, and launch the pipeline - with a delete option to undo all of that.
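For reference, makeself's calling convention is makeself.sh [options] archive_dir file_name label startup_script; the paths and installer script below are invented:

    # Sketch of the packaging step; only the makeself.sh convention is real,
    # the directory layout and install.sh are assumptions.
    makeself.sh --gzip ./build/pipeline-pkg pipeline-installer.run \
        "Pipeline installer" ./install.sh

    # On the target system:
    #   ./pipeline-installer.run             # extract and run install.sh
    #   ./pipeline-installer.run -- delete   # args after -- go to install.sh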
There is some seccomp in there as well, but we've paused that as we need to find the right balance between allow- and deny-listing.
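If the seccomp policy were applied through systemd (my assumption, given the systemd manipulation above), SystemCallFilter= from systemd.exec can express either style in a unit drop-in:

    [Service]
    # Allow-list: permit only the @system-service syscall group.
    SystemCallFilter=@system-service

    # Deny-list alternative: permit everything except these groups.
    #SystemCallFilter=~@privileged @resources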
(Yes, I use shellcheck. Religiously. :->)
If it was up, it would run a speed check and record the result for the IP the VPN was using, then check how that speed compared to the average, with a standard deviation and z-score. It would then calculate how long it should wait before it recycled the VPN client. Slow VPN endpoints would cycle quicker, faster ones would wait longer to cycle. Speeds outside a standard deviation or so would trigger a quicker check than the last delta; speeds within 1 Z would expand the delta before it checked again.
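A rough sketch of that kind of z-score-driven backoff, assuming speedtest-cli as the measurement tool; the history file, thresholds, and scaling factors are invented:

    # Sketch: measure, append to per-IP history, compute mean/sd/z with awk,
    # then shrink or stretch the recheck delay.
    vpn_ip=$1                          # endpoint currently in use
    delay=${2:-600}                    # current wait, in seconds

    speed=$(speedtest-cli --simple | awk '/Download/ { print $2 }')   # Mbit/s
    echo "$speed" >> "speeds.$vpn_ip"

    read -r mean sd z < <(awk -v s="$speed" '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            mean = sum / n
            v = sumsq / n - mean * mean
            sd = (v > 0 ? sqrt(v) : 0)
            print mean, sd, (sd > 0 ? (s - mean) / sd : 0)
        }' "speeds.$vpn_ip")

    # Outside about one standard deviation: recheck sooner; within: stretch.
    if awk -v z="$z" 'BEGIN { exit !(z < -1 || z > 1) }'; then
        delay=$(( delay / 2 ))
    else
        delay=$(( delay * 3 / 2 ))
    fi
    sleep "$delay"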
Another one about that size would, based on the current time, scrape the local weather and sunup/sundown times for my lat/long, and determine how long to wait before turning on an outdoor hose, and for how long to run it, via X10 and a switch on the laptop that used a serial port to hook into the X10 devices. The hose was attached to a sprinkler on my roof which would spray down the roof to cool it off. Hotter (and sunnier) weather would run longer and wait shorter, and vice versa. I live in the US South, where shedding those BTUs via evaporation did make a difference in my air conditioning power use.
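And a rough sketch of the scheduling half of that idea; the weather source, thresholds, and the x10_switch stub (standing in for the serial X10 controller) are all invented:

    # Sketch: map the current temperature to a wait/run schedule, then
    # drive the X10 switch. Source, thresholds, and the stub are invented.
    x10_switch() { echo "x10 $1"; }    # stub for the real serial X10 call

    temp_f=$(curl -s 'https://wttr.in/?format=%t' | tr -dc '0-9')
    temp_f=${temp_f:-75}               # fall back if the lookup fails

    if   (( temp_f >= 95 )); then wait_min=20;  run_min=15
    elif (( temp_f >= 85 )); then wait_min=45;  run_min=10
    else                          wait_min=120; run_min=5
    fi

    sleep $(( wait_min * 60 ))
    x10_switch on
    sleep $(( run_min * 60 ))
    x10_switch off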
27k lines/24k loc
It began:
# Yeah yeah I know