author | Will Woods <wwoods@redhat.com> | 2016-01-15 15:58:00 -0500 |
---|---|---|
committer | Will Woods <wwoods@redhat.com> | 2016-01-15 15:58:00 -0500 |
commit | defc5dc9ff436b8a996c3f4f7ad00b358862cc36 (patch) | |
tree | b8b8ff9e360d8bbdb0880d70229292f97979e0ab | |
download | weld-docs-defc5dc9ff436b8a996c3f4f7ad00b358862cc36.tar.gz weld-docs-defc5dc9ff436b8a996c3f4f7ad00b358862cc36.tar.xz weld-docs-defc5dc9ff436b8a996c3f4f7ad00b358862cc36.zip |
initial commit
The initial brain-dump / design document thing.
-rw-r--r-- | weld-design.md | 272 |
1 files changed, 272 insertions, 0 deletions
diff --git a/weld-design.md b/weld-design.md
new file mode 100644
index 0000000..658e38c
--- /dev/null
+++ b/weld-design.md
@@ -0,0 +1,272 @@
+# WELD MASTER DESIGN DOCUMENT: SUPER-EARLY DRAFT v0.2
+
+This is an experimental design for a Linux distribution.
+For the moment I'm calling it `weld`, for `W`ill's `E`xperimental `L`inux
+`D`istribution.
+
+Send questions/comments/suggestions to <wwoods@redhat.com>.
+
+_Will Woods, Wed 19 Aug 2015_
+
+## Terms used in this document
+
+### Objects: code, binaries, images, etc.
+
+* _Package_: a single upstream project, including branches (stable, unstable,
+  development, etc.)
+  * ex: `bash`, `glibc`
+* _Source Release_: a single moment in a single branch of a Package's sources.
+  * ex: `bash-4.0.tar.gz`, a git tag
+* _Build_: artifacts produced by building a given Source Release
+  * ex: binary RPM, `-doc` subpackages, `-devel` subpackages
+* _Layer_: a logical set of Packages that provide a certain API/ABI.
+  * ex: comps group (kinda), plus some API/ABI guarantees and definitions
+* _Image_: a set of built Layers, plus whatever metadata/modifications are
+  needed to make that image runnable in some context
+  * ex: EC2 images, `boot.iso`, Docker container images, etc.
+* _Build Environment_: an Image that contains everything needed to Build a
+  given Source Release.
+  * ex: `mock` chroots
+* _System_: a unique Image corresponding to a single logical machine.
+  This might be a generic Image with unique system-specific configuration
+  (e.g. host name, MAC address) overlaid on top, or a fully custom Image.
+  * ex: basically any virtual / contained / bare-metal system
+
+### People: users, audiences, roles
+
+* _Developers_: Write code, push to upstream source repo. Tag releases.
+* _Packagers_: Integrate upstream source into the distribution.
+  Add / maintain Dependencies and other metadata and enforce Distributor
+  policy.
+  Decide when to pull/tag upstream changes/releases. Sometimes also Developers.
+* _Release Engineers_: Compose and distribute Builds, Images, and other Objects.
+* _QA_: Responsible for developing and running integration tests and functional
+  tests.
+  (Generally *not* responsible for unit tests; those are the Developer's
+  responsibility.)
+* _Distributors_: Maintain the distribution as a whole; decide the contents of
+  the Layers/Images/Products, set policy about file names and system
+  capabilities.
+  (ex: Fedora, RHEL PM, corporate deployers)
+* _Sysadmins_: deploy Images to create Systems. Need to be able to apply
+  hotfixes, or at least identify which deployments have problems.
+  (Also known as "users".)
+* _ISVs_: Basically developer + packager; they want to be able to write their
+  code and provide it in a format that Sysadmins can apply to their Systems.
+* _Customers_: The people who consume the Platform and Products we make.
+  Mostly Sysadmins, Distributors, and ISVs.
+
+### Tasks: what do people want to do with these objects?
+
+* _Task_: Something a User is interested in doing with some Object or Objects:
+  * Sysadmin: run binaries
+  * Packager/Release Engineer: build binaries
+  * Release Engineer: compose Images
+  * QA: run Integration Tests on an Image
+  * QA: run a package's Unit Tests
+  * [etc.]
+* _Dependency_: a reference to an Object that is required to be present for
+  a certain Task to be performed.
+* _Environment_: The system environment (set of objects/builds) where a Task
+  takes place.
+  * Derived from the Dependencies of the given Task + Source Release.
+  * The required Environment for each Task will vary wildly between
+    types of Tasks, even within the same Package / Source Release.
+
+## REQUIREMENTS
+
+### Minimum Viable Product requirements:
+
+* _Distributors_: set/apply policy about build output (`%{_docdir}` etc.)
+* _Distributors_: set policy about post-build transformations (RPM `brp-*`)
+* _Distributors_: define what Packages are in each Layer (`comps.xml`)
+* _Packagers_: import new Source Releases of upstream Packages (`fedpkg new-sources`)
+* _Packagers_: apply patches to upstream code (`Patch1:`)
+* _Packagers_: add metadata about build requirements (`BuildRequires:`)
+* _Packagers_: add metadata about runtime requirements (`Requires:`)
+* _Packagers_: add metadata about version differences (`%changelog`, bodhi)
+* _Packagers_: add metadata to mark conflicting Packages (`Conflicts:`)
+* _Packagers_: add other metadata (e.g. crypto export info)
+* _Packagers_: create a local Build from sources (`fedpkg local`)
+* _Packagers_: tag source as ready for release (`fedpkg tag`)
+* _Packagers_: check out the sources for a tagged Source Release (`fedpkg prep`)
+* _Release Engineers_: create a Build Environment for a tagged Source Release (`mock` / Koji)
+* _Release Engineers_: create a new Build inside a fresh Build Environment (`mock` / Koji)
+* _Release Engineers_: build Images from a set of built Packages/Layers (`lorax`, `pungi`)
+* _Release Engineers_: publish Builds/Images
+* _Release Engineers_: sign Builds/Images (`sign_unsigned`, etc.)
+* _Release Engineers_: create + publish metadata about signed Builds/Images (`createrepo`, `mash`)
+* _Release Engineers_: produce source corresponding to any Build (`.src.rpm`)
+* _Release Engineers_: build variant Images with different stacks (SCLs)
+* _Sysadmins_: install Builds/Images to create a unique new System (`anaconda`, `yum install --installroot=...`)
+* _Sysadmins_: determine which Source Releases are in a Build/Image (`rpmdb`)
+* _Sysadmins_: find updated Builds/Images for existing Build/Image (`dnf`)
+* _Sysadmins_: apply a new Build/Image to an existing Image/System (`dnf update`)
+* _Distributors_: define new Layer/Image based on existing ones (`spin-kickstarts`, kinda)
+* _Distributors_: make and publish RPMs for legacy consumers
+
+## WORKFLOW
+
+### Current model: turn the crank
+
+#### Packager
+
+* Make/fetch tarball of upstream source
+* Upload tarball to cache
+* Write/update `.spec`:
+  * Write `%prep` script to unpack sources + apply patches
+  * Write `%build` script to build sources
+  * Write `%install` script to install build artifacts
+  * Modify `%install` to meet distribution policy
+  * Update `%files` list to list installed files
+  * Add `%post`/`%posttrans` scripts if needed by package
+  * Write `%changelog`
+* Add patches if needed:
+  * Commit patch to git
+  * Add `PatchX:` line to `.spec`
+  * Add `%patchX` line to `%prep`
+* Apply `.spec` changes to each release branch
+* Tag new `.spec` for each release
+* Initiate builds for each release
+  * Build process:
+    * Generate Build Environment:
+      * recursively depsolve `BuildRequires`
+      * uncompress + install depsolved packages
+    * `%prep`: unpack tarball + apply patches
+    * `%build`: build source into binaries
+    * `%install`: install binaries inside output directory
+    * gather files listed in `%files` from output directory
+    * create compressed archive of files
+    * repeat for each platform
+* File update requests for each release
+  * Choose one or more Builds
+  * Write update metadata
+
+#### Release Engineers
+
+* Push updates
+  * Update process (`bodhi`)
+    * Tag approved builds
+    * Depsolve approved builds and existing builds again (`mash`)
+    * Sign tagged packages (manual-ish by design)
+    * Make metadata for new builds
+* Build Images for new releases (`pungi`, `lorax`, `livecd-creator`, etc.)
+  * Depsolving, again
+  * Uncompress + extract archives
+  * Run scriptlets for each archive
+  * Run extra scripts to turn output into proper Image
+  * Repeat for each Image
+  * Repeat for each platform
+
+[TODO: ISVs, Distributors, Sysadmins]
+
+## DESIGN PRINCIPLES
+
+### _data, not code_
+
+* Static Analysis is a damn good idea and we should do more of it
+* In other words: _no shell scripts unless **absolutely necessary**_
+* `%files`: distro-wide policy; described/enforced with `udev`-style rules
+  * `FILENAME=="*.so" FILEPATH=="*/lib" ATTR[library]:=1`
+  * `ATTR[library]==1 RUN[posttrans]+="ldconfig"`
+* `%build`: _descriptive_ (not shell scripts!)
+  * `buildtype: autoconf` should be sufficient for most things!
+
+### Tradition isn't enough
+
+* Instead of working around problems, let's design better solutions
+* Be bold, but not foolish
+  * Design solutions _for the people who will use them_.
+  * Do your research. Newer isn't always better for the task.
+
+### You don't have to please everyone
+
+* Make something that works great for you
+* Make it easy for others to adapt to their needs
+* You don't have to change your goals to match someone else's
+
+## GOALS
+
+[FIXME: finish categorizing the list of goal items]
+
+1. Make packaging and release-engineering easier
+  * Git-style workflows everywhere
+    * New package build: `git fetch upstream`, merge, push
+    * New package update: `git tag -s`, push
+    * New (test) compose: edit manifest and push
+    * New release: `git tag -s` and push
+2. Better integration between Packages
+  * make it easy to check out the sources for an entire Layer
+  * package metadata is static data
+  * introspection and better tooling
+  * minimal boilerplate, fewer gnarly shell scripts
+  * importing from upstream should work like `git pull`
+  * tagging source as ready for release should work like `git tag`
+3. Make builds faster and easier
+  * avoid repeated compress/decompress cycles
+  * avoid repeated `configure` checks
+  * simplify Build Environment creation
+  * cache Build Environments
+  * generate Builder Containers for EC2 &c.
+  * put builds into something that de-duplicates them (`ostree`-ish)
+4. Make updates faster and more reliable
+  * Atomic, basically
+5. Enable Distributors and ISVs to easily publish their own stuff
+  * remixing the distro is just a `git clone` away
+  * `git pull` for merging new changes, etc, etc.
+
+* _Release Engineers_: duplicated data inside Builds should not be stored twice (like `git`)
+* _Sysadmins_: duplicated data inside Images should not be stored twice (like `git`)
+* _Build Process_: avoid compressing Builds before publishing (allow for
+  de-duplication + skip repeated compress/uncompress)
+* _Build Process_: don't re-run `configure` for every build
+* _Release Engineers_: creating Build Environments should be fast
+* _Release Engineers_: containerize Build Environments to build in The Cloud
+* _Sysadmins_: updates can be applied atomically
+* _Sysadmins_: updates can be easily rolled back
+* _Sysadmins_: non-unique parts of a System are read-only by default
+* _Distributors_: define Layers by moving per-package metadata files around a
+  git repo (`weld.git`)
+* _Distributors_: modify Layers by cloning/branching `weld.git`
+* _Sysadmins_: update metadata should be small and fast to download
+* _Distributors_: run tests when there are new Source Releases/Layers
+* Continuous Integration testing triggered for each push
+* User-installable Builds/Layers
+* TODO: Per-layer ABI/ABI/Service definitions
+* TODO: Design upgrades into this thing
+* TODO: ISVs target Layers (which have ABI/API guarantees) not
+  individual files/symbols
+
+## HOW DO WE GET THERE
+
+Piece by piece:
+
+* Rejigger dist-git into a Layer-based directory hierarchy
+  * _MAYBE_: each layer is a git repo, `dist-weld` just uses submodules?
+  * Or some other layering technique so ISVs/Distributors can add/replace..
+  * Build Layer (meta-)packages
+  * Make Images by piling up Layers
+  * Simpler metadata: `lang/python`'s 2.7 branch just `Requires: core >= 22.0`
+* Gradually redefine `.spec` to reduce manual work:
+  1. Obsolete bash scripts in `.spec`, section-by-section:
+    * Obsolete `%prep`: use git repo instead of tarball + patches
+    * Obsolete `%build`: define rules that handle common build "styles"
+      (autoconf, cmake, etc)
+    * Obsolete `%install` similarly
+    * Obsolete `%files`: define rules that apply tags to installed files by
+      location or contents
+    * Obsolete `%post`/`%posttrans`: define rules that run scriptlets based on
+      tags applied to files
+  2. Define new file format that can be "compiled" to generate a `.spec`,
+     replace `.spec` files altogether
+* Deduplication / avoid recompression:
+  0. dump build output into a big de-duplicating Content Store
+  0. generate RPMs from Content Store
+  0. Generate Images directly from Content Store
+  0. _SOMEDAY_: Don't bother distributing RPMs; just distribute Content Store
+    * __XXX NOTE__: is this even feasible??
+* Build speed:
+  * Cache results of `configure` and skip running it
+    * We do not need to check for Ultrix 15,000 times.
+
+[FIXME: update for 2016!]
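
The "data, not code" section of the design document in this commit proposes `udev`-style rules that tag installed files and then trigger scriptlets from those tags. The following is only a rough sketch of how the two example rules might be evaluated, not the author's implementation: the tuple encoding of the rules and the names `TAG_RULES`, `SCRIPTLET_RULES`, `tag_file`, and `plan_scriptlets` are hypothetical, and glob matching via Python's `fnmatch` is an assumption about what `==` means in the rule syntax.

```python
from fnmatch import fnmatchcase
from posixpath import basename, dirname

# Hypothetical encoding of the two example rules from the design document:
#   FILENAME=="*.so" FILEPATH=="*/lib" ATTR[library]:=1
#   ATTR[library]==1 RUN[posttrans]+="ldconfig"
TAG_RULES = [
    # (filename glob, directory glob, attribute name, attribute value)
    ("*.so", "*/lib", "library", 1),
]
SCRIPTLET_RULES = [
    # (attribute name, required value, scriptlet phase, command)
    ("library", 1, "posttrans", "ldconfig"),
]


def tag_file(path):
    """Return the attributes that the tagging rules assign to one installed path."""
    attrs = {}
    for name_glob, dir_glob, attr, value in TAG_RULES:
        # Both the file name and its directory must match for the rule to fire.
        if fnmatchcase(basename(path), name_glob) and fnmatchcase(dirname(path), dir_glob):
            attrs[attr] = value
    return attrs


def plan_scriptlets(paths):
    """Collect the commands to run in each phase, listing each command once."""
    plan = {}
    for path in paths:
        attrs = tag_file(path)
        for attr, value, phase, command in SCRIPTLET_RULES:
            if attrs.get(attr) == value:
                commands = plan.setdefault(phase, [])
                if command not in commands:
                    commands.append(command)
    return plan
```

Because the rules are plain data, a tool could list every file that would trigger `ldconfig` without running any shell code, which is the static-analysis benefit the document argues for. Note that under this reading of the rule, a path such as `/usr/lib64/libbar.so` is not tagged, since its directory does not match `*/lib`.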