author Will Woods <wwoods@redhat.com> 2016-01-15 15:58:00 -0500
committer Will Woods <wwoods@redhat.com> 2016-01-15 15:58:00 -0500
commit defc5dc9ff436b8a996c3f4f7ad00b358862cc36
tree b8b8ff9e360d8bbdb0880d70229292f97979e0ab
initial commit
The initial brain-dump / design document thing.
-rw-r--r-- weld-design.md 272
1 files changed, 272 insertions, 0 deletions
diff --git a/weld-design.md b/weld-design.md
new file mode 100644
index 0000000..658e38c
--- /dev/null
+++ b/weld-design.md
@@ -0,0 +1,272 @@
+# WELD MASTER DESIGN DOCUMENT: SUPER-EARLY DRAFT v0.2
+
+This is an experimental design for a Linux distribution.
+For the moment I'm calling it `weld`, for `W`ill's `E`xperimental `L`inux
+`D`istribution.
+
+Send questions/comments/suggestions to <wwoods@redhat.com>.
+
+_Will Woods, Wed 19 Aug 2015_
+
+## Terms used in this document
+
+### Objects: code, binaries, images, etc.
+
+* _Package_: a single upstream project, including branches (stable, unstable,
+ development, etc.)
+ * ex: `bash`, `glibc`
+* _Source Release_: a single moment in a single branch of a Package's sources.
+ * ex: `bash-4.0.tar.gz`, a git tag
+* _Build_: artifacts produced by building a given Source Release
+ * ex: binary RPM, `-doc` subpackages, `-devel` subpackages
+* _Layer_: a logical set of Packages that provide a certain API/ABI.
+ * ex: comps group (kinda), plus some API/ABI guarantees and definitions
+* _Image_: a set of built Layers, plus whatever metadata/modifications are
+ needed to make that image runnable in some context
+ * ex: EC2 images, `boot.iso`, Docker container images, etc.
+* _Build Environment_: an Image that contains everything needed to Build a
+ given Source Release.
+ * ex: `mock` chroots
+* _System_: a unique Image corresponding to a single logical machine.
+ This might be a generic Image with unique system-specific configuration
+ (e.g. host name, MAC address) overlaid on top, or a fully custom Image.
+ * ex: basically any virtual / contained / bare-metal system
+
+### People: users, audiences, roles
+
+* _Developers_: Write code, push to upstream source repo. Tag releases.
+* _Packagers_: Integrate upstream source into the distribution.
+ Add / maintain Dependencies and other metadata and enforce Distributor
+ policy.
+ Decide when to pull/tag upstream changes/releases. Sometimes also Developers.
+* _Release Engineers_: Compose and distribute Builds, Images, and other Objects.
+* _QA_: Responsible for developing and running integration tests and functional
+ tests.
+ (Generally *not* responsible for unit tests; those are the Developer's
+ responsibility.)
+* _Distributors_: Maintain the distribution as a whole; decide the contents of
+ the Layers/Images/Products, set policy about file names and system
+ capabilities.
+ (ex: Fedora, RHEL PM, corporate deployers)
+* _Sysadmins_: deploy Images to create Systems. Need to be able to apply
+ hotfixes, or at least identify which deployments have problems.
+ (Also known as "users".)
+* _ISVs_: Basically developer + packager; they want to be able to write their
+ code and provide it in a format that Sysadmins can apply to their Systems.
+* _Customers_: The people who consume the Platform and Products we make.
+ Mostly Sysadmins, Distributors, and ISVs.
+
+### Tasks: what do people want to do with these objects?
+
+* _Task_: Something a User is interested in doing with some Object or Objects:
+ * Sysadmin: run binaries
+ * Packager/Release Engineer: build binaries
+ * Release Engineer: compose Images
+ * QA: run Integration Tests on an Image
+ * QA: run a package's Unit Tests
+ * [etc.]
+* _Dependency_: a reference to an Object that is required to be present for
+ a certain Task to be performed.
+* _Environment_: The system environment (set of objects/builds) where a Task
+ takes place.
+ * Derived from the Dependencies of the given Task + Source Release.
+ * The required Environment for each Task will vary wildly between
+ types of Tasks, even within the same Package / Source Release.
+
+## REQUIREMENTS
+
+### Minimum Viable Product requirements:
+
+* _Distributors_: set/apply policy about build output (`%{_docdir}` etc.)
+* _Distributors_: set policy about post-build transformations (RPM `brp-*`)
+* _Distributors_: define what Packages are in each Layer (`comps.xml`)
+* _Packagers_: import new Source Releases of upstream Packages (`fedpkg new-sources`)
+* _Packagers_: apply patches to upstream code (`Patch1:`)
+* _Packagers_: add metadata about build requirements (`BuildRequires:`)
+* _Packagers_: add metadata about runtime requirements (`Requires:`)
+* _Packagers_: add metadata about version differences (`%changelog`, bodhi)
+* _Packagers_: add metadata to mark conflicting Packages (`Conflicts:`)
+* _Packagers_: add other metadata (e.g. crypto export info)
+* _Packagers_: create a local Build from sources (`fedpkg local`)
+* _Packagers_: tag source as ready for release (`fedpkg tag`)
+* _Packagers_: check out the sources for a tagged Source Release (`fedpkg prep`)
+* _Release Engineers_: create a Build Environment for a tagged Source Release (`mock` / Koji)
+* _Release Engineers_: create a new Build inside a fresh Build Environment (`mock` / Koji)
+* _Release Engineers_: build Images from a set of built Packages/Layers (`lorax`, `pungi`)
+* _Release Engineers_: publish Builds/Images
+* _Release Engineers_: sign Builds/Images (`sign_unsigned`, etc.)
+* _Release Engineers_: create + publish metadata about signed Builds/Images (`createrepo`, `mash`)
+* _Release Engineers_: produce source corresponding to any Build (`.src.rpm`)
+* _Release Engineers_: build variant Images with different stacks (SCLs)
+* _Sysadmins_: install Builds/Images to create a unique new System (`anaconda`, `yum install --installroot=...`)
+* _Sysadmins_: determine which Source Releases are in a Build/Image (`rpmdb`)
+* _Sysadmins_: find updated Builds/Images for existing Build/Image (`dnf`)
+* _Sysadmins_: apply a new Build/Image to an existing Image/System (`dnf update`)
+* _Distributors_: define new Layer/Image based on existing ones (`spin-kickstarts`, kinda)
+* _Distributors_: make and publish RPMs for legacy consumers
+
+## WORKFLOW
+
+### Current model: turn the crank
+
+#### Packager
+
+* Make/fetch tarball of upstream source
+* Upload tarball to cache
+* Write/update `.spec`:
+  * Write `%prep` script to unpack sources + apply patches
+  * Write `%build` script to build sources
+  * Write `%install` script to install build artifacts
+  * Modify `%install` to meet distribution policy
+  * Update `%files` list to list installed files
+  * Add `%post`/`%posttrans` scripts if needed by package
+  * Write `%changelog`
+* Add patches if needed:
+  * Commit patch to git
+  * Add `PatchX:` line to `.spec`
+  * Add `%patchX` line to `%prep`
+* Apply `.spec` changes to each release branch
+* Tag new `.spec` for each release
+* Initiate builds for each release
+  * Build process:
+    * Generate Build Environment:
+      * recursively depsolve `BuildRequires`
+      * uncompress + install depsolved packages
+    * `%prep`: unpack tarball + apply patches
+    * `%build`: build source into binaries
+    * `%install`: install binaries inside output directory
+    * gather files listed in `%files` from output directory
+    * create compressed archive of files
+    * repeat for each platform
+* File update requests for each release
+  * Choose one or more Builds
+  * Write update metadata
+
+#### Release Engineers
+
+* Push updates
+  * Update process (`bodhi`)
+    * Tag approved builds
+    * Depsolve approved builds and existing builds again (`mash`)
+    * Sign tagged packages (manual-ish by design)
+    * Make metadata for new builds
+* Build Images for new releases (`pungi`, `lorax`, `livecd-creator`, etc.)
+  * Depsolving, again
+  * Uncompress + extract archives
+  * Run scriptlets for each archive
+  * Run extra scripts to turn output into proper Image
+  * Repeat for each Image
+  * Repeat for each platform
+
+[TODO: ISVs, Distributors, Sysadmins]
+
+## DESIGN PRINCIPLES
+
+### _data, not code_
+
+* Static Analysis is a damn good idea and we should do more of it
+* In other words: _no shell scripts unless **absolutely necessary**_
+* `%files`: distro-wide policy; described/enforced with `udev`-style rules
+ * `FILENAME=="*.so" FILEPATH=="*/lib" ATTR[library]:=1`
+ * `ATTR[library]==1 RUN[posttrans]+="ldconfig"`
+* `%build`: _descriptive_ (not shell scripts!)
+ * `buildtype: autoconf` should be sufficient for most things!
+
+### Tradition isn't enough
+
+* Instead of working around problems, let's design better solutions
+* Be bold, but not foolish
+ * Design solutions _for the people who will use them_.
+ * Do your research. Newer isn't always better for the task.
+
+### You don't have to please everyone
+
+* Make something that works great for you
+* Make it easy for others to adapt to their needs
+* You don't have to change your goals to match someone else's
+
+## GOALS
+
+[FIXME: finish categorizing the list of goal items]
+
+1. Make packaging and release-engineering easier
+ * Git-style workflows everywhere
+ * New package build: `git fetch upstream`, merge, push
+ * New package update: `git tag -s`, push
+ * New (test) compose: edit manifest and push
+ * New release: `git tag -s` and push
+2. Better integration between Packages
+ * make it easy to check out the sources for an entire Layer
+ * package metadata is static data
+ * introspection and better tooling
+ * minimal boilerplate, fewer gnarly shell scripts
+ * importing from upstream should work like `git pull`
+ * tagging source as ready for release should work like `git tag`
+3. Make builds faster and easier
+ * avoid repeated compress/decompress cycles
+ * avoid repeated `configure` checks
+ * simplify Build Environment creation
+ * cache Build Environments
+ * generate Builder Containers for EC2 &c.
+ * put builds into something that de-duplicates them (`ostree`-ish)
+4. Make updates faster and more reliable
+ * Atomic, basically
+5. Enable Distributors and ISVs to easily publish their own stuff
+ * remixing the distro is just a `git clone` away
+ * `git pull` for merging new changes, etc, etc.
+
+* _Release Engineers_: duplicated data inside Builds should not be stored twice (like `git`)
+* _Sysadmins_: duplicated data inside Images should not be stored twice (like `git`)
+* _Build Process_: avoid compressing Builds before publishing (allow for
+ de-duplication + skip repeated compress/uncompress)
+* _Build Process_: don't re-run `configure` for every build
+* _Release Engineers_: creating Build Environments should be fast
+* _Release Engineers_: containerize Build Environments to build in The Cloud
+* _Sysadmins_: updates can be applied atomically
+* _Sysadmins_: updates can be easily rolled back
+* _Sysadmins_: non-unique parts of a System are read-only by default
+* _Distributors_: define Layers by moving per-package metadata files around a
+ git repo (`weld.git`)
+* _Distributors_: modify Layers by cloning/branching `weld.git`
+* _Sysadmins_: update metadata should be small and fast to download
+* _Distributors_: run tests when there are new Source Releases/Layers
+* Continuous Integration testing triggered for each push
+* User-installable Builds/Layers
+* TODO: Per-layer API/ABI/Service definitions
+* TODO: Design upgrades into this thing
+* TODO: ISVs target Layers (which have ABI/API guarantees) not
+ individual files/symbols
+
+## HOW DO WE GET THERE
+
+Piece by piece:
+
+* Rejigger dist-git into a Layer-based directory hierarchy
+  * _MAYBE_: each layer is a git repo, `dist-weld` just uses submodules?
+  * Or some other layering technique so ISVs/Distributors can add/replace..
+  * Build Layer (meta-)packages
+  * Make Images by piling up Layers
+  * Simpler metadata: `lang/python`'s 2.7 branch just `Requires: core >= 22.0`
+* Gradually redefine `.spec` to reduce manual work:
+  1. Obsolete bash scripts in `.spec`, section-by-section:
+     * Obsolete `%prep`: use git repo instead of tarball + patches
+     * Obsolete `%build`: define rules that handle common build "styles"
+       (autoconf, cmake, etc)
+     * Obsolete `%install` similarly
+     * Obsolete `%files`: define rules that apply tags to installed files by
+       location or contents
+     * Obsolete `%post`/`%posttrans`: define rules that run scriptlets based on
+       tags applied to files
+  2. Define new file format that can be "compiled" to generate a `.spec`,
+     replace `.spec` files altogether
+* Deduplication / avoid recompression:
+  0. dump build output into a big de-duplicating Content Store
+  0. generate RPMs from Content Store
+  0. Generate Images directly from Content Store
+  0. _SOMEDAY_: Don't bother distributing RPMs; just distribute Content Store
+     * __XXX NOTE__: is this even feasible??
+* Build speed:
+  * Cache results of `configure` and skip running it
+    * We do not need to check for Ultrix 15,000 times.
+
+[FIXME: update for 2016!]
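
The two `udev`-style rules under "_data, not code_" in the diff above can be read as plain data evaluated by a small matcher. What follows is a minimal Python sketch of that idea, not anything from the weld design itself. The dict layout, the `FileEntry` class and the `matches`/`apply_rules` helpers are all hypothetical names. It also assumes that `FILENAME` means the basename, `FILEPATH` means the containing directory, `:=` sets a tag and `+=` queues a scriptlet.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
import posixpath

# Hypothetical in-memory form of the two example rules from the design doc.
# Each rule is pure data: match conditions plus actions, no shell involved.
RULES = [
    {"match": {"FILENAME": "*.so", "FILEPATH": "*/lib"},
     "set_attr": {"library": "1"}},
    {"match": {"ATTR[library]": "1"},
     "run": {"posttrans": ["ldconfig"]}},
]


@dataclass
class FileEntry:
    path: str
    attrs: dict = field(default_factory=dict)


def matches(rule, entry):
    """Return True if every condition in the rule matches this file."""
    for key, pattern in rule["match"].items():
        if key == "FILENAME":
            value = posixpath.basename(entry.path)
        elif key == "FILEPATH":
            value = posixpath.dirname(entry.path)
        elif key.startswith("ATTR[") and key.endswith("]"):
            value = entry.attrs.get(key[5:-1], "")
        else:
            return False  # unknown key: fail closed
        if not fnmatchcase(value, pattern):
            return False
    return True


def apply_rules(paths, rules=RULES):
    """Tag files and collect scriptlets; rules are evaluated in order."""
    entries = [FileEntry(p) for p in paths]
    scriptlets = {}
    for entry in entries:
        for rule in rules:
            if not matches(rule, entry):
                continue
            entry.attrs.update(rule.get("set_attr", {}))
            for phase, cmds in rule.get("run", {}).items():
                queue = scriptlets.setdefault(phase, [])
                for cmd in cmds:
                    if cmd not in queue:  # queue each scriptlet only once
                        queue.append(cmd)
    return entries, scriptlets


if __name__ == "__main__":
    tagged, to_run = apply_rules(["/usr/lib/libfoo.so", "/usr/bin/foo"])
    for entry in tagged:
        print(entry.path, entry.attrs)
    print(to_run)
```

Nothing in the rules is executed while they are evaluated, so a policy checker could inspect or diff the rule set without running any package code. That is the static-analysis property the section argues for.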