# WELD DESIGN OVERVIEW

_Super-early first draft edition_  
_Will Woods <wwoods@redhat.com>, 18 January 2016_

This is a high-level overview of `weld`, an experimental Linux distribution.

## Goals

Make a Linux distribution that's **easy to work on**, **easy to customize**,
**quick to build or rebuild**, and **designed for the 21st century**.

### Easy to work on
* packager and developer operations should be `git`-like

### Easy to customize
* respins should just be a `git clone` + `git branch` away
* fewer fights over competing visions - fork and do what you want

### Quick to build or rebuild
* Rapid development and iteration allow better testing and faster fixes

### Designed for the 21st century
* Free of legacy cruft whenever it interferes with the above goals

## The super-simple overview:

1. Clone upstream packages into our own `git` mirrors of their repos
2. Build instructions + package metadata for all packages live in a single
   `git` repo
   * this is all _metadata_, not _code_
3. Build tools make binaries and dump them into a big binary store
4. Image build tools take binaries out of the binary store and make images
   * Build environments are also images that get built out of the binary store

## What we actually need to _do_

### Metadata (`.spec` + `comps.xml`)

We need to maintain metadata about packages and their relationships,
functionally equivalent to `.spec` files.

1. Research and define our metadata format
   * What file format? (maybe [TOML]?)
   * What data is necessary? (BuildRequires, Requires, upstream URLs, etc.)
   * What's the schema? (required / optional keys and sections?)
1. Gather/convert metadata from existing distro

### Binary storage (`.rpm` + `repodata/`)

We want an efficient way to store our build output and its associated metadata.

1. Research and define what metadata we need attached to binaries
   * package version, build time, build environment ID, build flags, ...
   * basically all the `rpm` headers (that we actually care about)
1. Research efficient ways of storing versioned binary files
   * possibly `ostree`?
1. Research efficient ways of maintaining metadata alongside those files
   * Key-value files in the binary store?
   * Key-value pairs stored by `ostree` itself?
   * External key-value store?

Together, the stored files and their metadata should be functionally equivalent
to binary `.rpm` packages.
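
As one illustration of the "key-value files in the binary store" option, the sketch below keeps each file under its SHA-256 digest and writes its metadata to a JSON file next to it. This is a toy, not a proposal to replace `ostree`; the class name `BinaryStore`, the directory layout, and the metadata keys are all assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class BinaryStore:
    """Toy content-addressed store: objects keyed by SHA-256, metadata as JSON."""

    def __init__(self, root):
        self.root = Path(root)
        (self.root / "objects").mkdir(parents=True, exist_ok=True)
        (self.root / "meta").mkdir(parents=True, exist_ok=True)

    def put(self, content: bytes, metadata: dict) -> str:
        digest = hashlib.sha256(content).hexdigest()
        # Identical content always maps to the same object, so it is stored once.
        (self.root / "objects" / digest).write_bytes(content)
        meta_path = self.root / "meta" / f"{digest}.json"
        meta_path.write_text(json.dumps(metadata, sort_keys=True))
        return digest

    def get(self, digest):
        content = (self.root / "objects" / digest).read_bytes()
        meta_path = self.root / "meta" / f"{digest}.json"
        return content, json.loads(meta_path.read_text())


store = BinaryStore(tempfile.mkdtemp())
key = store.put(
    b"\x7fELF...",  # placeholder bytes standing in for a built binary
    {"package": "bash", "version": "4.3", "build_env": "env-001"},
)
content, metadata = store.get(key)
print(key[:12], metadata["package"])
```

Note that keying metadata by content digest means two builds that produce identical bytes would share one metadata file, which is one reason the external key-value store option might be worth a look.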

### Packaging tools (`.src.rpm`)

We need tools and policies for pulling upstream sources into a local mirror.

* Write tool that uses metadata to pull upstream sources into a local git repo
  * Everything in git, even if it's not git upstream!
  * This means we need tools to handle svn, hg, or even just tarball + patches
    * (looking at you here, [`bash`])
* Decide how exactly to use branches and tags in our git mirrors
  * `upstream` branch that tracks upstream `master`
  * `distro` branch where we put our patches
  * what about projects that have multiple branches upstream?
    * `upstream-[branch]` and/or `distro-[branch]`?
  * Tag `upstream` branches with the appropriate upstream versions
  * Distro patches go onto our `distro` branches
    * what about `master` tho? do we care?
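
One possible answer to the branch-naming question, written down as code so it can be argued about: upstream's default branch maps to plain `upstream` / `distro`, and every other upstream branch gets a suffix. The function name and the `default_branch` parameter are hypothetical, and this is only one of several reasonable conventions.

```python
def mirror_branches(upstream_branch, default_branch="master"):
    """Return (upstream, distro) branch names in our mirror for one upstream branch."""
    if upstream_branch == default_branch:
        # The default branch gets the short names.
        return "upstream", "distro"
    # Any other upstream branch gets a suffixed pair.
    return f"upstream-{upstream_branch}", f"distro-{upstream_branch}"


for branch in ("master", "devel"):
    print(branch, "->", mirror_branches(branch))
```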

### Package build tools (`rpmbuild`)

We'll need tools that use our metadata to build binaries for individual
packages.

1. Write tool to clone/check-out the source on our `distro` branch
1. Write tool to build binaries as per the metadata
1. Write tool to install/copy binaries into the Binary Store
   * Tag binaries with appropriate metadata
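
The three steps above can be sketched as a dry-run planner that only lists the commands it would run. The metadata keys (`name`, `build`), the mirror URL, and the `store-put` command are placeholders invented for this example; only the `git clone --branch` invocation is a real command.

```python
import shlex


def plan_build(meta, mirror_url, workdir):
    """Return the ordered commands for one package build without running them."""
    src = f"{workdir}/{meta['name']}"
    # Step 1: check out the source from our `distro` branch.
    plan = [("checkout", ["git", "clone", "--branch", "distro", mirror_url, src])]
    # Step 2: build commands come straight from the (hypothetical) metadata.
    for step in meta["build"]:
        plan.append(("build", shlex.split(step)))
    # Step 3: placeholder for the store step; `store-put` is not a real tool.
    plan.append(("store", ["store-put", "--package", meta["name"], f"{src}/out"]))
    return plan


meta = {"name": "bash", "build": ["./configure --prefix=/usr", "make"]}
plan = plan_build(meta, "https://git.example.org/bash.git", "/tmp/build")
for stage, argv in plan:
    print(stage, shlex.join(argv))
```

Separating planning from execution like this would also make it easy to record the exact commands as build metadata.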

### Compose tools (`pungi`, `lorax`, `mock`, ...)

We need tools that pull binaries out of the Binary Store to create system
images, containers, build environments, and eventually RPMs for legacy users.

## How do we get there?

If we replace only one part of the current system at a time - keeping the inputs
and outputs compatible with existing infrastructure - we can work on each
part in parallel and replace things piece by piece.

_TODO: staffing/hardware/time estimates_

#### Replacing source RPMs
* Write tool to pull code from upstream into a local `git` mirror
* Use this to generate tarball+patchset for existing `.spec`

#### Replacing `spec`
* Define our own metadata
* Write tool that generates a valid `spec` from our metadata
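
A rough sketch of the generator idea: render the preamble of a `spec` from a metadata dict. The dict keys are assumptions, and a complete, buildable `spec` would also need `%prep`, `%build`, `%install`, and `%files` sections, which are left out here.

```python
def render_spec(meta):
    """Render a minimal RPM spec preamble from a metadata dict (keys are assumptions)."""
    lines = [
        f"Name:           {meta['name']}",
        f"Version:        {meta['version']}",
        f"Release:        {meta.get('release', '1')}%{{?dist}}",
        f"Summary:        {meta['summary']}",
        f"License:        {meta['license']}",
        f"URL:            {meta['url']}",
    ]
    # One tag line per dependency.
    lines += [f"BuildRequires:  {dep}" for dep in meta.get("build_requires", [])]
    lines += [f"Requires:       {dep}" for dep in meta.get("requires", [])]
    lines += ["", "%description", meta.get("description", meta["summary"]), ""]
    return "\n".join(lines)


meta = {
    "name": "bash",
    "version": "4.3",
    "summary": "The GNU Bourne Again shell",
    "license": "GPLv3+",
    "url": "https://www.gnu.org/software/bash/",
    "build_requires": ["gcc", "make"],
}
spec = render_spec(meta)
print(spec)
```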

#### Replacing binary RPMs
* Research and design/implement/use a Binary Content Store
  * Take `rpm`s in - store files, metadata, signatures, etc.
  * Dump `rpm`s back out, with proper headers and signatures

#### Replacing compose tools
* Requires the Binary Content Store
* Write tools that pull data/metadata direct from the Binary Content Store

[TOML]: https://github.com/toml-lang/toml
[`bash`]: http://ftp.gnu.org/gnu/bash/