1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
|
Directly consuming the repo
---------------------------
After fetching local repo copy:
$ python setup.py build # or python2 if it matters
$ ln -fs __root__/build/ccs_flatten .
$ # or ./run-check that should take care on its own
and you should be ready to go with:
$ ./run-dev ...
High-level Concepts
-------------------
There are a few concepts that tries to unify effective conversion
recipes in a relatively loosely coupled way and with clear
relationships:
. command is a composition of one or more filters (see below)
with some level of flexibility (tee-like splits, etc.);
special case is a "command alias" that can dynamically select
existing underlying command based on the actual environment
(commonly as per the OS/distribution and/or available commands)
. filter is an isolated set of steps from object _A_ of a format _X_
(see below) and having the respective internal representation _x_
to the same or modified object _A'_ of a format _Y_ and having
the respective internal representation _y_ (possibly _X_=_Y_
and/or _x_=_y_, whereas string representation is frequented)
. format is something having possibly many more-or-less interchangeable
representations (string, parsed XML/ElementTree, ...), whereas
the possible format transitions depend on the available filters;
note that the format alone does not fully specify/restrict the set
of values/properties of objects being held as these are further
refined by previously used uni-format filters (context of
previous transitions plays the role)
. protocol denotes a form of internal representation interchange
(either internal in the filters' chain or user-requested IO),
with these special kinds:
- 'native' denotes most natural/economic form of dealing with
(the instance of) given data format, e.g., XML -> parsed tree
- 'composite' denotes multi-channel interchange format
(consisting of atomic ones)
Command
-------
Figuring out the options/arguments for use to enter at the command line
is heavily based on introspection of the function being decorated by
`Command.deco`. The explanation for the help screen is supposed to
be encoded directly in the respective docstring as per the existing
samples. For DRY, you can also defer to another function (to be called
before the body of the main command-driving one, likewise being passed
command context and the appropriate keyword arguments of all originally
passed at the function entry) defining the common arguments at the tail
+ utilizing them if suitable: this function is passed as `_common` (see
`commands.ccs2pcs.ccs2pcs_needle` +`filters.XMLFilter.command_common`).
Filter
------
Format
------
When defining a purpose-specific one, you can still utilize the inheritance
and basic format distillation routines (`file` -> `bytestring` or viceversa).
Even more, these are implicitly tried (in a backtracking manner,
general-to-specific) if you redefine the distillation method with
`chained=True`.
Command context
---------------
One can imagine this as a glue rolling the key-value pairs directly
to the commands (unrestricted modification, as a point of control for
a subsequent processing), in more limited sense to filters, and very
restrictively (read-only and only a few items) to the formats.
It is a declared or shared, agreement-based set of keys that should
always are expected to be present in the particular context of use.
Strictly reserved, intended for public use, are:
- system: lowercased system specifier (e.g., linux)
- system_extra: tuple of string values with details about the system
(depending on ``system'', distro name, release, codename,
etc.; e.g. ``('fedora', '20', 'heisenbug')'' for ``linux'')
Furthermore, some parts of this implicit context are opened up to implicit
XSLT processing of XMLFilter as parameters:
- system -> system
- system_extra -> system_1, system_2, system_3 ...
(3 are guaranteed to always exist)
Low-level Terms
---------------
tuplist tuple, list, or set (also see utils.py)
Notable points
--------------
1. plugins has to use absolute (clufter.*) imports otherwise issues can
occur in some contexts of use
2. please use only lower-cased identifiers for plugins, UPPER-CASED are
reserved for implicit ones (technically, not plugins, but can be used
on a few places accepting plugins) such as XML
3. use cli_{un,}decor in the code imposes several assumptions that rather
be followed (some are also imposed by lexical structure of a valid
Python identifier, which is also the main reason for this dichotomy):
- command-line options as declared in the docstrings as well as actual
names of function parameters should always stick with underscore
notation (although they will be presented with dashes in CLI)
- all plugin-identifying identifiers are forced to use underscore
notation, but again, CLI will present it with dashes; also
CommandAlias-backed functions should return string already
promoted to such dash notation
4. some limitations:
- one particular filter can only occur once in the command's filter chain
- all symbols in the module for particular plugin shall share the common
prefix (mutually and also with respective module), which is considered
till the first underscore (if any, in static view, dash in live-one)
x. to ease the perception of embedded XSLT snippets in the Python files,
there is respective _vimrc_local.vim provided (in filters/cluster)
so either run the contained sequence manually, or install lh-vim [1]
(or get its minimal subset via [2]) -- that is also why it is strongly
recommended to enclose such snippets like this so it has the right
effect (pipe symbol denotes the start of the line):
|foo = '''\
| [...]
|'''
[1] http://code.google.com/p/lh-vim/
[2] http://fedorapeople.org/cgit/jpokorny/public_git/vim4projects.git
Interactive/exploratory use: IPython and friends
------------------------------------------------
For top-level modules, following prologue can be used:
execfile('_go')
or alternatively add this command-line option to ipython invocation:
ipython --InteractiveShellApp.exec_files='["_go"]'
and from now on, one can use relative imports conveniently:
from .utils_prog import which
Common problems run into (just for self-reference)
--------------------------------------------------
(lambda a, *b: (a, b))(*'file') -> 'f', ('ile',)
(lambda a, *b: (a, b))(*('file', )) -> 'file', ()
'123' if True else True, None -> ('123', None)
'123' if True else (True, None) -> '123'
|