An overview of information flow analysis


Apol supports the ability to automate the search for overt information
flows between two types.  The purpose of this analysis is to identify
undesirable or unexpected flows of information allowed by a Type
Enforcement (TE) policy.  For example, imagine that the type shadow_t
is assigned to the shadow password file /etc/shadow.  To determine all
the types to which information can flow from the shadow_t type (e.g,
indicating possible paths for encrypted passwords to be
unintentionally leaked), do a "flow from" analysis on the shadow_t
type.  Another example might be a firewall application where the
intent is to understand all flows allowed between two network
interfaces.

Information flow analysis in SELinux is challenging for several
reasons, including:

  + The TE policy mechanism is extremely flexible, allowing for good
    and bad flows to be easily specified, not necessarily by the
    policy writer's intent.
    
  + TE policies tend to be complex, with possibly tens of thousands of
    rules and hundreds of types, making it difficult for a policy
    writer to know all that is allowed.
  
  + SELinux currently supports over 50 object classes and hundreds of
    object permissions, each of which must be examined with their
    ability to allow information flow from/to its associated object
    class.
    
The remainder of this file provides an overview on how apol performs
information flow analysis.


What Is Overt Information Flow In SELinux?
------------------------------------------
Information flow is defined in terms of access allowed (not
necessarily whether that access is actually used).  In SELinux, all
objects and subjects have an associated type.  Generally speaking,
subjects can read or write objects, and thereby cause information to
flow into and out of objects, and into and out of themselves.  For
example, given two types (say subject_t and object_t) and a subject
(with subject_t type) able to read, but not write, an object (with
object_t type), a rule that would allow this access might look like
the following:

        allow subject_t object_t : {file link_file} read;

This case would have the following direct information flows for the
types subject_t and object_t:

        subject_t: FROM object_t
        
        object_t:  TO subject_t
        
If this were the only rule relating to these two types, there would be
no other direct information flows from or to either.

An information flow can only occur when a subject is involved; a flow
directly between two objects cannot exist since a subject is required
to cause action.  In SELinux, processes are generally the subject.
There are currently 58 object classes (including processes, which are
both subjects and objects).

In apol, the subject is easy to recognize; any type that is used in
the 'source' field of an allow rule is presumed to be associated with
a subject, usually as the domain type of some process.  The object
type is the type used in the 'target' field of an allow rule.

In the case of objects, the allow rule also explicitly identifies the
object classes for which the rule applies.  This fact results in a
complication for analyzing information flows; specifically that flows
between types are restricted by object classes.  A flow between types
is typically not allowed for all object classes, but for only those
classes identified.  So to be more precise, the direct information
flows allowed by the object rules for object_t in the example above
are:

        object_t [file, link_file]:  TO subject_t 
        
A perspective difference exists between source (subject) types and
target (object) types.  A read permission between a source type and a
target type is a flow out of the target (which is being read) and flow
into the source (which, being a process, is receiving the data being
read into its memory).


Object permission mappings
--------------------------
The above examples used 'read' permission, but described flows as 'in'
or 'out' or 'from' and 'to'.  In general, for information flow
analysis, the only access between subjects and objects that are of
interest, are read and write.  Remembering the perspective difference
mentioned above, read and write access results in the following flow
for subjects (sources) and objects (targets):

        SUBJECT: READ:  IN flow
                 WRITE: OUT flow
                  
        OBJECT:  READ:  OUT flow
                 WRITE: IN flow
                
   NOTE: A process can be either a subject or an object, so when the
   process object class is specified in the allow rule, the target
   type is associated with process object class and the object flow
   rules apply.
        
Although read and write access are the only access rights of interest
for an information flow analysis, 'read' and 'write' permissions are
not the only SELinux permissions of interest.  The name of a
permission does not necessarily imply whether it allows read or write
access.  Indeed, to perform an information flow analysis requires
mapping all defined permissions for all object classes to read and
write access.

This mapping can be a difficult chore, and certainly requires
extensive understanding of the access allowed by each of the hundreds
of permissions currently defined.  For example, the file object class
has the 'getattr' permission defined that allows the ability to
determine information about a file (such as date created and size).
One could consider this a read access since the subject is reading
information about the file.  Then again this begins to feel like
COVERT information flow analysis, where one is concerned about illicit
signaling of information through non-traditional means (e.g.,
signaling the critical data by varying the size of file is a covert
flow, writing the data directly in the file so it can be read is an
overt flow).  This type of decision must be made for each defined
object permission for each defined object class.

The permission mapping mechanism in apol allows each permission to be
mapped to read, write, both or none.  In addition, the tool attempts
to 'fix' a permission map to fit the needs of the currently opened
policy.  So, for example, if a permission map file does not map a set
of permissions, or skips an entire object class, apol will label the
missing permissions to "unmapped" and treat them as if they were
mapped to 'none.'  Likewise, if a map has permissions that are
undefined in the current policy, it will ignore those mappings.  In
this way, apol continues its tradition of supporting old and new
versions of policies (see below for more on managing permission maps).

Apol provides mechanisms to manage and customize permission mappings
that best suit the analyst's needs.  Use the Tools menu (see below) to
modify permission mappings.


Permission weighting
--------------------
In addition to mapping each permission to read, write, both, or none,
it is possible to assign the permission a weight between 1 and 10 (the
default is 10).  Apol uses this weight to rate the importance of the
information flow this permission represents and allows the user to
make fine-grained distinctions between high-bandwidth, overt
information flows and low-bandwidth, or difficult to exploit, covert
information flows.  For example, the permissions "read" and "write" on
the file object could be given a weight of 10 because they are very
high-bandwidth information flows.  Additionally, the "use" permission
on the fd object (file descriptor) would probably be given a weight of
1 as it is a very low-bandwidth covert flow at best.  Note that a
permission might be important for access control, like fd use, but be
given a low weight for information flow because it cannot be used to
pass large amounts of information.

The default permission maps that are installed with apol have weights
assigned for all of the permissions. The weights are in four general
categories as follows:

  1 - 2     difficult to exploit covert flows (example: fd:use)
  3 - 5     less difficult to exploit covert flows
            (example: process:signal)
  6 - 7     difficult to use, noisy, or low-bandwidth overt flows
            (example: file:setattr)
  8 - 10    high-bandwidth overt flows (example: file:write)

These categories are loosely defined and the placement of permissions
into these categories is subjective.  Additional work needs to be done
to verify the accuracy of both the mappings of the permissions and the
assigned weights.

These weights are used in transitive information flow analysis to rank
the results and to make certain that important paths between types are
presented first.  For example, consider a policy with the following
information flows:

        allow one_t two_t : file write;
        allow three_t two_t : file read;
        allow one_t three_t : fd use;

If the permissions were mapped as described above and an analysis of
the transitive flows from one_t to three_t were done, the analysis
would return the path one_t->two_t->three_t first because the read and
write permissions have a much higher weight.  The direct flow between
one_t and three_t would still be returned by the find more flows, but
it would appear later in the list of flows.


Types of information flow analysis
----------------------------------
The examples so far have only looked at 'direct' information flows.
As its name implies, direct information flow analysis examines a
policy for information flows that are directly permitted by one or
more allow rules.  In essence, every allow rule defines a direct
information flow between the source and target types (for those
allowed permissions that map to read, write, or both).  The direct
information flow analysis automates the search for these direct flows.

Transitive information flow analysis attempts to link together a
series of direct information flows to find an indirect path in which
information can flow between two types.  The results for a transitive
closure will show one or more steps in the chain required for
information to flow between the start and end types.  Currently, the
results will only show one such path for each end type; specifically
the shortest path.

For example, given the following rules:

        allow one_t two_t : file write;
        allow three_t two_t: file read;
        
A direct flow analysis between one_t and three_t would not show any
flows since no rule explicitly allows access between them.  However, a
two-step flow exists that would allow flow between these two types,
namely one_t writing information into a file type (two_t) that three_t
can read.  These are the types of flows that the transitive analysis
attempts to find.

For both analyses, the results are presented in a less-than-desirable
tree form (a more natural form might be a graph presentation;
presently we are not prepared for that type of investment into the
GUI).  Each node in the tree represents a flow (in the direction
selected) between the type of the parent node and the type of the
node.  The results window shows each step of the flow including the
contributing access rule(s).


Managing permission mappings
----------------------------
The ability to directly manage permission maps is important for the 
following reasons: 

  + Permission maps are central to analyzing information flows, and
    the correctness of the map has a direct influence on the value of
    the results.

  + The mapping for individual permissions and object classes are
    subjective, and changing permissions to alter the analysis might
    be necessary (e.g., by unmapping certain object classes to remove
    them from the analysis).

  + The analyst may be working with several different policies each
    with different definitions of object classes and permissions.

    
Because of these reasons, apol was designed to provide great latitude
in managing permission mappings using Tools menu.  A user need not
manage permission maps directly; apol is installed with default
permission maps (typically in /usr/local/share/setools-<version>/)
that will be loaded automatically when an information flow analysis is
performed.

Use the Tools menu to manually load a permission map from an arbitrary
file.  This capability allows the user to keep several versions of
permission map files, loading the correct one for a given analysis.

Although the user could view and modify mappings by editing a map file
directly, an easier (and less error-prone) approach is apol's perm map
viewer.  Select View Perm Map from the Tools menu to display all
object classes and permissions currently mapped (or unmapped) in the
currently loaded policy.  In addition, each permission's weight value
is shown.  These values tell apol the importance of each permission to
the analysis.  The user can configure these weight values according to
the analysis goals.  For example, the user may consider any read or
write permissions of highest importance to the analysis, whereas
permission to use a file descriptor may be of least importance.  A
permission will default to a weight of 10 if a weight value is not
provided for the permission in the permission map.

A user has access to the "default" permission file.  If there exists a
file named .apol_perm_mapping in his home directory (i.e.,
$HOME/.apol_perm_mapping), then it is used when opening the default
file.  Otherwise the default file will be read from SETools's
installed location, typically /usr/local/share/setools-<version>.  The
file .apol_perm_mapping is always used as the destination when saving
to the default permission file.

   NOTE: Only one permission map may be opened at a time, and only
   when a policy is already opened.  If apol has performed an
   information flow analysis, the default permission map will be
   loaded automatically unless a permission map was previously loaded.
   Closing the policy will also close any existing permission mapping.
   Unsaved changes will be lost.


Finding more flows
------------------
For a transitive information flow, there might be many different
information flows between two types.  For example, given the
following policy:

        allow one_t two_t : file write;
        allow three_t two_t: file read;
        allow four_t two_t: file read;
        allow four_t three_t: file write;

In this policy, two ways exist that information can flow between one_t
and three_t: through three_t and through three_t and four_t. In
complicated policies, many information flows between two types can
exist, but the initial transitive information flow analysis might not
find all of them.  For example, apol might only find the flow through
three_t and four_t initially in the policy above.  Apol provides the
means to find more information flows between two types after the
inital analysis is completed.  In the results display for an end type,
there is a link labeled "Find More Flows."  Clicking on the link will
bring up a dialog box that allows the user to set a maximum time
duration and a maximum number of flows.  Finding all of the paths
between two types could take a significant amount of time for a
complicated policy, so this dialog provides the means to set limits on
the search. The search will stop when either of the limits are met.
After the search completes, the additional paths will be displayed in
the same results tab.  Note that if a large number of flows are found
it may take the display several seconds to render the text.