1 files changed, 1011 insertions, 0 deletions
diff --git a/doc/rfc/rfc4648.txt b/doc/rfc/rfc4648.txt
new file mode 100644
index 0000000..c7599b4
--- /dev/null
+++ b/doc/rfc/rfc4648.txt
@@ -0,0 +1,1011 @@
+
+
+
+
+
+
+Network Working Group                                       S. Josefsson
+Request for Comments: 4648                                           SJD
+Obsoletes: 3548                                             October 2006
+Category: Standards Track
+
+
+             The Base16, Base32, and Base64 Data Encodings
+
+Status of This Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (2006).
+
+Abstract
+
+   This document describes the commonly used base 64, base 32, and base
+   16 encoding schemes.  It also discusses the use of line-feeds in
+   encoded data, use of padding in encoded data, use of non-alphabet
+   characters in encoded data, use of different encoding alphabets, and
+   canonical encodings.
+
+Josefsson                   Standards Track                     [Page 1]
+
+RFC 4648                    Base-N Encodings                October 2006
+
+
+Table of Contents
+
+   1. Introduction
+   2. Conventions Used in This Document
+   3. Implementation Discrepancies
+      3.1. Line Feeds in Encoded Data
+      3.2. Padding of Encoded Data
+      3.3. Interpretation of Non-Alphabet Characters in Encoded Data
+      3.4. Choosing the Alphabet
+      3.5. Canonical Encoding
+   4. Base 64 Encoding
+   5. Base 64 Encoding with URL and Filename Safe Alphabet
+   6. Base 32 Encoding
+   7. Base 32 Encoding with Extended Hex Alphabet
+   8. Base 16 Encoding
+   9. Illustrations and Examples
+   10. Test Vectors
+   11. ISO C99 Implementation of Base64
+   12. Security Considerations
+   13. Changes Since RFC 3548
+   14. Acknowledgements
+   15. Copying Conditions
+   16. References
+      16.1. Normative References
+      16.2. Informative References
+
+1.  Introduction
+
+   Base encoding of data is used in many situations to store or transfer
+   data in environments that, perhaps for legacy reasons, are restricted
+   to US-ASCII [1] data.  Base encoding can also be used in new
+   applications that do not have legacy restrictions, simply because it
+   makes it possible to manipulate objects with text editors.
+
+   In the past, different applications have had different requirements
+   and thus sometimes implemented base encodings in slightly different
+   ways.  Today, protocol specifications sometimes use base encodings in
+   general, and "base64" in particular, without a precise description or
+   reference.  Multipurpose Internet Mail Extensions (MIME) [4] is often
+   used as a reference for base64 without considering the consequences
+   for line-wrapping or non-alphabet characters.  The purpose of this
+   specification is to establish common alphabet and encoding
+   considerations.  This will hopefully reduce ambiguity in other
+   documents, leading to better interoperability.
+
+2.  Conventions Used in This Document
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in [2].
+
+3.  Implementation Discrepancies
+
+   Here we discuss the discrepancies between base encoding
+   implementations in the past and, where appropriate, mandate a
+   specific recommended behavior for the future.
+
+3.1.  Line Feeds in Encoded Data
+
+   MIME [4] is often used as a reference for base 64 encoding.  However,
+   MIME does not define "base 64" per se, but rather a "base 64 Content-
+   Transfer-Encoding" for use within MIME.  As such, MIME enforces a
+   limit on line length of base 64-encoded data to 76 characters.  MIME
+   inherits the encoding from Privacy Enhanced Mail (PEM) [3], stating
+   that it is "virtually identical"; however, PEM uses a line length of
+   64 characters.  The MIME and PEM limits are both due to limits within
+   SMTP.
+
+   Implementations MUST NOT add line feeds to base-encoded data unless
+   the specification referring to this document explicitly directs base
+   encoders to add line feeds after a specific number of characters.
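+
+   As a non-normative illustration of the rule above, the following
+   Python sketch contrasts an encoder that emits no line feeds with a
+   MIME-style encoder that wraps its output.  It assumes the Python
+   standard library "base64" module; the 100-octet sample input and
+   the variable names are arbitrary choices made for this example.

```python
import base64

data = bytes(range(100))  # arbitrary 100-octet sample input

# Default behaviour required by this document: no line feeds at all.
plain = base64.b64encode(data).decode("ascii")

# MIME-style Content-Transfer-Encoding: the library emits a line feed
# after every 76 output characters and after the final line.
mime = base64.encodebytes(data).decode("ascii")

# Stripping the line feeds recovers the plain encoding.
unwrapped = mime.replace("\n", "")
```

+   A referring specification that wants wrapped output would have to
+   say so explicitly, as MIME does; otherwise only the first form is
+   appropriate.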
+
+3.2.  Padding of Encoded Data
+
+   In some circumstances, the use of padding ("=") in base-encoded data
+   is not required or used.  In the general case, when assumptions about
+   the size of transported data cannot be made, padding is required to
+   yield correct decoded data.
+
+   Implementations MUST include appropriate pad characters at the end of
+   encoded data unless the specification referring to this document
+   explicitly states otherwise.
+
+   The base64 and base32 alphabets use padding, as described below in
+   sections 4 and 6, but the base16 alphabet does not need it; see
+   section 8.
+
+3.3.  Interpretation of Non-Alphabet Characters in Encoded Data
+
+   Base encodings use a specific, reduced alphabet to encode binary
+   data.  Non-alphabet characters could exist within base-encoded data,
+   caused by data corruption or by design.  Non-alphabet characters may
+   be exploited as a "covert channel", where non-protocol data can be
+   sent for nefarious purposes.  Non-alphabet characters might also be
+   sent in order to exploit implementation errors leading to, e.g.,
+   buffer overflow attacks.
+
+   Implementations MUST reject the encoded data if it contains
+   characters outside the base alphabet when interpreting base-encoded
+   data, unless the specification referring to this document explicitly
+   states otherwise.  Such specifications may instead state, as MIME
+   does, that characters outside the base encoding alphabet should
+   simply be ignored when interpreting data ("be liberal in what you
+   accept").  Note that this means that any adjacent carriage return/
+   line feed (CRLF) characters constitute "non-alphabet characters" and
+   are ignored.  Furthermore, such specifications MAY ignore the pad
+   character, "=", treating it as non-alphabet data, if it is present
+   before the end of the encoded data.  If more than the allowed number
+   of pad characters is found at the end of the string (e.g., a base 64
+   string terminated with "==="), the excess pad characters MAY also be
+   ignored.
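+
+   The two decoding policies described above can be sketched in Python
+   as follows.  This is a non-normative illustration that assumes the
+   standard library "base64" module; the function names
+   strict_b64decode and lenient_b64decode are hypothetical and exist
+   only for this example.

```python
import base64
import binascii


def strict_b64decode(text: str) -> bytes:
    """Decode base64, rejecting any character outside the alphabet."""
    try:
        # validate=True makes the decoder raise instead of silently
        # discarding non-alphabet characters such as CR, LF, or spaces.
        return base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64 input: {exc}") from exc


def lenient_b64decode(text: str) -> bytes:
    """MIME-like policy: non-alphabet characters are ignored."""
    return base64.b64decode(text, validate=False)
```

+   With these definitions, the input "Zm9v" followed by CRLF is
+   rejected by the strict decoder and decoded to "foo" by the lenient
+   one.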
+
+3.4.  Choosing the Alphabet
+
+   Different applications have different requirements on the characters
+   in the alphabet.  Here are a few requirements that determine which
+   alphabet should be used:
+
+   o  Handled by humans.  The characters "0" and "O" are easily
+      confused, as are "1", "l", and "I".  In the base32 alphabet below,
+      where 0 (zero) and 1 (one) are not present, a decoder may
+      interpret 0 as O, and 1 as I or L depending on case.  (However, by
+      default it should not; see previous section.)
+
+   o  Encoded into structures that mandate other requirements.  For base
+      16 and base 32, this determines the use of upper- or lowercase
+      alphabets.  For base 64, the non-alphanumeric characters (in
+      particular, "/") may be problematic in file names and URLs.
+
+   o  Used as identifiers.  Certain characters, notably "+" and "/" in
+      the base 64 alphabet, are treated as word-breaks by legacy text
+      search/index tools.
+
+   There is no universally accepted alphabet that fulfills all the
+   requirements.  For an example of a highly specialized variant, see
+   IMAP [8].  In this document, we document and name some currently used
+   alphabets.
+
+3.5.  Canonical Encoding
+
+   The padding step in base 64 and base 32 encoding can, if improperly
+   implemented, lead to non-significant alterations of the encoded data.
+   For example, if the input is only one octet for a base 64 encoding,
+   then all six bits of the first symbol are used, but only the first
+   two bits of the next symbol are used.  The remaining four bits of
+   that symbol are pad bits, which MUST be set to zero by conforming
+   encoders, as described in the padding descriptions below.  If this
+   property does not hold, there is no
+   canonical representation of base-encoded data, and multiple base-
+   encoded strings can be decoded to the same binary data.  If this
+   property (and others discussed in this document) holds, a canonical
+   encoding is guaranteed.
+
+   In some environments, the alteration is critical and therefore
+   decoders MAY choose to reject an encoding if the pad bits have not
+   been set to zero.  The referring specification may mandate a
+   specific behaviour.
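+
+   A decoder that wants to reject non-canonical input could check the
+   pad bits along the following lines.  This Python sketch is
+   non-normative; the helper name has_zero_pad_bits is hypothetical,
+   and the function assumes its input is otherwise well-formed padded
+   base 64 text.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def has_zero_pad_bits(encoded: str) -> bool:
    """Return True if the unused trailing bits of a padded base 64
    string are all zero.  Assumes `encoded` is otherwise well formed."""
    stripped = encoded.rstrip("=")
    pad_count = len(encoded) - len(stripped)
    if pad_count == 0:
        return True  # no partial quantum, hence no pad bits
    last_value = B64_ALPHABET.index(stripped[-1])
    # "==" leaves 4 unused bits in the last symbol; "=" leaves 2.
    unused_bits = 4 if pad_count == 2 else 2
    return last_value & ((1 << unused_bits) - 1) == 0
```

+   For example, "Zg==" is the canonical encoding of the single octet
+   "f", whereas "Zh==" carries the same octet with a non-zero pad bit
+   and fails this check.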
+
+4.  Base 64 Encoding
+
+   The following description of base 64 is derived from [3], [4], [5],
+   and [6].  This encoding may be referred to as "base64".
+
+   The Base 64 encoding is designed to represent arbitrary sequences of
+   octets in a form that allows the use of both upper- and lowercase
+   letters but that need not be human readable.
+
+   A 65-character subset of US-ASCII is used, enabling 6 bits to be
+   represented per printable character.  (The extra 65th character, "=",
+   is used to signify a special processing function.)
+
+   The encoding process represents 24-bit groups of input bits as output
+   strings of 4 encoded characters.  Proceeding from left to right, a
+   24-bit input group is formed by concatenating 3 8-bit input groups.
+   These 24 bits are then treated as 4 concatenated 6-bit groups, each
+   of which is translated into a single character in the base 64
+   alphabet.
+
+   Each 6-bit group is used as an index into an array of 64 printable
+   characters.  The character referenced by the index is placed in the
+   output string.
+
+                      Table 1: The Base 64 Alphabet
+
+     Value Encoding  Value Encoding  Value Encoding  Value Encoding
+         0 A            17 R            34 i            51 z
+         1 B            18 S            35 j            52 0
+         2 C            19 T            36 k            53 1
+         3 D            20 U            37 l            54 2
+         4 E            21 V            38 m            55 3
+         5 F            22 W            39 n            56 4
+         6 G            23 X            40 o            57 5
+         7 H            24 Y            41 p            58 6
+         8 I            25 Z            42 q            59 7
+         9 J            26 a            43 r            60 8
+        10 K            27 b            44 s            61 9
+        11 L            28 c            45 t            62 +
+        12 M            29 d            46 u            63 /
+        13 N            30 e            47 v
+        14 O            31 f            48 w         (pad) =
+        15 P            32 g            49 x
+        16 Q            33 h            50 y
+
+   Special processing is performed if fewer than 24 bits are available
+   at the end of the data being encoded.  A full encoding quantum is
+   always completed at the end of a quantity.  When fewer than 24 input
+   bits are available in an input group, bits with value zero are added
+   (on the right) to form an integral number of 6-bit groups.  Padding
+   at the end of the data is performed using the '=' character.  Since
+   all base 64 input is an integral number of octets, only the following
+   cases can arise:
+
+   (1) The final quantum of encoding input is an integral multiple of 24
+       bits; here, the final unit of encoded output will be an integral
+       multiple of 4 characters with no "=" padding.
+
+   (2) The final quantum of encoding input is exactly 8 bits; here, the
+       final unit of encoded output will be two characters followed by
+       two "=" padding characters.
+
+   (3) The final quantum of encoding input is exactly 16 bits; here, the
+       final unit of encoded output will be three characters followed by
+       one "=" padding character.
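+
+   The encoding procedure and the three padding cases above can be
+   sketched in Python as shown here.  The sketch is non-normative and
+   is written for readability rather than speed; the function name
+   base64_encode is chosen for this example only.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def base64_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Left-align the octets in a 24-bit quantum; missing octets
        # contribute zero bits on the right.
        quantum = int.from_bytes(group, "big") << (8 * (3 - len(group)))
        # Split the quantum into four 6-bit indices, high bits first.
        symbols = [B64_ALPHABET[(quantum >> shift) & 0x3F]
                   for shift in (18, 12, 6, 0)]
        # 3 octets -> 4 symbols; 2 octets -> 3 symbols and "=";
        # 1 octet -> 2 symbols and "==".
        keep = len(group) + 1
        out.append("".join(symbols[:keep]) + "=" * (4 - keep))
    return "".join(out)
```

+   Applied to the octets "f", "fo", and "foo", this yields "Zg==",
+   "Zm8=", and "Zm9v", matching the test vectors in section 10.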
+
+5.  Base 64 Encoding with URL and Filename Safe Alphabet
+
+   The Base 64 encoding with a URL and filename safe alphabet has been
+   used in [12].
+
+   An alternative alphabet has been suggested that would use "~" as the
+   63rd character.  Since the "~" character has special meaning in some
+   file system environments, the encoding described in this section is
+   recommended instead.  The remaining unreserved URI character is ".",
+   but some file system environments do not permit multiple "." in a
+   filename, thus making the "." character unattractive as well.
+
+   The pad character "=" is typically percent-encoded when used in a
+   URI [9], but if the data length is known implicitly, this can be
+   avoided by skipping the padding; see section 3.2.
+
+   This encoding may be referred to as "base64url".  This encoding
+   should not be regarded as the same as the "base64" encoding and
+   should not be referred to as only "base64".  Unless clarified
+   otherwise, "base64" refers to the base 64 in the previous section.
+
+   This encoding is technically identical to the previous one, except
+   for the alphabet characters at values 62 and 63, as indicated in
+   Table 2.
+
+         Table 2: The "URL and Filename safe" Base 64 Alphabet
+
+     Value Encoding  Value Encoding  Value Encoding  Value Encoding
+         0 A            17 R            34 i            51 z
+         1 B            18 S            35 j            52 0
+         2 C            19 T            36 k            53 1
+         3 D            20 U            37 l            54 2
+         4 E            21 V            38 m            55 3
+         5 F            22 W            39 n            56 4
+         6 G            23 X            40 o            57 5
+         7 H            24 Y            41 p            58 6
+         8 I            25 Z            42 q            59 7
+         9 J            26 a            43 r            60 8
+        10 K            27 b            44 s            61 9
+        11 L            28 c            45 t            62 - (minus)
+        12 M            29 d            46 u            63 _
+        13 N            30 e            47 v           (underline)
+        14 O            31 f            48 w
+        15 P            32 g            49 x
+        16 Q            33 h            50 y         (pad) =
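+
+   As a non-normative illustration, the Python sketch below shows the
+   "base64url" alphabet in use, together with the option of skipping
+   padding when the data length is known implicitly.  It assumes the
+   standard library "base64" module; the two helper names are
+   hypothetical.

```python
import base64


def b64url_encode_unpadded(data: bytes) -> str:
    """Encode with the URL and filename safe alphabet, without "="."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode_unpadded(text: str) -> bytes:
    """Decode base64url text whose padding was skipped."""
    # The padded length is always a multiple of 4, so the number of
    # missing "=" characters follows from the length alone.
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


sample = b"\xfb\xff"                           # exercises values 62, 63
standard = base64.b64encode(sample)            # uses "+" and "/"
url_safe = base64.urlsafe_b64encode(sample)    # uses "-" and "_"
```

+   Only the characters for values 62 and 63 differ between the two
+   outputs; every other symbol, and the padding rule, is unchanged.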
+
+6.  Base 32 Encoding
+
+   The following description of base 32 is derived from [11] (with
+   corrections).  This encoding may be referred to as "base32".
+
+   The Base 32 encoding is designed to represent arbitrary sequences of
+   octets in a form that needs to be case insensitive but that need not
+   be human readable.
+
+   A 33-character subset of US-ASCII is used, enabling 5 bits to be
+   represented per printable character.  (The extra 33rd character, "=",
+   is used to signify a special processing function.)
+
+   The encoding process represents 40-bit groups of input bits as output
+   strings of 8 encoded characters.  Proceeding from left to right, a
+   40-bit input group is formed by concatenating 5 8-bit input groups.
+   These 40 bits are then treated as 8 concatenated 5-bit groups, each
+   of which is translated into a single character in the base 32
+   alphabet.  When a bit stream is encoded via the base 32 encoding, the
+   bit stream must be presumed to be ordered with the
+   most-significant-bit first.  That is, the first bit in the stream
+   will be the high-order bit in the first 8-bit byte, the eighth bit
+   will be the low-order bit in the first 8-bit byte, and so on.
+
+   Each 5-bit group is used as an index into an array of 32 printable
+   characters.  The character referenced by the index is placed in the
+   output string.  These characters, identified in Table 3, below, are
+   selected from US-ASCII digits and uppercase letters.
+
+                     Table 3: The Base 32 Alphabet
+
+     Value Encoding  Value Encoding  Value Encoding  Value Encoding
+         0 A             9 J            18 S            27 3
+         1 B            10 K            19 T            28 4
+         2 C            11 L            20 U            29 5
+         3 D            12 M            21 V            30 6
+         4 E            13 N            22 W            31 7
+         5 F            14 O            23 X
+         6 G            15 P            24 Y         (pad) =
+         7 H            16 Q            25 Z
+         8 I            17 R            26 2
+
+   Special processing is performed if fewer than 40 bits are available
+   at the end of the data being encoded.  A full encoding quantum is
+   always completed at the end of a body.  When fewer than 40 input bits
+   are available in an input group, bits with value zero are added (on
+   the right) to form an integral number of 5-bit groups.  Padding at
+   the end of the data is performed using the "=" character.  Since all
+   base 32 input is an integral number of octets, only the following
+   cases can arise:
+
+   (1) The final quantum of encoding input is an integral multiple of 40
+       bits; here, the final unit of encoded output will be an integral
+       multiple of 8 characters with no "=" padding.
+
+   (2) The final quantum of encoding input is exactly 8 bits; here, the
+       final unit of encoded output will be two characters followed by
+       six "=" padding characters.
+
+   (3) The final quantum of encoding input is exactly 16 bits; here, the
+       final unit of encoded output will be four characters followed by
+       four "=" padding characters.
+
+   (4) The final quantum of encoding input is exactly 24 bits; here, the
+       final unit of encoded output will be five characters followed by
+       three "=" padding characters.
+
+   (5) The final quantum of encoding input is exactly 32 bits; here, the
+       final unit of encoded output will be seven characters followed by
+       one "=" padding character.
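+
+   The base 32 procedure and its five padding cases can be sketched in
+   Python as follows.  The sketch is non-normative; the function name
+   base32_encode and the lookup table _CHARS_FOR_OCTETS are
+   illustrative names, not part of any specification.

```python
BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

# Octets in the final quantum -> number of output characters kept;
# the remainder of the 8-character unit is filled with "=".
_CHARS_FOR_OCTETS = {1: 2, 2: 4, 3: 5, 4: 7, 5: 8}


def base32_encode(data: bytes, alphabet: str = BASE32_ALPHABET) -> str:
    out = []
    for i in range(0, len(data), 5):
        group = data[i:i + 5]
        # Left-align the octets in a 40-bit quantum; missing octets
        # contribute zero bits on the right.
        quantum = int.from_bytes(group, "big") << (8 * (5 - len(group)))
        # Split the quantum into eight 5-bit indices, high bits first.
        symbols = [alphabet[(quantum >> shift) & 0x1F]
                   for shift in range(35, -1, -5)]
        keep = _CHARS_FOR_OCTETS[len(group)]
        out.append("".join(symbols[:keep]) + "=" * (8 - keep))
    return "".join(out)
```

+   Passing a different 32-character alphabet as the second argument
+   changes only the symbol lookup, not the bit layout or the padding.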
+
+7.  Base 32 Encoding with Extended Hex Alphabet
+
+   The following description of base 32 is derived from [7].  This
+   encoding may be referred to as "base32hex".  This encoding should not
+   be regarded as the same as the "base32" encoding and should not be
+   referred to as only "base32".  This encoding is used by, e.g.,
+   NextSECure3 (NSEC3) [10].
+
+   One property of this alphabet, which the base64 and base32
+   alphabets lack, is that encoded data maintains its sort order when
+   the encoded data is compared bit-wise.
+
+   This encoding is identical to the previous one, except for the
+   alphabet.  The new alphabet is found in Table 4.
+
+                 Table 4: The "Extended Hex" Base 32 Alphabet
+
+         Value Encoding  Value Encoding  Value Encoding  Value Encoding
+             0 0             9 9            18 I            27 R
+             1 1            10 A            19 J            28 S
+             2 2            11 B            20 K            29 T
+             3 3            12 C            21 L            30 U
+             4 4            13 D            22 M            31 V
+             5 5            14 E            23 N
+             6 6            15 F            24 O         (pad) =
+             7 7            16 G            25 P
+             8 8            17 H            26 Q
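+
+   The sort-order property can be illustrated with the non-normative
+   Python sketch below.  It assumes the standard library "base64"
+   module and derives "base32hex" by translating the "base32" output
+   symbol by symbol; the helper name base32hex_encode is hypothetical.
+   The comparison uses two inputs of equal length, which keeps the pad
+   characters out of the comparison.

```python
import base64

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
_TO_HEX = bytes.maketrans(BASE32.encode("ascii"),
                          BASE32HEX.encode("ascii"))


def base32hex_encode(data: bytes) -> str:
    # Same bit layout as base32; only the alphabet differs, so the
    # standard encoding can be translated one symbol at a time.
    return base64.b32encode(data).translate(_TO_HEX).decode("ascii")


low, high = b"\x00", b"\xd0"   # low < high as octet strings

# base32hex: "00======" sorts before "Q0======", as the octets do.
hex_order_kept = base32hex_encode(low) < base32hex_encode(high)

# base32: "AA======" sorts after "2A======", reversing the order.
std_order_kept = base64.b32encode(low) < base64.b32encode(high)
```

+   The difference arises because the "Extended Hex" alphabet is in
+   ascending US-ASCII order, while the base32 alphabet places the
+   letters before the digits.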
+
+8.  Base 16 Encoding
+
+   The following description is original but analogous to previous
+   descriptions.  Essentially, Base 16 encoding is the standard case-
+   insensitive hex encoding and may be referred to as "base16" or "hex".
+
+   A 16-character subset of US-ASCII is used, enabling 4 bits to be
+   represented per printable character.
+
+   The encoding process represents 8-bit groups (octets) of input bits
+   as output strings of 2 encoded characters.  Proceeding from left to
+   right, an 8-bit input is taken from the input data.  These 8 bits are
+   then treated as 2 concatenated 4-bit groups, each of which is
+   translated into a single character in the base 16 alphabet.
+
+   Each 4-bit group is used as an index into an array of 16 printable
+   characters.  The character referenced by the index is placed in the
+   output string.
+
+                         Table 5: The Base 16 Alphabet
+
+         Value Encoding  Value Encoding  Value Encoding  Value Encoding
+             0 0             4 4             8 8            12 C
+             1 1             5 5             9 9            13 D
+             2 2             6 6            10 A            14 E
+             3 3             7 7            11 B            15 F
+
+   Unlike base 32 and base 64, no special padding is necessary since a
+   full code word is always available.
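+
+   A non-normative Python sketch of base 16 encoding follows, along
+   with the two decoding policies for letter case.  The function name
+   base16_encode is chosen for this example; the decoding calls assume
+   the standard library "base64" module.

```python
import base64

BASE16_ALPHABET = "0123456789ABCDEF"


def base16_encode(data: bytes) -> str:
    # High 4-bit group first, then the low 4-bit group, per octet.
    return "".join(BASE16_ALPHABET[octet >> 4]
                   + BASE16_ALPHABET[octet & 0x0F]
                   for octet in data)


# By default the library decoder accepts only the uppercase alphabet
# of Table 5.
upper_only = base64.b16decode("666F6F")

# Lowercase input is accepted only when case folding is requested.
folded = base64.b16decode("666f6f", casefold=True)
```

+   Whether lowercase input is acceptable is a matter for the referring
+   specification; section 12 notes the security implications of
+   case-insensitive handling.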
+
+9.  Illustrations and Examples
+
+   To translate between binary and a base encoding, the input is stored
+   in a structure, and the output is extracted.  The case for base 64 is
+   displayed in the following figure, borrowed from [5].
+
+            +--first octet--+-second octet--+--third octet--+
+            |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+            +-----------+---+-------+-------+---+-----------+
+            |5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|
+            +--1.index--+--2.index--+--3.index--+--4.index--+
+
+   The case for base 32 is shown in the following figure, borrowed from
+   [7].  Each successive character in a base-32 value represents 5
+   successive bits of the underlying octet sequence.  Thus, each group
+   of 8 characters represents a sequence of 5 octets (40 bits).
+
+                        1          2          3
+             01234567 89012345 67890123 45678901 23456789
+            +--------+--------+--------+--------+--------+
+            |< 1 >< 2| >< 3 ><|.4 >< 5.|>< 6 ><.|7 >< 8 >|
+            +--------+--------+--------+--------+--------+
+                                                    <===> 8th character
+                                              <====> 7th character
+                                         <===> 6th character
+                                   <====> 5th character
+                             <====> 4th character
+                        <===> 3rd character
+                  <====> 2nd character
+             <===> 1st character
+
+   The following example of Base64 data is from [5], with corrections.
+
+      Input data:  0x14fb9c03d97e
+      Hex:     1   4    f   b    9   c     | 0   3    d   9    7   e
+      8-bit:   00010100 11111011 10011100  | 00000011 11011001 01111110
+      6-bit:   000101 001111 101110 011100 | 000000 111101 100101 111110
+      Decimal: 5      15     46     28       0      61     37     62
+      Output:  F      P      u      c        A      9      l      +
+
+      Input data:  0x14fb9c03d9
+      Hex:     1   4    f   b    9   c     | 0   3    d   9
+      8-bit:   00010100 11111011 10011100  | 00000011 11011001
+                                                      pad with 00
+      6-bit:   000101 001111 101110 011100 | 000000 111101 100100
+      Decimal: 5      15     46     28       0      61     36
+                                                         pad with =
+      Output:  F      P      u      c        A      9      k      =
+
+      Input data:  0x14fb9c03
+      Hex:     1   4    f   b    9   c     | 0   3
+      8-bit:   00010100 11111011 10011100  | 00000011
+                                             pad with 0000
+      6-bit:   000101 001111 101110 011100 | 000000 110000
+      Decimal: 5      15     46     28       0      48
+                                                  pad with =      =
+      Output:  F      P      u      c        A      w      =      =
+
+10.  Test Vectors
+
+   BASE64("") = ""
+
+   BASE64("f") = "Zg=="
+
+   BASE64("fo") = "Zm8="
+
+   BASE64("foo") = "Zm9v"
+
+   BASE64("foob") = "Zm9vYg=="
+
+   BASE64("fooba") = "Zm9vYmE="
+
+   BASE64("foobar") = "Zm9vYmFy"
+
+   BASE32("") = ""
+
+   BASE32("f") = "MY======"
+
+   BASE32("fo") = "MZXQ===="
+
+   BASE32("foo") = "MZXW6==="
+
+   BASE32("foob") = "MZXW6YQ="
+
+   BASE32("fooba") = "MZXW6YTB"
+
+   BASE32("foobar") = "MZXW6YTBOI======"
+
+   BASE32-HEX("") = ""
+
+   BASE32-HEX("f") = "CO======"
+
+   BASE32-HEX("fo") = "CPNG===="
+
+   BASE32-HEX("foo") = "CPNMU==="
+
+   BASE32-HEX("foob") = "CPNMUOG="
+
+   BASE32-HEX("fooba") = "CPNMUOJ1"
+
+   BASE32-HEX("foobar") = "CPNMUOJ1E8======"
+
+   BASE16("") = ""
+
+   BASE16("f") = "66"
+
+   BASE16("fo") = "666F"
+
+   BASE16("foo") = "666F6F"
+
+   BASE16("foob") = "666F6F62"
+
+   BASE16("fooba") = "666F6F6261"
+
+   BASE16("foobar") = "666F6F626172"
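+
+   The vectors above lend themselves to an automated check.  The
+   following non-normative Python sketch runs the BASE64, BASE32, and
+   BASE16 vectors against the standard library "base64" module in both
+   directions.  The BASE32-HEX vectors are left out to keep the sketch
+   short, and the names VECTORS and check_vectors are hypothetical.

```python
import base64

# input octets -> (BASE64, BASE32, BASE16) encodings from this section
VECTORS = {
    b"": ("", "", ""),
    b"f": ("Zg==", "MY======", "66"),
    b"fo": ("Zm8=", "MZXQ====", "666F"),
    b"foo": ("Zm9v", "MZXW6===", "666F6F"),
    b"foob": ("Zm9vYg==", "MZXW6YQ=", "666F6F62"),
    b"fooba": ("Zm9vYmE=", "MZXW6YTB", "666F6F6261"),
    b"foobar": ("Zm9vYmFy", "MZXW6YTBOI======", "666F6F626172"),
}


def check_vectors() -> int:
    """Check every vector in both directions; return the count."""
    checked = 0
    for raw, (b64, b32, b16) in VECTORS.items():
        assert base64.b64encode(raw).decode("ascii") == b64
        assert base64.b32encode(raw).decode("ascii") == b32
        assert base64.b16encode(raw).decode("ascii") == b16
        # Decoding must invert encoding.
        assert base64.b64decode(b64, validate=True) == raw
        assert base64.b32decode(b32) == raw
        assert base64.b16decode(b16) == raw
        checked += 1
    return checked
```

+   An implementation under test can be substituted for the library
+   calls to reuse the same table.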
+
+11.  ISO C99 Implementation of Base64
+
+   An ISO C99 implementation of Base64 encoding and decoding that is
+   believed to follow all recommendations in this RFC is available from:
+
+      http://josefsson.org/base-encoding/
+
+   This code is not normative.
+
+   The code could not be included in this RFC for procedural reasons
+   (RFC 3978 section 5.4).
+
+12.  Security Considerations
+
+   When base encoding and decoding are implemented, care should be taken
+   not to introduce vulnerabilities to buffer overflow attacks, or other
+   attacks on the implementation.  A decoder should not break on invalid
+   input including, e.g., embedded NUL characters (ASCII 0).
+
+   If non-alphabet characters are ignored, instead of causing rejection
+   of the entire encoding (as recommended), a covert channel that can be
+   used to "leak" information is made possible.  The ignored characters
+   could also be used for other nefarious purposes, such as to avoid a
+   string equality comparison or to trigger implementation bugs.  The
+   implications of ignoring non-alphabet characters should be understood
+   in applications that do not follow the recommended practice.
+   Similarly, when the base 16 and base 32 alphabets are handled case
+   insensitively, alteration of case can be used to leak information or
+   make string equality comparisons fail.
+
+   When padding is used, there are some non-significant bits that
+   warrant security concerns, as they may be abused to leak information
+   or used to bypass string equality comparisons or to trigger
+   implementation problems.
+
+   Base encoding visually hides otherwise easily recognized information,
+   such as passwords, but does not provide any computational
+   confidentiality.  This has been known to cause security incidents
+   when, e.g., a user reports details of a network protocol exchange
+   (perhaps to illustrate some other problem) and accidentally reveals
+   the password because she is unaware that the base encoding does not
+   protect the password.
+
+   Base encoding adds no entropy to the plaintext, but it does increase
+   the amount of plaintext available and provide a signature for
+   cryptanalysis in the form of a characteristic probability
+   distribution.
+
+13.  Changes Since RFC 3548
+
+   Added the "base32 extended hex alphabet", needed to preserve sort
+   order of encoded data.
+
+   Referenced IMAP for the special Base64 encoding used there.
+
+   Fixed the example copied from RFC 2440.
+
+   Added security consideration about providing a signature for
+   cryptanalysis.
+
+   Added test vectors.
+
+   Fixed typos.
+
+14.  Acknowledgements
+
+   Several people offered comments and/or suggestions, including John E.
+   Hadstate, Tony Hansen, Gordon Mohr, John Myers, Chris Newman, and
+   Andrew Sieber.  Text used in this document is based on earlier RFCs
+   describing specific uses of various base encodings.  The author
+   acknowledges the RSA Laboratories for supporting the work that led to
+   this document.
+
+   This revised version is based in part on comments and/or suggestions
+   made by Roy Arends, Eric Blake, Brian E Carpenter, Elwyn Davies, Bill
+   Fenner, Sam Hartman, Ted Hardie, Per Hygum, Jelte Jansen, Clement
+   Kent, Tero Kivinen, Paul Kwiatkowski, and Ben Laurie.
+
+15.  Copying Conditions
+
+   Copyright (c) 2000-2006 Simon Josefsson
+
+   Regarding the abstract and sections 1, 3, 8, 10, 12, 13, and 14 of
+   this document, that were written by Simon Josefsson ("the author",
+   for the remainder of this section), the author makes no guarantees
+   and is not responsible for any damage resulting from its use.  The
+   author grants irrevocable permission to anyone to use, modify, and
+   distribute it in any way that does not diminish the rights of anyone
+   else to use, modify, and distribute it, provided that redistributed
+   derivative works do not contain misleading author or version
+   information and do not falsely purport to be IETF RFC documents.
+   Derivative works need not be licensed under similar terms.
+
+
+
+
+
+
+
+Josefsson                   Standards Track                    [Page 15]
+
+RFC 4648                    Base-N Encodings                October 2006
+
+
+16.  References
+
+16.1.  Normative References
+
+   [1]   Cerf, V., "ASCII format for network interchange", RFC 20,
+         October 1969.
+
+   [2]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
+         Levels", BCP 14, RFC 2119, March 1997.
+
+16.2.  Informative References
+
+   [3]   Linn, J., "Privacy Enhancement for Internet Electronic Mail:
+         Part I: Message Encryption and Authentication Procedures", RFC
+         1421, February 1993.
+
+   [4]   Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+         Extensions (MIME) Part One: Format of Internet Message Bodies",
+         RFC 2045, November 1996.
+
+   [5]   Callas, J., Donnerhacke, L., Finney, H., and R. Thayer,
+         "OpenPGP Message Format", RFC 2440, November 1998.
+
+   [6]   Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose,
+         "DNS Security Introduction and Requirements", RFC 4033, March
+         2005.
+
+   [7]   Klyne, G. and L. Masinter, "Identifying Composite Media
+         Features", RFC 2938, September 2000.
+
+   [8]   Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
+         4rev1", RFC 3501, March 2003.
+
+   [9]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
+         Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
+         January 2005.
+
+   [10]  Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNSSEC Hash
+         Authenticated Denial of Existence", Work in Progress, June
+         2006.
+
+   [11]  Myers, J., "SASL GSSAPI mechanisms", Work in Progress, May
+         2000.
+
+   [12]  Wilcox-O'Hearn, B., "Post to P2P-hackers mailing list",
+         http://zgp.org/pipermail/p2p-hackers/2001-September/000315.html,
+         September 2001.
+
+
+
+
+Josefsson                   Standards Track                    [Page 16]
+
+RFC 4648                    Base-N Encodings                October 2006
+
+
+Author's Address
+
+   Simon Josefsson
+   SJD
+   EMail: simon@josefsson.org
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Josefsson                   Standards Track                    [Page 17]
+
+RFC 4648                    Base-N Encodings                October 2006
+
+
+Full Copyright Statement
+
+   Copyright (C) The Internet Society (2006).
+
+   This document is subject to the rights, licenses and restrictions
+   contained in BCP 78, and except as set forth therein, the authors
+   retain all their rights.
+
+   This document and the information contained herein are provided on an
+   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
+   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
+   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
+   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+   The IETF takes no position regarding the validity or scope of any
+   Intellectual Property Rights or other rights that might be claimed to
+   pertain to the implementation or use of the technology described in
+   this document or the extent to which any license under such rights
+   might or might not be available; nor does it represent that it has
+   made any independent effort to identify any such rights.  Information
+   on the procedures with respect to rights in RFC documents can be
+   found in BCP 78 and BCP 79.
+
+   Copies of IPR disclosures made to the IETF Secretariat and any
+   assurances of licenses to be made available, or the result of an
+   attempt made to obtain a general license or permission for the use of
+   such proprietary rights by implementers or users of this
+   specification can be obtained from the IETF on-line IPR repository at
+   http://www.ietf.org/ipr.
+
+   The IETF invites any interested party to bring to its attention any
+   copyrights, patents or patent applications, or other proprietary
+   rights that may cover technology that may be required to implement
+   this standard.  Please address the information to the IETF at
+   ietf-ipr@ietf.org.
+
+Acknowledgement
+
+   Funding for the RFC Editor function is provided by the IETF
+   Administrative Support Activity (IASA).
+
+
+
+
+
+
+
+Josefsson                   Standards Track                    [Page 18]
+