draft-josefsson-rfc3548bis-02.txt | draft-josefsson-rfc3548bis-03.txt | |||
---|---|---|---|---|
Network Working Group S. Josefsson | Network Working Group S. Josefsson | |||
Internet-Draft SJD | ||||
Obsoletes: 3548 (if approved) | Obsoletes: 3548 (if approved) May 3, 2006 | |||
Expires: September 25, 2006 | Expires: November 4, 2006 | |||
The Base16, Base32, and Base64 Data Encodings | The Base16, Base32, and Base64 Data Encodings | |||
draft-josefsson-rfc3548bis-02 | draft-josefsson-rfc3548bis-03 | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 34 | skipping to change at page 1, line 34 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on September 25, 2006. | This Internet-Draft will expire on November 4, 2006. | |||
Copyright Notice | Copyright Notice | |||
Copyright (C) The Internet Society (2006). | Copyright (C) The Internet Society (2006). | |||
Keywords | Keywords | |||
Base Encoding, Base64, Base32, Base16, Hex. | Base Encoding, Base64, Base32, Base16, Hex. | |||
Abstract | Abstract | |||
This document describes the commonly used base 64, base 32, and base | This document describes the commonly used base 64, base 32, and base | |||
16 encoding schemes. It also discusses the use of line-feeds in | 16 encoding schemes. It also discusses the use of line-feeds in | |||
encoded data, use of padding in encoded data, use of non-alphabet | encoded data, use of padding in encoded data, use of non-alphabet | |||
characters in encoded data, and use of different encoding alphabets. | characters in encoded data, and use of different encoding alphabets. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Conventions Used in this Document . . . . . . . . . . . . . . 3 | 2. Conventions Used in this Document . . . . . . . . . . . . . . 3 | |||
3. Implementation discrepancies . . . . . . . . . . . . . . . . . 3 | 3. Implementation Discrepancies . . . . . . . . . . . . . . . . . 3 | |||
3.1. Line feeds in encoded data . . . . . . . . . . . . . . . . 3 | 3.1. Line Feeds In Encoded Data . . . . . . . . . . . . . . . . 3 | |||
3.2. Padding of encoded data . . . . . . . . . . . . . . . . . 4 | 3.2. Padding Of Encoded Data . . . . . . . . . . . . . . . . . 4 | |||
3.3. Interpretation of non-alphabet characters in encoded | 3.3. Interpretation Of Non-Alphabet Characters In Encoded | |||
data . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | data . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3.4. Choosing the alphabet . . . . . . . . . . . . . . . . . . 4 | 3.4. Choosing The Alphabet . . . . . . . . . . . . . . . . . . 4 | |||
4. Base 64 Encoding . . . . . . . . . . . . . . . . . . . . . . . 5 | 4. Base 64 Encoding . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
5. Base 64 Encoding with URL and Filename Safe Alphabet . . . . . 7 | 5. Base 64 Encoding With URL And Filename Safe Alphabet . . . . . 8 | |||
6. Base 32 Encoding . . . . . . . . . . . . . . . . . . . . . . . 7 | 6. Base 32 Encoding . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
7. Base 32 Encoding with Extended Hex Alphabet . . . . . . . . . 9 | 7. Base 32 Encoding With Extended Hex Alphabet . . . . . . . . . 10 | |||
8. Base 16 Encoding . . . . . . . . . . . . . . . . . . . . . . . 10 | 8. Base 16 Encoding . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
9. Illustrations and examples . . . . . . . . . . . . . . . . . . 11 | 9. Illustrations And Examples . . . . . . . . . . . . . . . . . . 12 | |||
10. Test vectors . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 10. Test Vectors . . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
11. ISO C99 Implementation of Base64 . . . . . . . . . . . . . . . 13 | 11. ISO C99 Implementation Of Base64 . . . . . . . . . . . . . . . 14 | |||
11.1. Prototypes: base64.h . . . . . . . . . . . . . . . . . . . 13 | 11.1. Prototypes: base64.h . . . . . . . . . . . . . . . . . . . 14 | |||
11.2. Implementation: base64.c . . . . . . . . . . . . . . . . . 15 | 11.2. Implementation: base64.c . . . . . . . . . . . . . . . . . 16 | |||
12. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | 12. Security Considerations . . . . . . . . . . . . . . . . . . . 25 | |||
13. Changes since RFC 3548 . . . . . . . . . . . . . . . . . . . . 24 | 13. Changes Since RFC 3548 . . . . . . . . . . . . . . . . . . . . 25 | |||
14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 24 | 14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
15. Copying conditions . . . . . . . . . . . . . . . . . . . . . . 25 | 15. Copying Conditions . . . . . . . . . . . . . . . . . . . . . . 26 | |||
16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25 | 16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
16.1. Normative References . . . . . . . . . . . . . . . . . . . 25 | 16.1. Normative References . . . . . . . . . . . . . . . . . . . 26 | |||
16.2. Informative References . . . . . . . . . . . . . . . . . . 25 | 16.2. Informative References . . . . . . . . . . . . . . . . . . 26 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 27 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
Intellectual Property and Copyright Statements . . . . . . . . . . 28 | Intellectual Property and Copyright Statements . . . . . . . . . . 29 | |||
1. Introduction | 1. Introduction | |||
Base encoding of data is used in many situations to store or transfer | Base encoding of data is used in many situations to store or transfer | |||
data in environments that, perhaps for legacy reasons, are restricted | data in environments that, perhaps for legacy reasons, are restricted | |||
to only US-ASCII [2] data. Base encoding can also be used in new | to only US-ASCII [1] data. Base encoding can also be used in new | |||
applications that do not have legacy restrictions, simply because it | applications that do not have legacy restrictions, simply because it | |||
makes it possible to manipulate objects with text editors. | makes it possible to manipulate objects with text editors. | |||
In the past, different applications have had different requirements | In the past, different applications have had different requirements | |||
and thus sometimes implemented base encodings in slightly different | and thus sometimes implemented base encodings in slightly different | |||
ways. Today, protocol specifications sometimes use base encodings in | ways. Today, protocol specifications sometimes use base encodings in | |||
general, and "base64" in particular, without a precise description or | general, and "base64" in particular, without a precise description or | |||
reference. MIME [4] is often used as a reference for base64 without | reference. Multipurpose Internet Mail Extensions (MIME) [4] is often | |||
considering the consequences for line-wrapping or non-alphabet | used as a reference for base64 without considering the consequences | |||
characters. The purpose of this specification is to establish common | for line-wrapping or non-alphabet characters. The purpose of this | |||
alphabet and encoding considerations. This will hopefully reduce | specification is to establish common alphabet and encoding | |||
ambiguity in other documents, leading to better interoperability. | considerations. This will hopefully reduce ambiguity in other | |||
documents, leading to better interoperability. | ||||
2. Conventions Used in this Document | 2. Conventions Used in this Document | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [1]. | document are to be interpreted as described in [2]. | |||
3. Implementation discrepancies | 3. Implementation Discrepancies | |||
Here we discuss the discrepancies between base encoding | Here we discuss the discrepancies between base encoding | |||
implementations in the past, and where appropriate, mandate a | implementations in the past, and where appropriate, mandate a | |||
specific recommended behavior for the future. | specific recommended behavior for the future. | |||
3.1. Line feeds in encoded data | 3.1. Line Feeds In Encoded Data | |||
MIME [4] is often used as a reference for base 64 encoding. However, | MIME [4] is often used as a reference for base 64 encoding. However, | |||
MIME does not define "base 64" per se, but rather a "base 64 Content- | MIME does not define "base 64" per se, but rather a "base 64 Content- | |||
Transfer-Encoding" for use within MIME. As such, MIME enforces a | Transfer-Encoding" for use within MIME. As such, MIME enforces a | |||
limit on line length of base 64 encoded data to 76 characters. MIME | limit on line length of base 64 encoded data to 76 characters. MIME | |||
inherits the encoding from PEM [3] stating it is "virtually | inherits the encoding from Privacy Enhanced Mail (PEM) [3] stating it | |||
identical", however PEM uses a line length of 64 characters. The | is "virtually identical", however PEM uses a line length of 64 | |||
MIME and PEM limits are both due to limits within SMTP. | characters. The MIME and PEM limits are both due to limits within | |||
SMTP. | ||||
Implementations MUST NOT add line feeds to base encoded data unless | Implementations MUST NOT add line feeds to base encoded data unless | |||
the specification referring to this document explicitly directs base | the specification referring to this document explicitly directs base | |||
encoders to add line feeds after a specific number of characters. | encoders to add line feeds after a specific number of characters. | |||
3.2. Padding of encoded data | 3.2. Padding Of Encoded Data | |||
In some circumstances, the use of padding ("=") in base encoded data | In some circumstances, the use of padding ("=") in base encoded data | |||
is neither required nor used. In the general case, when assumptions on | is neither required nor used. In the general case, when assumptions on | |||
size of transported data cannot be made, padding is required to yield | size of transported data cannot be made, padding is required to yield | |||
correct decoded data. | correct decoded data. | |||
Implementations MUST include appropriate pad characters at the end of | Implementations MUST include appropriate pad characters at the end of | |||
encoded data unless the specification referring to this document | encoded data unless the specification referring to this document | |||
explicitly states otherwise. | explicitly states otherwise. | |||
3.3. Interpretation of non-alphabet characters in encoded data | The base64 and base32 alphabets use padding, as described below in | |||
section 4 and 6, but the base16 alphabet does not need it, see | ||||
section 8. | ||||
3.3. Interpretation Of Non-Alphabet Characters In Encoded data | ||||
Base encodings use a specific, reduced, alphabet to encode binary | Base encodings use a specific, reduced, alphabet to encode binary | |||
data. Non alphabet characters could exist within base encoded data, | data. Non-alphabet characters could exist within base encoded data, | |||
caused by data corruption or by design. Non alphabet characters may | caused by data corruption or by design. Non-alphabet characters may | |||
be exploited as a "covert channel", where non-protocol data can be | be exploited as a "covert channel", where non-protocol data can be | |||
sent for nefarious purposes. Non alphabet characters might also be | sent for nefarious purposes. Non-alphabet characters might also be | |||
sent in order to exploit implementation errors leading to, e.g., | sent in order to exploit implementation errors leading to, e.g., | |||
buffer overflow attacks. | buffer overflow attacks. | |||
Implementations MUST reject the encoding if it contains characters | Implementations MUST reject the encoded data if it contains | |||
outside the base alphabet when interpreting base encoded data, unless | characters outside the base alphabet when interpreting base encoded | |||
the specification referring to this document explicitly states | data, unless the specification referring to this document explicitly | |||
otherwise. Such specifications may, as MIME does, instead state that | states otherwise. Such specifications may, as MIME does, instead | |||
characters outside the base encoding alphabet should simply be | state that characters outside the base encoding alphabet should | |||
ignored when interpreting data ("be liberal in what you accept"). | simply be ignored when interpreting data ("be liberal in what you | |||
Note that this means that any CRLF constitute "non alphabet | accept"). Note that this means that any adjacent carriage return/ | |||
characters" and are ignored. Furthermore, such specifications may | line feed (CRLF) characters constitute "non-alphabet characters" and | |||
consider the pad character, "=", as not part of the base alphabet | are ignored. Furthermore, such specifications MAY ignore the pad | |||
until the end of the string. If more than the allowed number of pad | character, "=", treating it as non-alphabet data, if it is present | |||
characters are found at the end of the string, e.g., a base 64 string | before the end of the encoded data. If more than the allowed number | |||
terminated with "===", the excess pad characters could be ignored. | of pad characters are found at the end of the string, e.g., a base 64 | |||
string terminated with "===", the excess pad characters MAY also be | ||||
ignored. | ||||
3.4. Choosing the alphabet | 3.4. Choosing The Alphabet | |||
Different applications have different requirements on the characters | Different applications have different requirements on the characters | |||
in the alphabet. Here are a few requirements that determine which | in the alphabet. Here are a few requirements that determine which | |||
alphabet should be used: | alphabet should be used: | |||
o Handled by humans. Characters "0", "O" are easily interchanged, | o Handled by humans. Characters "0", "O" are easily confused, as | |||
as well "1", "l" and "I". In the base32 alphabet below, where 0 | well as "1", "l" and "I". In the base32 alphabet below, where 0 | |||
(zero) and 1 (one) is not present, a decoder may interpret 0 as O, | (zero) and 1 (one) are not present, a decoder may interpret 0 as | |||
and 1 as I or L depending on case. (However, by default it should | O, and 1 as I or L depending on case. (However, by default it | |||
not, see previous section.) | should not, see previous section.) | |||
o Encoded into structures that place other requirements. For base | o Encoded into structures that mandate other requirements. For base | |||
16 and base 32, this determines the use of upper- or lowercase | 16 and base 32, this determines the use of upper- or lowercase | |||
alphabets. For base 64, the non-alphanumeric characters (in | alphabets. For base 64, the non-alphanumeric characters (in | |||
particular "/") may be problematic in file names and URLs. | particular "/") may be problematic in file names and URLs. | |||
o Used as identifiers. Certain characters, notably "+" and "/" in | o Used as identifiers. Certain characters, notably "+" and "/" in | |||
the base 64 alphabet, are treated as word-breaks by legacy text | the base 64 alphabet, are treated as word-breaks by legacy text | |||
search/index tools. | search/index tools. | |||
There is no universally accepted alphabet that fulfills all the | There is no universally accepted alphabet that fulfills all the | |||
requirements. For an example of a highly specialized variant, see | requirements. For an example of a highly specialized variant, see | |||
IMAP [8]. In this document, we document and name some currently used | IMAP [8]. In this document, we document and name some currently used | |||
alphabets. | alphabets. | |||
4. Base 64 Encoding | 4. Base 64 Encoding | |||
The following description of base 64 is due to [3], [4], [5] and [6]. | The following description of base 64 is derived from [3], [4], [5] | |||
and [6]. This encoding may be referred to as "base64". | ||||
The Base 64 encoding is designed to represent arbitrary sequences of | The Base 64 encoding is designed to represent arbitrary sequences of | |||
octets in a form that requires case sensitivity but need not be | octets in a form that allows the use of both upper- and lowercase | |||
humanly readable. | letters but need not be humanly readable. | |||
A 65-character subset of US-ASCII is used, enabling 6 bits to be | A 65-character subset of US-ASCII is used, enabling 6 bits to be | |||
represented per printable character. (The extra 65th character, "=", | represented per printable character. (The extra 65th character, "=", | |||
is used to signify a special processing function.) | is used to signify a special processing function.) | |||
The encoding process represents 24-bit groups of input bits as output | The encoding process represents 24-bit groups of input bits as output | |||
strings of 4 encoded characters. Proceeding from left to right, a | strings of 4 encoded characters. Proceeding from left to right, a | |||
24-bit input group is formed by concatenating 3 8-bit input groups. | 24-bit input group is formed by concatenating 3 8-bit input groups. | |||
These 24 bits are then treated as 4 concatenated 6-bit groups, each | These 24 bits are then treated as 4 concatenated 6-bit groups, each | |||
of which is translated into a single digit in the base 64 alphabet. | of which is translated into a single character in the base 64 | |||
alphabet. | ||||
Each 6-bit group is used as an index into an array of 64 printable | Each 6-bit group is used as an index into an array of 64 printable | |||
characters. The character referenced by the index is placed in the | characters. The character referenced by the index is placed in the | |||
output string. | output string. | |||
Table 1: The Base 64 Alphabet | Table 1: The Base 64 Alphabet | |||
Value Encoding Value Encoding Value Encoding Value Encoding | Value Encoding Value Encoding Value Encoding Value Encoding | |||
0 A 17 R 34 i 51 z | 0 A 17 R 34 i 51 z | |||
1 B 18 S 35 j 52 0 | 1 B 18 S 35 j 52 0 | |||
skipping to change at page 7, line 5 | skipping to change at page 8, line 5 | |||
multiple of 4 characters with no "=" padding, | multiple of 4 characters with no "=" padding, | |||
(2) the final quantum of encoding input is exactly 8 bits; here, the | (2) the final quantum of encoding input is exactly 8 bits; here, the | |||
final unit of encoded output will be two characters followed by two | final unit of encoded output will be two characters followed by two | |||
"=" padding characters, or | "=" padding characters, or | |||
(3) the final quantum of encoding input is exactly 16 bits; here, the | (3) the final quantum of encoding input is exactly 16 bits; here, the | |||
final unit of encoded output will be three characters followed by one | final unit of encoded output will be three characters followed by one | |||
"=" padding character. | "=" padding character. | |||
5. Base 64 Encoding with URL and Filename Safe Alphabet | 5. Base 64 Encoding With URL And Filename Safe Alphabet | |||
The Base 64 encoding with a URL and filename safe alphabet has been | The Base 64 encoding with a URL and filename safe alphabet has been | |||
used in [10]. | used in [11]. | |||
An alternative alphabet has been suggested that used "~" as the 63rd | An alternative alphabet has been suggested that used "~" as the 63rd | |||
character. Since the "~" character has special meaning in some file | character. Since the "~" character has special meaning in some file | |||
system environments, the encoding described in this section is | system environments, the encoding described in this section is | |||
recommended instead. | recommended instead. | |||
This encoding should not be regarded as the same as the "base64" | This encoding may be referred to as "base64url". This encoding | |||
encoding, and should not be referred to as only "base64". Unless | should not be regarded as the same as the "base64" encoding, and | |||
made clear, "base64" refers to the base 64 in the previous section. | should not be referred to as only "base64". Unless made clear, | |||
"base64" refers to the base 64 in the previous section. | ||||
This encoding is technically identical to the previous one, except | This encoding is technically identical to the previous one, except | |||
for the 62nd and 63rd alphabet characters, as indicated in Table 2. | for the 62nd and 63rd alphabet characters, as indicated in Table 2. | |||
Table 2: The "URL and Filename safe" Base 64 Alphabet | Table 2: The "URL and Filename safe" Base 64 Alphabet | |||
Value Encoding Value Encoding Value Encoding Value Encoding | Value Encoding Value Encoding Value Encoding Value Encoding | |||
0 A 17 R 34 i 51 z | 0 A 17 R 34 i 51 z | |||
1 B 18 S 35 j 52 0 | 1 B 18 S 35 j 52 0 | |||
2 C 19 T 36 k 53 1 | 2 C 19 T 36 k 53 1 | |||
3 D 20 U 37 l 54 2 | 3 D 20 U 37 l 54 2 | |||
4 E 21 V 38 m 55 3 | 4 E 21 V 38 m 55 3 | |||
5 F 22 W 39 n 56 4 | 5 F 22 W 39 n 56 4 | |||
6 G 23 X 40 o 57 5 | 6 G 23 X 40 o 57 5 | |||
7 H 24 Y 41 p 58 6 | 7 H 24 Y 41 p 58 6 | |||
8 I 25 Z 42 q 59 7 | 8 I 25 Z 42 q 59 7 | |||
9 J 26 a 43 r 60 8 | 9 J 26 a 43 r 60 8 | |||
10 K 27 b 44 s 61 9 | 10 K 27 b 44 s 61 9 | |||
11 L 28 c 45 t 62 - (minus) | 11 L 28 c 45 t 62 - (minus) | |||
12 M 29 d 46 u 63 _ | 12 M 29 d 46 u 63 _ | |||
13 N 30 e 47 v (understrike) | 13 N 30 e 47 v (underline) | |||
14 O 31 f 48 w | 14 O 31 f 48 w | |||
15 P 32 g 49 x | 15 P 32 g 49 x | |||
16 Q 33 h 50 y (pad) = | 16 Q 33 h 50 y (pad) = | |||
6. Base 32 Encoding | 6. Base 32 Encoding | |||
The following description of base 32 is due to [9] (with | The following description of base 32 is derived from [10] (with | |||
corrections). | corrections). This encoding may be referred to as "base32". | |||
The Base 32 encoding is designed to represent arbitrary sequences of | The Base 32 encoding is designed to represent arbitrary sequences of | |||
octets in a form that needs to be case insensitive but need not be | octets in a form that needs to be case insensitive but need not be | |||
humanly readable. | humanly readable. | |||
A 33-character subset of US-ASCII is used, enabling 5 bits to be | A 33-character subset of US-ASCII is used, enabling 5 bits to be | |||
represented per printable character. (The extra 33rd character, "=", | represented per printable character. (The extra 33rd character, "=", | |||
is used to signify a special processing function.) | is used to signify a special processing function.) | |||
The encoding process represents 40-bit groups of input bits as output | The encoding process represents 40-bit groups of input bits as output | |||
strings of 8 encoded characters. Proceeding from left to right, a | strings of 8 encoded characters. Proceeding from left to right, a | |||
40-bit input group is formed by concatenating 5 8-bit input groups. | 40-bit input group is formed by concatenating 5 8-bit input groups. | |||
These 40 bits are then treated as 8 concatenated 5-bit groups, each | These 40 bits are then treated as 8 concatenated 5-bit groups, each | |||
of which is translated into a single digit in the base 32 alphabet. | of which is translated into a single character in the base 32 | |||
When encoding a bit stream via the base 32 encoding, the bit stream | alphabet. When encoding a bit stream via the base 32 encoding, the | |||
must be presumed to be ordered with the most-significant-bit first. | bit stream must be presumed to be ordered with the most-significant- | |||
That is, the first bit in the stream will be the high-order bit in | bit first. That is, the first bit in the stream will be the high- | |||
the first 8-bit byte, and the eighth bit will be the low-order bit in | order bit in the first 8-bit byte, and the eighth bit will be the low- | |||
the first 8-bit byte, and so on. | order bit in the first 8-bit byte, and so on. | |||
Each 5-bit group is used as an index into an array of 32 printable | Each 5-bit group is used as an index into an array of 32 printable | |||
characters. The character referenced by the index is placed in the | characters. The character referenced by the index is placed in the | |||
output string. These characters, identified in Table 3, below, are | output string. These characters, identified in Table 3, below, are | |||
selected from US-ASCII digits and uppercase letters. | selected from US-ASCII digits and uppercase letters. | |||
Table 3: The Base 32 Alphabet | Table 3: The Base 32 Alphabet | |||
Value Encoding Value Encoding Value Encoding Value Encoding | Value Encoding Value Encoding Value Encoding Value Encoding | |||
0 A 9 J 18 S 27 3 | 0 A 9 J 18 S 27 3 | |||
skipping to change at page 9, line 18 | skipping to change at page 10, line 18 | |||
"=" padding characters, | "=" padding characters, | |||
(4) the final quantum of encoding input is exactly 24 bits; here, the | (4) the final quantum of encoding input is exactly 24 bits; here, the | |||
final unit of encoded output will be five characters followed by | final unit of encoded output will be five characters followed by | |||
three "=" padding characters, or | three "=" padding characters, or | |||
(5) the final quantum of encoding input is exactly 32 bits; here, the | (5) the final quantum of encoding input is exactly 32 bits; here, the | |||
final unit of encoded output will be seven characters followed by one | final unit of encoded output will be seven characters followed by one | |||
"=" padding character. | "=" padding character. | |||
7. Base 32 Encoding with Extended Hex Alphabet | 7. Base 32 Encoding With Extended Hex Alphabet | |||
The following description of base 32 is due to [7]. This encoding | The following description of base 32 is derived from [7]. This | |||
should not be regarded as the same as the "base32" encoding, and | encoding may be referred to as "base32hex". This encoding should not | |||
should not be referred to as only "base32". | be regarded as the same as the "base32" encoding, and should not be | |||
referred to as only "base32". This encoding is used by, e.g., NSEC3 | ||||
[9]. | ||||
One property of this alphabet, which the base64 and base32 alphabets | One property of this alphabet, which the base64 and base32 alphabets | |||
lack, is that encoded data maintains its sort order when the encoded | lack, is that encoded data maintains its sort order when the encoded | |||
data is compared bit-wise. | data is compared bit-wise. | |||
This encoding is identical to the previous one, except for the | This encoding is identical to the previous one, except for the | |||
alphabet. The new alphabet is found in Table 4. | alphabet. The new alphabet is found in Table 4. | |||
Table 4: The "Extended Hex" Base 32 Alphabet | Table 4: The "Extended Hex" Base 32 Alphabet | |||
skipping to change at page 10, line 19 | skipping to change at page 11, line 19 | |||
insensitive hex encoding, and may be referred to as "base16" or | insensitive hex encoding, and may be referred to as "base16" or | |||
"hex". | "hex". | |||
A 16-character subset of US-ASCII is used, enabling 4 bits to be | A 16-character subset of US-ASCII is used, enabling 4 bits to be | |||
represented per printable character. | represented per printable character. | |||
The encoding process represents 8-bit groups (octets) of input bits | The encoding process represents 8-bit groups (octets) of input bits | |||
as output strings of 2 encoded characters. Proceeding from left to | as output strings of 2 encoded characters. Proceeding from left to | |||
right, an 8-bit input is taken from the input data. These 8 bits are | right, an 8-bit input is taken from the input data. These 8 bits are | |||
then treated as 2 concatenated 4-bit groups, each of which is | then treated as 2 concatenated 4-bit groups, each of which is | |||
translated into a single digit in the base 16 alphabet. | translated into a single character in the base 16 alphabet. | |||
Each 4-bit group is used as an index into an array of 16 printable | Each 4-bit group is used as an index into an array of 16 printable | |||
characters. The character referenced by the index is placed in the | characters. The character referenced by the index is placed in the | |||
output string. | output string. | |||
Table 5: The Base 16 Alphabet | Table 5: The Base 16 Alphabet | |||
Value Encoding Value Encoding Value Encoding Value Encoding | Value Encoding Value Encoding Value Encoding Value Encoding | |||
0 0 4 4 8 8 12 C | 0 0 4 4 8 8 12 C | |||
1 1 5 5 9 9 13 D | 1 1 5 5 9 9 13 D | |||
2 2 6 6 10 A 14 E | 2 2 6 6 10 A 14 E | |||
3 3 7 7 11 B 15 F | 3 3 7 7 11 B 15 F | |||
Unlike base 32 and base 64, no special padding is necessary since a | Unlike base 32 and base 64, no special padding is necessary since a | |||
full code word is always available. | full code word is always available. | |||
9. Illustrations and examples | 9. Illustrations And Examples | |||
To translate between binary and a base encoding, the input is stored | To translate between binary and a base encoding, the input is stored | |||
in a structure and the output is extracted. The case for base 64 is | in a structure and the output is extracted. The case for base 64 is | |||
displayed in the following figure, borrowed from [5]. | displayed in the following figure, borrowed from [5]. | |||
+--first octet--+-second octet--+--third octet--+ | +--first octet--+-second octet--+--third octet--+ | |||
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| | |||
+-----------+---+-------+-------+---+-----------+ | +-----------+---+-------+-------+---+-----------+ | |||
|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0| | |5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0| | |||
+--1.index--+--2.index--+--3.index--+--4.index--+ | +--1.index--+--2.index--+--3.index--+--4.index--+ | |||
skipping to change at page 12, line 30 | skipping to change at page 13, line 30 | |||
Input data: 0x14fb9c03 | Input data: 0x14fb9c03 | |||
Hex: 1 4 f b 9 c | 0 3 | Hex: 1 4 f b 9 c | 0 3 | |||
8-bit: 00010100 11111011 10011100 | 00000011 | 8-bit: 00010100 11111011 10011100 | 00000011 | |||
pad with 0000 | pad with 0000 | |||
6-bit: 000101 001111 101110 011100 | 000000 110000 | 6-bit: 000101 001111 101110 011100 | 000000 110000 | |||
Decimal: 5 15 46 28 0 48 | Decimal: 5 15 46 28 0 48 | |||
pad with = = | pad with = = | |||
Output: F P u c A w = = | Output: F P u c A w = = | |||
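The worked example above can be reproduced with a short C sketch. It is not the draft's own implementation from Section 11: the name base64_encode_sketch and its interface are assumptions chosen for illustration. It gathers up to three octets into a 24-bit group, extracts four 6-bit indexes, and emits "=" where the final group is incomplete.

```c
#include <stddef.h>

/* Illustrative only (hypothetical name): encode INLEN octets from IN as
   base 64 into OUT.  OUT must have room for 4 * ((INLEN + 2) / 3) + 1
   bytes, including the terminating NUL. */
void
base64_encode_sketch (const unsigned char *in, size_t inlen, char *out)
{
  static const char b64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  while (inlen > 0)
    {
      /* Build a 24-bit group; octets missing at the end count as zero. */
      unsigned long group = (unsigned long) in[0] << 16;
      if (inlen > 1)
        group |= (unsigned long) in[1] << 8;
      if (inlen > 2)
        group |= in[2];

      /* Four 6-bit indexes; the last one or two become '=' padding
         when fewer than three input octets remain. */
      *out++ = b64[(group >> 18) & 0x3F];
      *out++ = b64[(group >> 12) & 0x3F];
      *out++ = inlen > 1 ? b64[(group >> 6) & 0x3F] : '=';
      *out++ = inlen > 2 ? b64[group & 0x3F] : '=';

      size_t used = inlen < 3 ? inlen : 3;
      in += used;
      inlen -= used;
    }
  *out = '\0';
}
```

Called with the four octets 0x14, 0xfb, 0x9c, 0x03, the sketch produces the string "FPucAw==", the same output as the figure.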
10. Test vectors | 10. Test Vectors | |||
BASE64("") = "" | BASE64("") = "" | |||
BASE64("f") = "Zg==" | BASE64("f") = "Zg==" | |||
BASE64("fo") = "Zm8=" | BASE64("fo") = "Zm8=" | |||
BASE64("foo") = "Zm9v" | BASE64("foo") = "Zm9v" | |||
BASE64("foob") = "Zm9vYg==" | BASE64("foob") = "Zm9vYg==" | |||
skipping to change at page 13, line 28 | skipping to change at page 14, line 28 | |||
BASE32-HEX("foo") = "CPNMU===" | BASE32-HEX("foo") = "CPNMU===" | |||
BASE32-HEX("foob") = "CPNMUOG=" | BASE32-HEX("foob") = "CPNMUOG=" | |||
BASE32-HEX("fooba") = "CPNMUOJ1" | BASE32-HEX("fooba") = "CPNMUOJ1" | |||
BASE32-HEX("foobar") = "CPNMUOJ1E8======" | BASE32-HEX("foobar") = "CPNMUOJ1E8======" | |||
BASE16("") = "" | BASE16("") = "" | |||
BASE16("f") = "GG" | BASE16("f") = "66" | |||
BASE16("fo") = "GGGP" | BASE16("fo") = "666F" | |||
BASE16("foo") = "GGGPGP" | BASE16("foo") = "666F6F" | |||
BASE16("foob") = "GGGPGPGC" | BASE16("foob") = "666F6F62" | |||
BASE16("fooba") = "GGGPGPGCGB" | BASE16("fooba") = "666F6F6261" | |||
BASE16("foobar") = "GGGPGPGCGBHC" | BASE16("foobar") = "666F6F626172" | |||
11. ISO C99 Implementation of Base64 | 11. ISO C99 Implementation Of Base64 | |||
Below is an ISO C99 implementation of Base64 encoding and decoding. | Below is an ISO C99 implementation of Base64 encoding and decoding. | |||
The code assumes that the US-ASCII characters are encoded inside | The code assumes that the US-ASCII characters are encoded inside | |||
'char' with values below 255, which holds for all POSIX platforms, | 'char' with values below 255, which holds for all POSIX platforms, | |||
but should otherwise be portable. This code is not intended as a | but should otherwise be portable. This code is not intended as a | |||
normative specification of base64. | normative specification of base64. | |||
11.1. Prototypes: base64.h | 11.1. Prototypes: base64.h | |||
/* base64.h -- Encode binary data using printable characters. | /* base64.h -- Encode binary data using printable characters. | |||
skipping to change at page 21, line 31 | skipping to change at page 22, line 31 | |||
} | } | |||
/* Decode base64 encoded input array IN of length INLEN to | /* Decode base64 encoded input array IN of length INLEN to | |||
output array OUT that can hold *OUTLEN bytes. Return | output array OUT that can hold *OUTLEN bytes. Return | |||
true if decoding was successful, i.e. if the input was | true if decoding was successful, i.e. if the input was | |||
valid base64 data, false otherwise. If *OUTLEN is too | valid base64 data, false otherwise. If *OUTLEN is too | |||
small, as many bytes as possible will be written to OUT. | small, as many bytes as possible will be written to OUT. | |||
On return, *OUTLEN holds the length of decoded bytes in | On return, *OUTLEN holds the length of decoded bytes in | |||
OUT. Note that as soon as any non-alphabet characters | OUT. Note that as soon as any non-alphabet characters | |||
are encountered, decoding is stopped and false is | are encountered, decoding is stopped and false is | |||
returned. */ | returned. This means that, when applicable, you must | |||
remove any line terminators that are part of the data | ||||
stream before calling this function. */ | ||||
bool | bool | |||
base64_decode (const char *restrict in, size_t inlen, | base64_decode (const char *restrict in, size_t inlen, | |||
char *restrict out, size_t *outlen) | char *restrict out, size_t *outlen) | |||
{ | { | |||
size_t outleft = *outlen; | size_t outleft = *outlen; | |||
while (inlen >= 2) | while (inlen >= 2) | |||
{ | { | |||
if (!isbase64 (in[0]) || !isbase64 (in[1])) | if (!isbase64 (in[0]) || !isbase64 (in[1])) | |||
break; | break; | |||
skipping to change at page 24, line 33 | skipping to change at page 25, line 35 | |||
when, e.g., a user reports details of a network protocol exchange | when, e.g., a user reports details of a network protocol exchange | |||
(perhaps to illustrate some other problem) and accidentally reveals | (perhaps to illustrate some other problem) and accidentally reveals | |||
the password because she is unaware that the base encoding does not | the password because she is unaware that the base encoding does not | |||
protect the password. | protect the password. | |||
Base encoding adds no entropy to the plaintext, but it does increase | Base encoding adds no entropy to the plaintext, but it does increase | |||
the amount of plaintext available and provides a signature for | the amount of plaintext available and provides a signature for | |||
cryptanalysis in the form of a characteristic probability | cryptanalysis in the form of a characteristic probability | |||
distribution. | distribution. | |||
13. Changes since RFC 3548 | 13. Changes Since RFC 3548 | |||
Added the "base32 extended hex alphabet", needed to preserve sort | Added the "base32 extended hex alphabet", needed to preserve sort | |||
order of encoded data. | order of encoded data. | |||
Reference IMAP for the special Base64 encoding used there. | Reference IMAP for the special Base64 encoding used there. | |||
Fix the example copied from RFC 2440. | Fix the example copied from RFC 2440. | |||
Add security consideration about providing a signature for | Add security consideration about providing a signature for | |||
cryptanalysis. | cryptanalysis. | |||
Add test vectors and C99 implementation. | Add test vectors and C99 implementation. | |||
Typo fixes. | Typo fixes. | |||
14. Acknowledgements | 14. Acknowledgements | |||
Several people offered comments and/or suggestions, including John E. | Several people offered comments and/or suggestions, including John E. | |||
Hadstate, Tony Hansen, Gordon Mohr, John Myers, Chris Newman and | Hadstate, Tony Hansen, Gordon Mohr, John Myers, Chris Newman and | |||
Andrew Sieber. Text used in this document is based on earlier RFCs | Andrew Sieber. Text used in this document is based on earlier RFCs | |||
describing specific uses of various base encodings. The author | describing specific uses of various base encodings. The author | |||
acknowledges the RSA Laboratories for supporting the work that led to | acknowledges the RSA Laboratories for supporting the work that led to | |||
this document. | this document. | |||
This revised version is based in part on comments and/or suggestions | This revised version is based in part on comments and/or suggestions | |||
made by Roy Arends, Ted Hardie, Per Hygum, Jelte Jansen, Clement | made by Roy Arends, Eric Blake, Elwyn Davies, Ted Hardie, Per Hygum, | |||
Kent, Paul Kwiatkowski, and Ben Laurie. | Jelte Jansen, Clement Kent, Paul Kwiatkowski, and Ben Laurie. | |||
15. Copying conditions | 15. Copying Conditions | |||
Copyright (c) 2000-2006 Simon Josefsson | ||||
Regarding the abstract and sections 1, 3, 8, 10, 12, 13, and 14 of | Regarding the abstract and sections 1, 3, 8, 10, 12, 13, and 14 of | |||
this document, that were written by Simon Josefsson ("the author", | this document, that were written by Simon Josefsson ("the author", | |||
for the remainder of this section), the author makes no guarantees | for the remainder of this section), the author makes no guarantees | |||
and is not responsible for any damage resulting from its use. The | and is not responsible for any damage resulting from its use. The | |||
author grants irrevocable permission to anyone to use, modify, and | author grants irrevocable permission to anyone to use, modify, and | |||
distribute it in any way that does not diminish the rights of anyone | distribute it in any way that does not diminish the rights of anyone | |||
else to use, modify, and distribute it, provided that redistributed | else to use, modify, and distribute it, provided that redistributed | |||
derivative works do not contain misleading author or version | derivative works do not contain misleading author or version | |||
information. Derivative works need not be licensed under similar | information. Derivative works need not be licensed under similar | |||
terms. | terms. | |||
16. References | 16. References | |||
16.1. Normative References | 16.1. Normative References | |||
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement | [1] Cerf, V., "ASCII format for network interchange", RFC 20, | |||
October 1969. | ||||
[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement | ||||
Levels", BCP 14, RFC 2119, March 1997. | Levels", BCP 14, RFC 2119, March 1997. | |||
16.2. Informative References | 16.2. Informative References | |||
[2] Cerf, V., "ASCII format for network interchange", RFC 20, | ||||
October 1969. | ||||
[3] Linn, J., "Privacy Enhancement for Internet Electronic Mail: | [3] Linn, J., "Privacy Enhancement for Internet Electronic Mail: | |||
Part I: Message Encryption and Authentication Procedures", | Part I: Message Encryption and Authentication Procedures", | |||
RFC 1421, February 1993. | RFC 1421, February 1993. | |||
[4] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [4] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part One: Format of Internet Message Bodies", | Extensions (MIME) Part One: Format of Internet Message Bodies", | |||
RFC 2045, November 1996. | RFC 2045, November 1996. | |||
[5] Callas, J., Donnerhacke, L., Finney, H., and R. Thayer, | [5] Callas, J., Donnerhacke, L., Finney, H., and R. Thayer, | |||
"OpenPGP Message Format", RFC 2440, November 1998. | "OpenPGP Message Format", RFC 2440, November 1998. | |||
[6] Eastlake, D., "Domain Name System Security Extensions", | [6] Eastlake, D., "Domain Name System Security Extensions", | |||
RFC 2535, March 1999. | RFC 2535, March 1999. | |||
[7] Klyne, G. and L. Masinter, "Identifying Composite Media | [7] Klyne, G. and L. Masinter, "Identifying Composite Media | |||
Features", RFC 2938, September 2000. | Features", RFC 2938, September 2000. | |||
[8] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION | [8] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION | |||
4rev1", RFC 3501, March 2003. | 4rev1", RFC 3501, March 2003. | |||
[9] Myers, J., "SASL GSSAPI mechanisms", Work in | [9] Laurie, B., "DNSSEC Hash Authenticated Denial of Existence", | |||
draft-ietf-dnsext-nsec3-04 (work in progress), March 2006. | ||||
[10] Myers, J., "SASL GSSAPI mechanisms", Work in | ||||
progress draft-ietf-cat-sasl-gssapi-01, May 2000. | progress draft-ietf-cat-sasl-gssapi-01, May 2000. | |||
[10] Wilcox-O'Hearn, B., "Post to P2P-hackers mailing list", World | [11] Wilcox-O'Hearn, B., "Post to P2P-hackers mailing list", World | |||
Wide Web http://zgp.org/pipermail/p2p-hackers/2001-September/ | Wide Web http://zgp.org/pipermail/p2p-hackers/2001-September/ | |||
000315.html, September 2001. | 000315.html, September 2001. | |||
Author's Address | Author's Address | |||
Simon Josefsson | Simon Josefsson | |||
SJD | ||||
Email: simon@josefsson.org | Email: simon@josefsson.org | |||
Intellectual Property Statement | Intellectual Property Statement | |||
The IETF takes no position regarding the validity or scope of any | The IETF takes no position regarding the validity or scope of any | |||
Intellectual Property Rights or other rights that might be claimed to | Intellectual Property Rights or other rights that might be claimed to | |||
pertain to the implementation or use of the technology described in | pertain to the implementation or use of the technology described in | |||
this document or the extent to which any license under such rights | this document or the extent to which any license under such rights | |||
might or might not be available; nor does it represent that it has | might or might not be available; nor does it represent that it has | |||
End of changes. 49 change blocks. | ||||
108 lines changed or deleted | 130 lines changed or added | |||
This html diff was produced by rfcdiff 1.29, available from http://www.levkowetz.com/ietf/tools/rfcdiff/ |