Fix for #4832 -- Making PSON handle arbitrary binary data

The PSON library needlessly assumed that the data to be transmitted was well-formed unicode. This made Latin-1 users (and anyone who needed to serialize arbitrary binary data) sad. This patch goes some of the way toward resolving the issue: it passes non-unicode data through rather than just failing, adds tests, and cleans up a pernicious assumption about escape characters in ruby regular expressions not marked "n" (no-encoding).
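
The pass-through behavior described above can be sketched in current Ruby as follows. This is an illustration only, not the patched code: ESCAPES, MULTIBYTE_RUN and escape_bytes_sketch are hypothetical names, ESCAPES is a simplified stand-in for the library's MAP table, String#encode stands in for PSON::UTF8toUTF16 (Iconv was removed from the standard library in Ruby 2.0), and the valid_encoding? guard is an extra assumption that is not part of the patch.

```ruby
# Hypothetical stand-in for the library's MAP table: every byte the first
# regexp can match gets an escape (quote, backslash, control characters).
ESCAPES = { '"' => '\\"', '\\' => '\\\\' }
(0x00..0x1f).each { |b| ESCAPES[b.chr] ||= format('\\u%04x', b) }

# Runs of byte sequences shaped like multibyte UTF-8. The "n" flag lets the
# \x escapes be read as raw bytes rather than as broken multibyte characters.
MULTIBYTE_RUN = /((?:
    [\xc2-\xdf][\x80-\xbf]    |
    [\xe0-\xef][\x80-\xbf]{2} |
    [\xf0-\xf4][\x80-\xbf]{3}
  )+)/nx

def escape_bytes_sketch(string)
  string.b.                                   # binary copy; input is untouched
    gsub(/["\\\x00-\x1f]/n) { |m| ESCAPES[m] }.
    gsub(MULTIBYTE_RUN) { |c|
      utf8 = c.dup.force_encoding(Encoding::UTF_8)
      # Assumption of this sketch: runs that match the pattern but are not
      # valid UTF-8 (for example encoded surrogates) pass through unchanged.
      next c unless utf8.valid_encoding?
      # Hex-dump the UTF-16BE bytes and prefix each 4-digit unit with \u.
      utf8.encode(Encoding::UTF_16BE).unpack('H*')[0].gsub(/.{4}/) { |h| "\\u#{h}" }
    }
end
```

Bytes that match neither pattern, such as a lone Latin-1 "\xE9", are returned as they came in instead of raising an error.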
author: Markus Roberts <Markus@reality.com> 2010-10-12 16:38:59 -0700
committer: Markus Roberts <Markus@reality.com> 2010-10-13 16:49:53 -0700
commit: 3c56705a95c945778674f9792a07b66b879cb48e (patch)
tree: 76c0a5807b3ad64138d24b6f189b0dda60255bdf /lib
parent: e232770baefc35abb71de6e2f28d053158e6dd45 (diff)
1 files changed, 5 insertions, 12 deletions
diff --git a/lib/puppet/external/pson/pure/generator.rb b/lib/puppet/external/pson/pure/generator.rb
index ef8b36d31..4180be57d 100644
--- a/lib/puppet/external/pson/pure/generator.rb
+++ b/lib/puppet/external/pson/pure/generator.rb
@@ -63,22 +63,15 @@ module PSON
     end
   else
     def utf8_to_pson(string) # :nodoc:
-      string = string.gsub(/["\\\x0-\x1f]/) { MAP[$MATCH] }
-      string.gsub!(/(
-        (?:
+      string.
+        gsub(/["\\\x0-\x1f]/n) { MAP[$MATCH] }.
+        gsub(/((?:
           [\xc2-\xdf][\x80-\xbf]    |
           [\xe0-\xef][\x80-\xbf]{2} |
           [\xf0-\xf4][\x80-\xbf]{3}
-            )+ |
-            [\x80-\xc1\xf5-\xff]       # invalid
-              )/nx) { |c|
-        c.size == 1 and raise GeneratorError, "invalid utf8 byte: '#{c}'"
-        s = PSON::UTF8toUTF16.iconv(c).unpack('H*')[0]
-        s.gsub!(/.{4}/n, '\\\\u\&')
+            )+)/nx) { |c|
+        PSON::UTF8toUTF16.iconv(c).unpack('H*')[0].gsub(/.{4}/n, '\\\\u\&')
       }
-      string
-    rescue Iconv::Failure => e
-      raise GeneratorError, "Caught #{e.class}: #{e}"
     end
   end
   module_function :utf8_to_pson