| field | value | date |
|---|---|---|
| author | Markus Roberts <Markus@reality.com> | 2010-10-12 16:38:59 -0700 |
| committer | Markus Roberts <Markus@reality.com> | 2010-10-13 16:49:53 -0700 |
| commit | 3c56705a95c945778674f9792a07b66b879cb48e | |
| tree | 76c0a5807b3ad64138d24b6f189b0dda60255bdf /lib | |
| parent | e232770baefc35abb71de6e2f28d053158e6dd45 | |
Fix for #4832 -- Making PSON handle arbitrary binary data
The PSON library needlessly assumed that the data to be transmitted was
well-formed Unicode. This made Latin-1 users (and anyone who needed to
serialize arbitrary binary data) sad. This patch goes some of the way toward
resolving the issues: it passes non-Unicode data through rather than failing,
adds tests, and cleans up a pernicious assumption about escape characters in
Ruby regular expressions not marked "n" (no-encoding).
Diffstat (limited to 'lib')
| -rw-r--r-- | lib/puppet/external/pson/pure/generator.rb | 17 |
1 files changed, 5 insertions, 12 deletions
diff --git a/lib/puppet/external/pson/pure/generator.rb b/lib/puppet/external/pson/pure/generator.rb
index ef8b36d31..4180be57d 100644
--- a/lib/puppet/external/pson/pure/generator.rb
+++ b/lib/puppet/external/pson/pure/generator.rb
@@ -63,22 +63,15 @@ module PSON
         end
       else
         def utf8_to_pson(string) # :nodoc:
-          string = string.gsub(/["\\\x0-\x1f]/) { MAP[$MATCH] }
-          string.gsub!(/(
-            (?:
+          string.
+            gsub(/["\\\x0-\x1f]/n) { MAP[$MATCH] }.
+            gsub(/((?:
               [\xc2-\xdf][\x80-\xbf] |
               [\xe0-\xef][\x80-\xbf]{2} |
               [\xf0-\xf4][\x80-\xbf]{3}
-            )+ |
-            [\x80-\xc1\xf5-\xff] # invalid
-          )/nx) { |c|
-            c.size == 1 and raise GeneratorError, "invalid utf8 byte: '#{c}'"
-            s = PSON::UTF8toUTF16.iconv(c).unpack('H*')[0]
-            s.gsub!(/.{4}/n, '\\\\u\&')
+            )+)/nx) { |c|
+              PSON::UTF8toUTF16.iconv(c).unpack('H*')[0].gsub(/.{4}/n, '\\\\u\&')
           }
-          string
-        rescue Iconv::Failure => e
-          raise GeneratorError, "Caught #{e.class}: #{e}"
         end
       end
       module_function :utf8_to_pson
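The pass-through behaviour that the patched `utf8_to_pson` aims for can be sketched in standalone Ruby. This is an illustration under stated assumptions, not the code from the commit: `escape_passthrough` and `ESCAPES` are hypothetical names, `ESCAPES` stands in for only a few entries of the library's `MAP` constant, and `String#encode` replaces `PSON::UTF8toUTF16.iconv` because Iconv is no longer part of Ruby's standard library. Passing a run through unchanged when it is not valid UTF-8 is also an assumption of this sketch.

```ruby
# Hypothetical stand-in for a few entries of the MAP constant in generator.rb.
ESCAPES = {
  '"'  => '\\"',
  '\\' => '\\\\',
  "\n" => '\\n',
  "\t" => '\\t'
}.freeze

# Sketch: escape quotes, backslashes, and control bytes; turn runs of
# multi-byte UTF-8 into \uXXXX escapes; leave every other byte untouched.
def escape_passthrough(string)
  # Work on raw bytes so the "n" (no-encoding) regexps below apply cleanly.
  bytes = string.dup.force_encoding(Encoding::BINARY)

  escaped = bytes.gsub(/["\\\x00-\x1f]/n) do |c|
    ESCAPES[c] || format('\\u%04x', c.ord)
  end

  escaped.gsub(/(?:
      [\xc2-\xdf][\x80-\xbf]    |
      [\xe0-\xef][\x80-\xbf]{2} |
      [\xf0-\xf4][\x80-\xbf]{3}
    )+/nx) do |run|
    utf8 = run.dup.force_encoding(Encoding::UTF_8)
    # Assumption of this sketch: runs that only look like UTF-8 pass through.
    next run unless utf8.valid_encoding?

    # UTF-16BE code units rendered as hex, four digits per \u escape.
    hex = utf8.encode(Encoding::UTF_16BE).unpack1('H*')
    hex.gsub(/.{4}/) { |unit| "\\u#{unit}" }
  end
end

puts escape_passthrough("caf\xC3\xA9".b)          # UTF-8 input is escaped
p    escape_passthrough("caf\xE9".b)              # Latin-1 byte passes through
```

Under this sketch a lone Latin-1 byte such as `\xE9` matches neither pattern, so it reaches the output unchanged instead of raising an error, which is the behaviour the commit message describes.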
