path: root/transcode.c
Commit message  Author  Age  Files  Lines
* encoding.c (enc_init_db): moved to enc/encdb.c.
  Author: nobu  Age: 2008-04-07  Files: 1  Lines: -9/+0
  * transcode.c (init_transcoder_table): moved to enc/trans/transdb.c.
  * enc/depend (enc/encdb.o enc/trans/transdb.o): depend on corresponding headers.
  * common.mk (COMMONOBJS): moved transcode.o from OBJS
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_str_transcode_bang): set coderange.
  Author: naruse  Age: 2008-03-06  Files: 1  Lines: -13/+12
  * transcode.c (rb_str_transcode): use rb_str_transcode_bang.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15707 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Wed Mar 5 17:43:43 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2008-03-05  Files: 1  Lines: -9/+15
  * transcode.c (transcode_loop): Adjusted detection of invalid (ill-formed) UTF-8 sequences. Fixing potential security issue, see http://www.unicode.org/versions/Unicode5.1.0/#Notable_Changes.
  * test/ruby/test_transcode.rb: Added two tests for above fix.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
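As a rough illustration of what detecting ill-formed UTF-8 involves (this is not the code from transcode_loop, and the helper name utf8_wellformed_len is hypothetical), the sketch below checks a single sequence against the well-formed byte ranges of the Unicode Standard. It rejects overlong forms, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from transcode.c.
 * Returns the length (1..4) of the well-formed UTF-8 sequence that starts
 * at s, or 0 if the bytes are ill-formed or the buffer (n bytes) is too short. */
size_t utf8_wellformed_len(const unsigned char *s, size_t n)
{
    unsigned char lo = 0x80, hi = 0xBF;   /* allowed range for the second byte */
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] <= 0x7F) return 1;           /* ASCII */

    if (s[0] >= 0xC2 && s[0] <= 0xDF) {
        len = 2;
    } else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        len = 3;
        if (s[0] == 0xE0) lo = 0xA0;      /* excludes overlong 3-byte forms */
        else if (s[0] == 0xED) hi = 0x9F; /* excludes surrogates U+D800..U+DFFF */
    } else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        len = 4;
        if (s[0] == 0xF0) lo = 0x90;      /* excludes overlong 4-byte forms */
        else if (s[0] == 0xF4) hi = 0x8F; /* excludes values above U+10FFFF */
    } else {
        return 0;                         /* 0x80..0xC1 and 0xF5..0xFF never lead */
    }

    if (n < len) return 0;                /* truncated sequence */
    if (s[1] < lo || s[1] > hi) return 0;
    for (i = 2; i < len; i++)
        if (s[i] < 0x80 || s[i] > 0xBF) return 0;
    return len;
}
```

A caller can step through a buffer by adding the returned length to its position and treating a return value of 0 as an invalid sequence at that position.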
Thu Feb 21 17:15:15 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2008-02-21  Files: 1  Lines: -4/+29
  * transcode.c: Added basic support for passing options to String#encode via a hash. Currently only one option, with one value, is supported: invalid: :ignore (dropping invalid byte sequences instead of producing an error). Option naming is not yet stable!
  * test/ruby/test_transcode.rb: Added a single test for invalid: :ignore option. Not more tests because most data does not yet distinguish between INVALID and UNKNOWN.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/trans/japanese.c (rb_to_Windows_31J): to 'Windows-31J'.
  Author: naruse  Age: 2008-01-29  Files: 1  Lines: -23/+3
  * common.mk: add rules for transdb.h.
  * transcode.c (init_transcoder_table): use transdb.h.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15317 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Mon Jan 21 19:42:42 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2008-01-21  Files: 1  Lines: -0/+2
  * transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb: added UTF-32BE and UTF-32LE conversions.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15156 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (str_transcode): initialize transcoder in rb_transcoding. [ruby-dev:33234]
  Author: nobu  Age: 2008-01-21  Files: 1  Lines: -2/+2
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15153 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (str_transcode): initialize transcoder in rb_transcoding. [ruby-dev:33234]
  Author: nobu  Age: 2008-01-21  Files: 1  Lines: -0/+2
  * transcode_data.h (rb_transcoding): transcoder constified.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15152 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_loop, str_transcoding_resize): use unsigned char. [ruby-dev:33232]
  Author: nobu  Age: 2008-01-21  Files: 1  Lines: -21/+21
  * transcode_data.h (rb_transcoding, rb_transcoder): removed callback parameters.
  * enc/trans/japanese.c: ditto.
  * enc/trans/utf_16_32.c: parenthesized bit-or operands.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15150 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_dispatch): constified return value.
  Author: nobu  Age: 2008-01-20  Files: 1  Lines: -9/+9
  * transcode_data.h (rb_transcoding): include pointer to rb_transcoder and auxiliary data.
  * transcode_data.h (rb_transcoder): all callback functions should have their own parameters.
  * enc/trans/{japanese,single_byte}.c: constified.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Sun Jan 20 20:00:20 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2008-01-20  Files: 1  Lines: -0/+1
  * transcode.c, enc/trans/utf_16_32.c, test/ruby/test_transcode.rb: added UTF-16LE conversions.
  * fixed changelog for last commit
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15144 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Sun Jan 20 15:08:08 2008 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2008-01-20  Files: 1  Lines: -0/+13
  * enc/trans/utf_16_32.c: new file, currently implementing UTF-16BE conversions only.
  * test/ruby/test_transcode.rb: Added tests for UTF-16BE; made check_both_ways() use force_encoding differently.
  * transcode_data.h, transcode.c: Support for more conversion functions.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/intern.h (rb_str_tmp_new, rb_str_shared_replace): prototype moved.
  Author: nobu  Age: 2008-01-16  Files: 1  Lines: -3/+0
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@15072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* $Date$ keyword removed to avoid inclusion of locale dependent string.
  Author: akr  Age: 2008-01-06  Files: 1  Lines: -1/+0
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/encoding.h (rb_isascii): defined.
  Author: akr  Age: 2008-01-01  Files: 1  Lines: -2/+2
  (rb_isalnum): ditto.
  (rb_isalpha): ditto.
  (rb_isblank): ditto.
  (rb_iscntrl): ditto.
  (rb_isdigit): ditto.
  (rb_isgraph): ditto.
  (rb_islower): ditto.
  (rb_isprint): ditto.
  (rb_ispunct): ditto.
  (rb_isspace): ditto.
  (rb_isupper): ditto.
  (rb_isxdigit): ditto.
  (rb_tolower): ditto.
  (rb_toupper): ditto.
  * include/ruby/st.h (st_strcasecmp): declared.
  (st_strncasecmp): ditto.
  * st.c (type_strcasehash): use st_strcasecmp instead of strcasecmp.
  (st_strcasecmp): defined.
  (st_strncasecmp): ditto.
  * include/ruby/ruby.h: include include/ruby/encoding.h.
  (ISASCII): use rb_isascii.
  (ISPRINT): use rb_isprint.
  (ISSPACE): use rb_isspace.
  (ISUPPER): use rb_isupper.
  (ISLOWER): use rb_islower.
  (ISALNUM): use rb_isalnum.
  (ISALPHA): use rb_isalpha.
  (ISDIGIT): use rb_isdigit.
  (ISXDIGIT): use rb_isxdigit.
  (TOUPPER): defined.
  (TOLOWER): ditto.
  (STRCASECMP): ditto.
  (STRNCASECMP): ditto.
  * dir.c, encoding.c, file.c, hash.c, process.c, ruby.c, time.c, transcode.c, ext/readline/readline.c: use locale insensitive functions. [ruby-core:14662]
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14829 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
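To illustrate why locale-insensitive functions matter here: the C library's tolower() and strcasecmp() consult the current locale, so comparing ASCII names (encoding names, for instance) can behave differently depending on the environment. The sketch below is a minimal ASCII-only comparison in that spirit. The names ascii_tolower and ascii_strcasecmp are hypothetical; this is not the actual rb_tolower or st_strcasecmp implementation.

```c
/* Hypothetical helper: lowercases only 'A'..'Z'. All other values pass
 * through unchanged, whatever the current locale is. */
int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: case-insensitive comparison restricted to ASCII
 * letters. Returns 0 if equal, a negative value if a sorts before b,
 * and a positive value otherwise. */
int ascii_strcasecmp(const char *a, const char *b)
{
    for (;;) {
        int ca = ascii_tolower((unsigned char)*a++);
        int cb = ascii_tolower((unsigned char)*b++);
        if (ca != cb) return ca < cb ? -1 : 1;
        if (ca == 0) return 0;   /* both strings ended at the same position */
    }
}
```

With this approach, ascii_strcasecmp("UTF-8", "utf-8") compares equal without any call to setlocale() influencing the result.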
Fri Dec 28 01:55:04 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2007-12-28  Files: 1  Lines: -21/+9
  * transcode.c (transcode_dispatch): reverted some of the changes in r14746.
  * transcode.c, enc/trans/single_byte.c: Added conversions to/from US-ASCII and ASCII-8BIT (using data tables).
  * enc/trans/single_byte.c: Some spacing/ordering changes due to automatic data file generation.
  * transcode_data.h, transcode.c: Preliminary code for using micro-conversion functions.
  * test/ruby/test_transcode.rb: Added some tests for US-ASCII and ASCII-8BIT conversions.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_dispatch): allows transcoding from/to ASCII-8BIT.
  Author: nobu  Age: 2007-12-27  Files: 1  Lines: -11/+27
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14746 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* parse.y, transcode_data.h, transcode.c: change "illegal" to "invalid" in a context which isn't against a law.
  Author: akr  Age: 2007-12-27  Files: 1  Lines: -8/+8
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14735 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_dispatch): fix for multistep transcode.
  Author: nobu  Age: 2007-12-25  Files: 1  Lines: -2/+4
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* common.mk (COMMONOBJS): transcode_data_*.c moved under enc/trans.
  Author: nobu  Age: 2007-12-25  Files: 1  Lines: -151/+109
  * transcode_data.h (rb_transcoding, rb_transcoder): prefixed.
  * transcode.c (rb_register_transcoder, rb_declare_transcoder): split declaration and registration. [ruby-dev:32704]
  * transcode.c (transcode_dispatch): autoload pre-declared transcoder.
  * transcode.c (str_transcode): use rb_define_dummy_encoding().
  * transcode.c (Init_transcode): initialize transcoder tables.
  * enc/trans/single_byte.c, enc/trans/japanese.c: moved from top.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Tue Dec 25 12:32:32 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2007-12-25  Files: 1  Lines: -20/+20
  * transcode.c: Moving a static counter from inside register_transcoder() and register_functional_transcoder() to outside the functions, renaming from n to next_transcoder_position. Fixes 3) in [ruby-dev:32715].
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c: register_functional_transcoder() added.
  Author: naruse  Age: 2007-12-24  Files: 1  Lines: -18/+66
  (init_transcoder_table): register ISO-2022-JP.
  (str_transcode): add preprocessor and postprocessor.
  * transcode_data_japanese.c: add ISO-2022-JP support.
  * transcode_data.h: moved transcoder and transcoding definition from transcode.c.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14607 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Mon Dec 24 09:45:45 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2007-12-24  Files: 1  Lines: -68/+68
  * transcode.c, transcode_data_one_byte.c, transcode_data_japanese.c: added rb_ prefix to external data symbols.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14561 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/encoding.h, encoding.c, re.c, io.c, parse.y, numeric.c, ruby.c, transcode.c: rename rb_ascii_encoding to rb_ascii8bit_encoding. rb_ascii_encoding is ambiguous with ASCII-8BIT and US-ASCII.
  Author: akr  Age: 2007-12-22  Files: 1  Lines: -1/+1
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Sat Dec 22 15:45:45 2007 Martin Duerst <duerst@it.aoyama.ac.jp>
  Author: duerst  Age: 2007-12-22  Files: 1  Lines: -1/+13
  * transcode_data_one_byte: slightly optimized
  * transcode_data_japanese: new data file for EUC-JP and SHIFT_JIS (not yet optimized; tests to follow; data from http://nkf.sourceforge.jp/ucm/{SJIS|eucJP}-nkf.ucm)
  * common.mk, transcode.c: Adjusted for transcode_data_japanese
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14472 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* encoding.c (rb_ascii_encoding): renamed from previous rb_default_encoding().
  Author: matz  Age: 2007-12-21  Files: 1  Lines: -1/+1
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_str_transcode_bang): returns self if no conversion. [ruby-dev:32662]
  Author: nobu  Age: 2007-12-21  Files: 1  Lines: -1/+1
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14425 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (rb_str_transcode_bang, rb_str_transcode): set new encoding even if no conversion is done because of 7bit only. [ruby-dev:32591]
  Author: nobu  Age: 2007-12-18  Files: 1  Lines: -17/+27
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14293 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
for undefined conversions.
  Author: matz  Age: 2007-12-17  Files: 1  Lines: -2/+1
  * transcode_data_iso_8859.c: Changed from character constants ('\xC2') to integer constants (0xC2) for shorter files and better readability; eliminated duplicated tables; changed from -1 offset to actual UNDEF entry (not yet distinguishing UNDEF and ILLEGAL correctly).
  * test/ruby/test_transcode.rb: added a test for UNDEF conversion.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14251 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (str_transcode, transcode_dispatch): added two-step
  Author: matz  Age: 2007-12-15  Files: 1  Lines: -31/+55
  * transcode.c: some minor formatting fixes
  * transcode_data.h, transcode_data_iso_8859.c: Shortened extremely frequently used macros to shorten file length.
  * test/ruby/test_transcode.rb: Fixed name of test class; added setup method to ensure all necessary encodings exist; split tests into more test methods; added tests; fixed ordering of arguments in assert_equal to have expected result first.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14236 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (transcode_loop): get rid of SEGV at a sequence that cannot be converted.
  Author: nobu  Age: 2007-12-11  Files: 1  Lines: -15/+17
  * transcode.c (rb_str_transcode_bang): copy encoding. [ruby-dev:32532]
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14191 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c (str_transcode): allow non-registered encodings. [ruby-dev:32520]
  Author: nobu  Age: 2007-12-10  Files: 1  Lines: -7/+28
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14182 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (rb_str_tmp_new): creates hidden temporary buffer.
  Author: nobu  Age: 2007-12-10  Files: 1  Lines: -183/+102
  * transcode.c (transcoding): added a pointer to function to flush.
  * transcode.c (transcode_loop): do not use string internal. [ruby-dev:32512]
  * transcode.c (str_transcode): allow Encoding objects.
  * transcode_data.h (BYTE_LOOKUP): use actual struct name.
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14176 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode*.[ch], test/ruby/test_transcode.rb: set properties.
  Author: nobu  Age: 2007-12-10  Files: 1  Lines: -2/+2
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14175 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transcode.c: new file to provide encoding conversion features. code contributed by Martin Duerst.
  Author: matz  Age: 2007-12-10  Files: 1  Lines: -0/+438
  git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14172 b2dd03c8-39d4-4d8f-98ff-823fe69b080e