| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rb_enc_get_index.
(rb_enc_internal_set_index): extracted from rb_enc_associate_index
* include/ruby/encoding.h (ENCODING_SET): work over ENCODING_INLINE_MAX.
(ENCODING_GET): ditto.
(ENCODING_IS_ASCII8BIT): defined.
(ENCODING_CODERANGE_SET): defined.
* re.c (rb_reg_fixed_encoding_p): use ENCODING_IS_ASCII8BIT.
* string.c (rb_enc_str_buf_cat): use ENCODING_IS_ASCII8BIT.
* parse.y (reg_fragment_setenc_gen): use ENCODING_IS_ASCII8BIT.
* marshal.c (has_ivars): use ENCODING_IS_ASCII8BIT.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
for ASCII-8BIT regexp in non ASCII-8BIT script.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14911 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* string.c (coderange_scan): extracted from rb_enc_str_coderange.
(rb_enc_str_coderange): use coderange_scan.
(rb_str_shared_replace): copy encoding and coderange.
(rb_enc_str_buf_cat): new function for linear complexity string
accumulation with encoding.
(rb_str_sub_bang): don't conflict substituted part and replacement.
(str_gsub): use rb_enc_str_buf_cat.
(rb_str_clear): clear coderange.
* re.c (rb_reg_regsub): use rb_enc_str_buf_cat.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14910 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
encoding is EUC-JP.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14899 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1.8. [ruby-core:14583]
* include/ruby/intern.h, re.c (rb_reg_new_str): renamed, and defines
HAVE_RB_REG_NEW_STR macro to tell if it is available.
* include/ruby/encoding.h (rb_enc_reg_new): added.
* insns.def (toregexp), marshal.c (r_object0): use rb_reg_new_str().
* re.c (rb_reg_regcomp, rb_reg_s_union): ditto.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14884 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
support invalid encoding.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14880 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* regexec.c: unset USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE
which is turned on since oniguruma 5.9.1.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14878 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14810 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
| |
changed.
* string.c (rb_str_sub_bang): keeps code-range as possible.
* string.c (str_gsub): adjusts code-range. [ruby-core:14566]
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14782 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14734 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14597 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
| |
ruby.c, transcode.c: rename rb_ascii_encoding. to
rb_ascii8bit_encoding. rb_ascii_encoding is ambiguous with
ASCII-8BIT and US-ASCII.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14504 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14475 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
rb_default_encoding().
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14443 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
encoding of the str is ASCII-8BIT.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14442 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
(REG_ENCODING_NONE): ditto.
(rb_char_to_option_kcode): return ARG_ENCODING_NONE for n.
(rb_reg_prepare_re): warn /ascii/n =~ "non-ascii".
(rb_reg_initialize): set REG_ENCODING_NONE from ARG_ENCODING_NONE.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14438 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
rb_enc_find("utf-8").
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14412 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
ASCII-8BIT encoding explicitly.
* re.c (rb_reg_prepare_re): add encoding name in the message.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14402 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14401 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
* re.c (reg_match_pos): pos adjustment should be based on
characters.
* test/ruby/test_m17n.rb (TestM17N::test_str_insert): test updated
to check negative offset behavior.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* string.c (rb_str_sub_bang): applied r14212 too.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
| |
literal is used.
(reg_named_capture_assign_gen): assign the result of named capture
into local variables.
[ruby-dev:32588]
* re.c: document the assignment by named captures.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14297 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
encoding option is specified for regexp literals with \u{}
escapes.
* string.c (rb_str_squeeze_bang): should squeeze multibyte
characters as well.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14275 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
it's done inside of re_reg_seach().
* string.c (rb_str_split_m): ditto.
* re.c (rb_reg_regsub): ditto.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
| |
of the regular expression.
* re.c (rb_reg_initialize): fix encoding of regular expression if
embedded string has its own encoding specified.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
to ASCII-8BIT unless both encodings are ASCII-8BIT.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
use capital letter for \xHH notation. [ruby-dev:32511]
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14202 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
* string.c (rb_str_sub_bang, str_gsub): should check and copy encoding
to be replaced.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
* include/ruby/encoding.h: follow the renaming.
* re.c: ditto.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14195 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
length of the returned character.
* include/ruby/encoding.h (rb_enc_get_ascii): add the argument.
* re.c (rb_reg_expr_str): modify rb_enc_get_ascii call.
(rb_reg_quote): ditto.
(rb_reg_regsub): ditto.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14190 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
operand. [ruby-cvs:21416]
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14180 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
| |
* re.c (rb_reg_match, rb_reg_match2, rb_reg_match_m): convert byte
offset to char index.
* string.c (rb_str_index): return byte offset. [ruby-dev:32472]
* string.c (rb_str_split_m): calculate in byte offset.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14171 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regerror.c (to_ascii): ditto.
(onig_snprintf_with_pattern): ditto.
(onig_snprintf_with_pattern): ditto.
* string.c (rb_str_inspect): ditto.
(rb_str_dump): ditto.
* parse.y (parser_yylex): ditto.
* ruby.c (proc_options): ditto.
* file.c (rb_f_test): ditto.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
| |
(rb_reg_named_captures): new method Regexp#named_captures
(match_regexp): new method MatchData#regexp.
(match_names): new method MatchData#names.
* lib/pp.rb (MatchData#pretty_print): show names of named captures.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14163 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14161 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14159 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
| |
name/number to an integer.
(match_offset): use match_backref_number.
(match_begin): ditto.
(match_end): ditto.
(name_to_backref_number): raise IndexError instead of RuntimeError.
(match_inspect): show capture index.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14158 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
| |
fragment.
* parse.y (regexp): invoke reg_fragment_check.
(reg_fragment_check): defined.
(reg_fragment_check_gen): defined.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14133 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(rb_enc_nth): don't check the return value of rb_enc_mbclen.
(rb_enc_strlen): ditto.
(rb_enc_precise_mbclen): return needmore(1) if e <= p.
(rb_enc_get_ascii): new function for extracting ASCII character.
* include/ruby/encoding.h (rb_enc_get_ascii): declared.
* include/ruby/regex.h (ismbchar): removed.
* re.c (rb_reg_expr_str): use rb_enc_get_ascii.
(unescape_escaped_nonascii): use rb_enc_precise_mbclen to determine
the termination of escaped non-ASCII character.
(unescape_nonascii): use rb_enc_precise_mbclen.
(rb_reg_quote): use rb_enc_get_ascii.
(rb_reg_regsub): use rb_enc_get_ascii.
* string.c (rb_str_reverse) don't check the return value of
rb_enc_mbclen.
(rb_str_split_m): don't call rb_enc_mbclen with e <= p.
* parse.y (is_identchar): use ISASCII.
(parser_ismbchar): removed.
(parser_precise_mbclen): new macro.
(parser_isascii): new macro.
(parser_tokadd_mbchar): use parser_precise_mbclen to check invalid
character precisely.
(parser_tokadd_string): use parser_isascii.
(parser_yylex): ditto.
(is_special_global_name): don't call is_identchar with e <= p.
(rb_enc_symname_p): ditto.
[ruby-dev:32455]
* ext/tk/sample/tkextlib/vu/canvSticker2.rb: remove coding cookie
because the encoding is not UTF-8. [ruby-dev:32475]
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14131 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14084 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14078 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* re.c (rb_reg_preprocess): new function for dynamic regexp with
\u{} such as Regexp.new("\\u{6666}").
(rb_reg_prepare_re): preprocess regexp for recompiling.
(read_escaped_byte): new function.
(unescape_escaped_nonascii): new function.
(append_utf8): new function.
(unescape_unicode_list): new function.
(unescape_unicode_bmp): new function.
(unescape_nonascii): new function.
(rb_reg_initialize): preprocess regexp.
* pack.c (rb_uv_to_utf8): renamed from uv_to_utf8.
* parse.y (STR_NEW3): take func instead of has8 and hasmb.
(parser_str_new): use default coderange mechanism except for regexp.
(parser_tokadd_utf8): copy regexp source as-is.
(parser_read_escape): UTF-8 stuff removed.
(parser_tokadd_escape): has8bit and hasmb removed.
(parser_tokadd_string): fix 8-bit single byte character with \u.
(parser_parse_string): has8bit and hasmb removed.
(parser_here_document): has8bit and hasmb removed.
(parser_yylex): call parser_tokadd_utf8 instead of read_escape for
UTF-8 character.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
rename ENC_CODERANGE_SINGLE to ENC_CODERANGE_7BIT.
rename ENC_CODERANGE_MULTI to ENC_CODERANGE_8BIT.
Because single byte 8bit character, such as Shift_JIS 1byte katakana,
is represented by ENC_CODERANGE_MULTI even if it is not multi byte.
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14027 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: http://svn.ruby-lang.org/repos/ruby/trunk@14021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|