goofansu

Reputation: 2277

How can I fetch whole strings that contain Chinese characters with Ruby?

For example, 1.txt

a = "攻击力
非常高"

b = "防御力"
c = "防御力是#{example}"
d = "xyz"

I want the result:

"攻击力
非常高"

"防御力"

"防御力是#{example}"

There should be no "xyz" in the result, because it contains no Chinese characters.

I tested /(\p{Han}+)/, but it is not what I want.

Thank you in advance.

Here is my example: regex example

Upvotes: 2

Views: 850

Answers (2)

steenslag

Reputation: 80085

Keeping the regex as simple as possible:

# encoding: utf-8
a = "攻击力
非常高"

b = "防御力"
c = "防御力是example"
d = "xyz"

puts [a,b,c,d].select{|str| str =~ /\p{Han}/ }
# 攻击力
# 非常高
# 防御力
# 防御力是example

Or, if everything is in one string:

# encoding: utf-8
a = "攻击力非常高
防御力
防御力是example
xyz"
puts a.lines.select{|line| line =~ /\p{Han}/ }.join

Upvotes: 1

Boris Strandjev

Reputation: 46953

This might help you: /([^[:ascii:]]+)/ is a regex that selects all non-ASCII symbols in the input. I tried it on your example and it selects only the Chinese characters.
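As a small illustration (the sample strings are my own, not taken from your file), here is how the non-ASCII class compares with \p{Han}. Keep in mind that [^[:ascii:]] matches any non-ASCII character, so it would also pick up accented Latin letters:

```ruby
# [^[:ascii:]] matches any character outside the ASCII range.
NON_ASCII = /[^[:ascii:]]+/
# \p{Han} matches only Han (Chinese) characters.
HAN = /\p{Han}+/

# Single quotes keep #{example} literal instead of interpolating it.
sample = '防御力是#{example}'

p sample.scan(NON_ASCII)   # => ["防御力是"]
p sample.scan(HAN)         # => ["防御力是"]

# An accented Latin letter is non-ASCII but not a Han character.
accented = "caf\u00e9"
p accented.scan(NON_ASCII) # => ["é"]
p accented.scan(HAN)       # => []
```

If the input can contain other non-ASCII text, \p{Han} is probably the safer class to use.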

The regex you are searching for is probably:

/("[^"]*[^[:ascii:]]+[^"]*")/

That is, if I understood correctly what you need.

What I do:

  • The string should start with a " character: "
  • Then have any number of non-" characters: [^"]*
  • Then at least one non-ASCII symbol: [^[:ascii:]]+
  • Then again any number of non-" characters: [^"]*
  • And it should end with a " character: "
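A minimal sketch of applying this regex with String#scan, assuming the contents of 1.txt have already been read into a string (the heredoc below is a stand-in for the file, and the names source and QUOTED_WITH_NON_ASCII are only for illustration). I dropped the capturing group so that scan returns plain strings rather than nested arrays:

```ruby
# Stand-in for the contents of 1.txt; the quoted heredoc tag ('TEXT')
# keeps #{example} literal instead of interpolating it.
source = <<~'TEXT'
  a = "攻击力
  非常高"

  b = "防御力"
  c = "防御力是#{example}"
  d = "xyz"
TEXT

# A double-quoted string that contains at least one non-ASCII character.
QUOTED_WITH_NON_ASCII = /"[^"]*[^[:ascii:]]+[^"]*"/

matches = source.scan(QUOTED_WITH_NON_ASCII)
puts matches
# "攻击力
# 非常高"
# "防御力"
# "防御力是#{example}"
```

This relies on non-ASCII characters appearing only inside the quotes. It also does not handle escaped quotes (\") inside a string.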

Upvotes: 2
