Advice on regexp - Non word characters
#1
I am sorry to ask about regexp again; I am still studying and learning this topic. The question: I need to replace non-word characters. Using \w failed, because it does not recognize Unicode characters in the Greek character set. I therefore worked around the problem with the following expression:

st.replacerx("\d|\,|\.|\\|\: |\(|\)|\_|\%|\$" " ")

I wonder whether there exists a simpler way to do it. Many thanks in advance.
#2
QM regular expressions cannot recognize Unicode word/non-word characters.
If all non-ASCII (Unicode) characters in your text are word characters, use this regular expression, which leaves every byte in the range \x80-\xff untouched:

Macro Macro272
Code:
str s="Liepa ąžoulas beržas"
s.replacerx("[^\w\x80-\xff]" "-")
out s
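
For readers who want to see why the byte range works, here is a minimal sketch in Python rather than QM. It assumes the text is stored as UTF-8 bytes, which is what the \x80-\xff range suggests; the function name replace_nonword_utf8 is made up for this illustration. Every byte of a multi-byte UTF-8 character falls in \x80-\xff, so the negated class only ever matches ASCII non-word characters.

```python
import re

def replace_nonword_utf8(text: str, repl: str = "-") -> str:
    """Replace ASCII non-word characters; leave all non-ASCII bytes alone.

    Hypothetical helper for illustration only, not part of QM.
    """
    data = text.encode("utf-8")
    # In a bytes pattern, \w covers only [a-zA-Z0-9_]. The range \x80-\xff
    # covers every byte of a multi-byte UTF-8 sequence, so those bytes are
    # excluded from the match and never replaced.
    cleaned = re.sub(rb"[^\w\x80-\xff]", repl.encode("utf-8"), data)
    return cleaned.decode("utf-8")

print(replace_nonword_utf8("Liepa ąžoulas beržas"))  # Liepa-ąžoulas-beržas
print(replace_nonword_utf8("beržas● (50%)"))         # the ● is kept
```

The second call shows the limitation: a non-ASCII symbol such as ● is also kept, because the pattern cannot tell letters from symbols above \x7f.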
#3
Another option: use the C# Regex class, whose \W is Unicode-aware.

Macro Macro274
Code:
str s="Liepa ąžoulas beržas●"
s=CsFunc("" s "\W" "-")
out s


#ret
using System.Text.RegularExpressions;

public static string RegexReplace(string s, string rx, string repl)
{
return Regex.Replace(s, rx, repl);
}
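
The same Unicode-aware behaviour can be sketched in Python, whose str patterns also treat \W as Unicode-aware by default. This is only an illustration, not QM or C# code, and both function names are made up. The second function is a guess at what the original question wanted: digits and the underscore count as word characters, so they are added to the class explicitly.

```python
import re

def replace_nonword_unicode(text: str, repl: str = "-") -> str:
    """Replace every character that is not a Unicode letter, digit or '_'."""
    return re.sub(r"\W", repl, text)

def keep_letters_only(text: str, repl: str = " ") -> str:
    """Also replace digits and underscores, leaving only letters."""
    # \W, \d and _ combined in one class: anything that is not a letter.
    return re.sub(r"[\W\d_]", repl, text)

print(replace_nonword_unicode("Liepa ąžoulas beržas●"))  # Liepa-ąžoulas-beržas-
print(keep_letters_only("αλφα, βητα (γαμμα) 50%").split())
```

Here the ● is replaced, unlike with the byte-range pattern, and the Greek words survive intact.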
#4
Thank you !

