Escape Regex in Text
#1
Hi Gintaras,
Sometimes I insert text into a string that will be interpreted as a regular expression (e.g. with RichEditHighlight), so I want to make sure that any regex metacharacters in it are escaped and treated as literals. I wrote this function to do that:
Function EscapeRegexLiterals
Code:
function str&TextToRegexEscape
out "Pre Regex-Escaping: %s" TextToRegexEscape
str LiteralsPattern = "\\|\^|\$|\*|\+|\?|\.|\(|\)|\[|\]|\{|\}|\||\-"
ARRAY(CHARRANGE) arrLiterals
int i
if(findrx(TextToRegexEscape LiteralsPattern 0 4 arrLiterals)<0) out "does not match"; ret

str FirstCharacter.left(TextToRegexEscape 1)
out FirstCharacter
if findrx(FirstCharacter LiteralsPattern) > -1
,TextToRegexEscape.insert("\" 0);; have to start with this manually because of 0 order
,for i 0 arrLiterals.len
,,int offset(arrLiterals[0 i].cpMin) length(arrLiterals[0 i].cpMax-arrLiterals[0 i].cpMin)
,,int AdjustedOffset = offset + i
,,if i > 0
,,,TextToRegexEscape.insert("\" AdjustedOffset)
else
,for i 0 arrLiterals.len
,,int offset2(arrLiterals[0 i].cpMin) length2(arrLiterals[0 i].cpMax-arrLiterals[0 i].cpMin)
,,AdjustedOffset = offset2 + i
,,TextToRegexEscape.insert("\" AdjustedOffset)    
out "Post Regex-Escaping: %s" TextToRegexEscape

It is a bit inelegant in how it handles a literal in the first position and the growing offset.
Is there a way to replace them all at once (i.e. without having to worry about the progressively growing offsets)? I know this probably requires some replacerx/findrx wizardry.
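
To illustrate the "replace all at once" idea: a single substitution can insert every backslash in one pass, so no offsets need adjusting. This is only a sketch in Python, not QM code; `escape_regex_literals` is a hypothetical name, and the character class simply mirrors the alternatives listed in LiteralsPattern above.

```python
import re

# Same characters as the alternation in LiteralsPattern, written as one character class.
_SPECIAL = re.compile(r"[\\^$*+?.()\[\]{}|\-]")

def escape_regex_literals(text: str) -> str:
    """Prefix every regex metacharacter in text with a backslash, in one pass."""
    # Using a function as the replacement avoids any escaping rules that
    # apply to replacement template strings.
    return _SPECIAL.sub(lambda m: "\\" + m.group(0), text)

if __name__ == "__main__":
    print(escape_regex_literals("*** sample text (1+1)"))
```

Because the engine builds the result string itself, a match in the first position needs no special handling either.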

Also, even with this function working properly, RichEditHighlight keeps failing if the text contains, for example, ***, even though I have set flag 4 (regex).

Function RichEditHighlight
Code:
;/
function hwndre ~findthis [flags] [color] [textcolor] ;;flags: 1 insens, 2 word, 4 regex, 128 bold, 0x100 version 1

;Highlights all occurrences of findthis in a rich edit control. Works in any application.
;hwndre - rich edit control handle.
;findthis - text or regular expression to find.
;flags - combination of these values:
;;;1 - case insensitive
;;;2 - whole word
;;;4 - findthis is regular expression
;;;128 - make bold
;;;0x100 - rich edit control version 1. Version 1 class usually is RICHEDIT. For other versions, class is like RichEdit20A.
;color - highlight color. Not used for version 1 controls.
;textcolor - text color.

;EXAMPLE
;str rx="\bfind this\b"
;int n=RichEditHighlight(id(59648 "WordPad") rx 1|4|128 ColorFromRGB(255 255 128) ColorFromRGB(255 0 0))
;out "Found %i instances" n


def CFM_COLOR 0x40000000
def CFM_BACKCOLOR 0x4000000
def CFM_BOLD 0x1
def CFE_BOLD 0x1
def SCF_SELECTION 1
def EM_GETTEXTMODE (WM_USER + 90)
type CHARFORMATA cbSize dwMask dwEffects yHeight yOffset crTextColor !bCharSet !bPitchAndFamily !szFaceName[LF_FACESIZE]
type CHARFORMAT2A :CHARFORMATA'v1 @wWeight @sSpacing crBackColor lcid dwReserved @sStyle @wKerning !bUnderlineType !bAnimation !bRevAuthor
type CHARFORMATW cbSize dwMask dwEffects yHeight yOffset crTextColor !bCharSet !bPitchAndFamily @szFaceName[LF_FACESIZE]
type CHARFORMAT2W :CHARFORMATW'v1 @wWeight @sSpacing crBackColor lcid dwReserved @sStyle @wKerning !bUnderlineType !bAnimation !bRevAuthor
dll user32 #IsWindowUnicode hWnd

ARRAY(CHARRANGE) a; int i
str s.getwintext(hwndre); if(!s.len) ret
if(flags&0x100=0) s.findreplace("[]" "[10]")

if(flags&4)
,if(findrx(s findthis 0 flags&3|4|8|16 a)<0) ret
else
,a.create(1 0)
,rep
,,if(flags&2) i=findw(s findthis i 0 flags&1)
,,else i=find(s findthis i flags&1)
,,if(i<0) break
,,CHARRANGE& cr=a[0 a.redim(-1)]
,,cr.cpMin=i; i+findthis.len; cr.cpMax=i
,if(!a.len) ret

int ver1=flags&0x100 and !SendMessage(hwndre EM_GETTEXTMODE 0 0)
memset(share 0 sizeof(CHARFORMAT2W))
if(IsWindowUnicode(hwndre))
,CHARFORMAT2W& cfw=+share
,if(ver1) cfw.cbSize=sizeof(CHARFORMATW)
,else cfw.cbSize=sizeof(CHARFORMAT2W); if(color) cfw.dwMask|CFM_BACKCOLOR; cfw.crBackColor=color
,if(textcolor) cfw.dwMask|CFM_COLOR; cfw.crTextColor=textcolor
,if(flags&128) cfw.dwMask|CFM_BOLD; cfw.dwEffects=CFE_BOLD
else
,CHARFORMAT2A& cfa=+share
,if(ver1) cfa.cbSize=sizeof(CHARFORMATA)
,else cfa.cbSize=sizeof(CHARFORMAT2A); if(color) cfa.dwMask|CFM_BACKCOLOR; cfa.crBackColor=color
,if(textcolor) cfa.dwMask|CFM_COLOR; cfa.crTextColor=textcolor
,if(flags&128) cfa.dwMask|CFM_BOLD; cfa.dwEffects=CFE_BOLD

for i 0 a.len
,SendMessage hwndre EM_SETSEL a[0 i].cpMin a[0 i].cpMax
,SendMessage hwndre EM_SETCHARFORMAT SCF_SELECTION share(hwndre)

SendMessage hwndre EM_SETSEL 0 0
ret a.len

I am getting this error in RichEditHighlight (not in the calling function):

Quote:Error (RT) in RichEditHighlight: error in pattern: nothing to repeat: ** sample text l
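
For context, "nothing to repeat" is the usual complaint when a quantifier such as * has no preceding item to apply to, which is what happens when *** reaches the regex engine unescaped. The sketch below reproduces this with Python's `re` module, which is an assumption made for illustration only: it is not the PCRE engine QM uses, although it reports the same kind of error. `compiles` is a hypothetical helper.

```python
import re

def compiles(pattern: str) -> str:
    """Return 'ok' if pattern compiles, otherwise the regex error message."""
    try:
        re.compile(pattern)
        return "ok"
    except re.error as exc:
        return str(exc)

if __name__ == "__main__":
    # Unescaped: the leading * has nothing before it to repeat.
    print(compiles("*** sample text"))
    # Escaped: every * is now a literal character.
    print(compiles(re.escape("*** sample text")))
```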

Any thoughts?
Thanks again,
S
#2
Why not use \Q and \E in the regular expression? Doesn't that do the same thing as the function?
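
As a rough sketch of the \Q \E approach: everything between \Q and \E is taken literally by PCRE-style engines, so the text only needs to be wrapped. The one case to watch is text that itself contains \E, which would end the quoted run early. The Python below only builds the pattern string (Python's own `re` module does not support \Q and \E), and `quote_for_pcre` is a hypothetical name.

```python
def quote_for_pcre(text: str) -> str:
    """Wrap text in \\Q...\\E so a PCRE-style engine treats it literally."""
    # A literal "\E" inside the text would close the quoted run early, so:
    # close the quote, emit an escaped backslash followed by E, reopen the quote.
    return "\\Q" + text.replace("\\E", "\\E\\\\E\\Q") + "\\E"

if __name__ == "__main__":
    print(quote_for_pcre("*** sample text"))
```

The resulting string would then be passed as findthis together with flag 4.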
#3
Whoops, just saw that!
Thanks!!!
Stuart

