09-03-2014, 07:40 PM
Hi
I think I've found a bug in my understanding and was hoping for an upgrade.
I'm using findrx to find text in the window of another application, and then using Windows messages like EM_SETSEL, plus outp, to update the text in that application. It seems to work fine in ANSI mode, but I think I want it to run in Unicode mode (since this seems more general). Everything works until certain characters appear in the text (like an n with a tilde, ñ). When this happens, the selections are off by the number of 'special' characters that occur in the text (it is as if each of these characters counts as two). If the encodings were inconsistent between Unicode and ANSI this would make sense; maybe all I need to know is how to make them consistent.
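To make the "counts as two" symptom concrete, here is a small sketch in Python (not QM), purely as an illustration. It assumes, without confirmation from the QM documentation, that strings in Unicode mode are held as UTF-8 and that offsets are counted in bytes, while the target control counts UTF-16 code units. The sample text and the helper name `offset_of_marker` are made up for the example.

```python
# Illustration only (Python, not QM). Assumption: Unicode mode means UTF-8
# bytes on the script side, while the edit control counts UTF-16 code units.
text = "ññññ\nIMPRESSION:\nNodule in the left lung"
marker = "IMPRESSION:"


def offset_of_marker(encoding: str, unit_size: int) -> int:
    """Offset of the marker, measured in code units of the given encoding."""
    prefix = text[:text.index(marker)]
    return len(prefix.encode(encoding)) // unit_size


ansi_offset = offset_of_marker("cp1252", 1)      # one byte per character
utf8_offset = offset_of_marker("utf-8", 1)       # ñ takes two bytes
utf16_offset = offset_of_marker("utf-16-le", 2)  # ñ takes one code unit

print(ansi_offset, utf8_offset, utf16_offset)
```

If the assumption holds, the UTF-8 offset runs ahead of the other two by one for every ñ that precedes the match, which is the pattern described above.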
Here is a simple function that displays the text matching a regular expression from another application:
Function TestReplaceUnicodeAnsi
function'int int'hwndre str'findthis
str windowContents.getwintext(hwndre); if(!windowContents.len) ret -3
windowContents.findreplace("[]" "[10]")
str findString = findthis
ARRAY(CHARRANGE) a
int flag = 4
int isFound = findrx(windowContents findString 0 flag&3|4|8|32 a)
out F"isFound = {isFound}"
if (isFound)
,for _i 0 a.len
,,out F"a[{_i}] min ={a[_i 0].cpMin}, max={a[_i 0].cpMax}"
,,int nc = a[_i 0].cpMax - a[_i 0].cpMin
,,str substr.get(windowContents a[_i 0].cpMin nc)
,,out F"Selected string is '{substr}'"
and this function is invoked like this:
Macro RunTestReplaceUnicodeAnsi
int w=win("PowerScribe 360 | Reporting" "WindowsForms10.Window.8.app.0.3ce0bb8_r13_ad1")
int c=child("" "*.RICHEDIT50W.*" w 0x0 "wfName=rtbReport") ;;editable text
str findThis = "(?m)(?<=IMPRESSION:)\S?\s{{0,2}\w?"
int testResult = TestReplaceUnicodeAnsi(c findThis)
out F"TestResult is {testResult}"
If the original text contains this string:
Quote:
ññññ
IMPRESSION:
Nodule in the left lung
then the output in the console when Unicode is not selected in QM is:
Quote:
isFound = 1
a[0] min =1209, max=1211
Selected string is '
N'
TestResult is 0
which is what I want - the first character after "IMPRESSION:" and some whitespace.
If I run the same code after setting Unicode on in Tools->Options - I get
Quote:
Unicode:
isFound = 1
a[0] min =1213, max=1215
Selected string is '
N'
TestResult is 0
which is also fine - I get exactly the selected string I want. Since there are four 'ñ' characters, the offsets differ by 4. But when I want to update the text in the other window (say I want to highlight it), it is misaligned:
works just fine in the non-Unicode case - but if Unicode is enabled in Tools->Options the offsets are off.
returns true so I am assuming that I should be using Unicode.
My confused question: Is there a way to generate the offsets in Unicode mode so that, when I make a selection for pasting or highlighting, the offsets are consistent? Or do I have a bad mental model for how this works? Thanks.
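For what it's worth, one possible conversion can be sketched as follows, again in Python rather than QM and under the unconfirmed assumption that the Unicode-mode offsets are UTF-8 byte offsets while the edit control expects UTF-16 code units. The function names `utf8_to_utf16_offset` and `convert_range` are hypothetical, not QM or Windows APIs, and the sample text is shortened from the one quoted above.

```python
# Illustration only (Python, not QM). Hypothetical helpers that map a UTF-8
# byte range onto the UTF-16 code-unit range covering the same characters.

def utf8_to_utf16_offset(text_utf8: bytes, byte_offset: int) -> int:
    """Count the UTF-16 code units that precede a UTF-8 byte offset.

    The offset must fall on a character boundary, which holds for the
    start and end of a regex match.
    """
    prefix = text_utf8[:byte_offset].decode("utf-8")
    return len(prefix.encode("utf-16-le")) // 2


def convert_range(text_utf8: bytes, cp_min: int, cp_max: int) -> tuple[int, int]:
    """Convert a (cpMin, cpMax) pair from UTF-8 bytes to UTF-16 code units."""
    return (utf8_to_utf16_offset(text_utf8, cp_min),
            utf8_to_utf16_offset(text_utf8, cp_max))


sample = "ññññ\nIMPRESSION:\nNodule in the left lung".encode("utf-8")
start = sample.index(b"IMPRESSION:") + len(b"IMPRESSION:")  # byte offset 20
print(convert_range(sample, start, start + 2))
```

In this sample the byte range 20..22 maps to 16..18, that is, it shifts back by the four extra bytes contributed by the four ñ characters. Whether the same arithmetic applies to the real window depends on the assumption above being true.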