Regular Expression Help
#1
Hi,
I know I should be able to answer this myself, but I can't seem to get the hang of regular expression syntax.

What I want to do is fairly simple: extract the variable text between two regular bits of text:

e.g.

Quote:Name: John Doe
MRN: 12345678
exam: xray

I want to be able to extract John Doe and place in a string e.g. ptname
extract MRN and place in str mrn.
These are separated by lines.

It seems like I should be able to grab all the characters between the two words e.g. Name and MRN and then trim the extra off, but I can't get it to work.
Any suggestions?

Thanks,
Stuart
#2
Example

Code:
str s=
;Name: John Doe
;;;MRN: 12345678
;;;exam: xray


str s2 s3
if(findrx(s "(?m)^ *Name: *([^[]]+)" 0 0 s2 1)<0) ret
out s2
if(findrx(s "(?m)^ *MRN: *([^[]]+)" 0 0 s3 1)<0) ret
out s3

Explanation

(?m) - sets ^ and $ to match beginning and end of a line instead of whole string.
" *" - 0 or more spaces (the * applies to the space before it).
() - store the enclosed part into s2.
[^[]]+ - 1 or more any characters except new line characters. Note: [] is QM escape sequence, not a part of regex syntax. Using regex syntax it would be [^\r\n]+.
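For readers who want to try the pattern outside QM, here is a minimal sketch of the same idea in Python's re module. It is not QM code: the helper name field and the sample text are made up for illustration, and the QM-only [] escape is written in plain regex syntax as [^\r\n]+.

```python
import re

# Sample text shaped like the one in the question (CRLF line endings assumed).
text = "Name: John Doe\r\n   MRN: 12345678\r\n   exam: xray\r\n"

def field(label, s):
    """Return the text after 'label:' on its line, or None if not found.

    Hypothetical helper for illustration only.
    """
    # (?m) makes ^ match at the start of every line.
    # " *" skips optional spaces; ([^\r\n]+) captures the rest of the line.
    m = re.search(r"(?m)^ *" + re.escape(label) + r": *([^\r\n]+)", s)
    return m.group(1) if m else None

ptname = field("Name", text)
mrn = field("MRN", text)
print(ptname)
print(mrn)
```

The captured group plays the role of s2 and s3 in the QM example: only the part in parentheses is returned, not the label.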
#3
Thank you,

This is one of the last few things holding me back before a wide deployment. If this works, I will need to send you a few (many) more registration checks very soon!
Stuart
#4
Hi Gintaras,

From the string above, some of the material I need to extract spans multiple lines:

eg



Quote:Name: John Doe
MRN: 12345678
exam: xray
BeginShortReport
This is the text of the report
This is line 2 of the multiline report

This is the 4th line of the report which comes after an empty line, etc
EndShortReport

I would like to get the text between BeginShortReport and EndShortReport.
I have tried the following but with no luck:

Code:
str longreport shortreport
findrx(longreport "(?<=BeginShortReport) (?=EndShortReport)" 0 8 shortreport2 1)

I know I must enter something that says get any character between BeginShortReport and EndShortReport but I can't figure it out.

Thanks for any help,

Stuart
#5
Use option (?s), which tells . to match new line characters too. Also, if you want to get the whole match (not an enclosed part), use 0, not 1.

Code:
findrx(longreport "(?s)(?<=BeginShortReport).+(?=EndShortReport)" 0 8 shortreport 0)
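The same pattern can be checked outside QM. Below is a minimal sketch in Python's re module, not QM code; the sample string is copied from the question with plain \n line endings assumed, and the final strip() is an extra step added here to drop the line breaks that sit right after BeginShortReport and right before EndShortReport.

```python
import re

# Sample report text from the question (line endings assumed to be \n).
longreport = (
    "Name: John Doe\n"
    "MRN: 12345678\n"
    "exam: xray\n"
    "BeginShortReport\n"
    "This is the text of the report\n"
    "This is line 2 of the multiline report\n"
    "\n"
    "This is the 4th line of the report which comes after an empty line, etc\n"
    "EndShortReport\n"
)

# (?s) lets . match new line characters, so .+ can span several lines.
# The lookbehind and lookahead anchor the match without including the markers.
m = re.search(r"(?s)(?<=BeginShortReport).+(?=EndShortReport)", longreport)

# group(0) is the whole match; strip() removes the surrounding line breaks.
shortreport = m.group(0).strip() if m else ""
print(shortreport)
```

Note that .+ is greedy: if the text contains more than one EndShortReport, the match runs to the last one. Using .+? instead stops at the first.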

