RegEx - can't catch pattern
#1
Hi guys,

I am reading a file into a string.
In this file/string there are lines that I need to catch. They have this pattern:
Quote:
16.04.21;5;AB;123456;Client;2021;12345;Done;12345;Lorem ipsum dolor;0:12
;;blablabla;;;;;;;;
;;blablabla;;;;;;;;
;;blablabla;;;;;;;;
04.03.22;5;CD;123456;Client;2022;12345;Done;12345;Lorem ipsum dolor;0:10
;;blablabla;;;;;;;;
09.03.22;328;EF;332200;Client;11.02.22;12345;Done;12345;Lorem ipsum dolor.;0:10
;;blablabla;;;;;;;;
18.01.22;311;GH;331773;Client.;10.2022;12345;Done;12345;Lorem ipsum dolor;1:01
03.05.22;605;IJ;331230;Client.;01.2022;12345;Done;12345;Lorem ipsum dolor;0:16
;;blablabla;;;;;;;;
;;blablabla;;;;;;;;

These are the times for the tasks done for the different projects: how long did we need for each single task of a client's project? They need to be converted into another format.
The fields are separated by semicolons. This text file has to become a CSV.
First comes the date of the task in the format dd.mm.yy, then some further information. At the end of the line you find the time in h:mm. After that, the exporting program does something strange: if our employees added a text to their task, it inserts a line break (\r) and splits the text up. How many rows I get depends on the length of the text. Each of these rows starts with two semicolons and ends with 8 semicolons.
So I need this text in one column of the CSV: between two semicolons, with no line breaks and without 500 useless semicolons...
My idea is to do that in three steps:
1) Catch the first line and put a pipe symbol at the end.
2) Catch the end of all the "blablabla" lines: if there are 8 semicolons followed by a line break \r and a date, it is the end. Take those semicolons away and put another pipe there.
3) Take everything between two pipe symbols and delete all line breaks, and all semicolons if there is more than one in a row.
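
For comparison, the same result can probably be reached without pipe markers by walking the export line by line. The following is only a sketch in Python (not QM2), and it swaps the three regex replacements for a simple merge loop. The name merge_records is made up for illustration, and the sketch assumes that every record starts with a dd.mm.yy date followed by a semicolon and that the text fragments are joined without a separator, as in the desired output.

```python
import re

# Assumption: a record line starts with a dd.mm.yy date followed by ';'.
DATE_LINE = re.compile(r"^\d{2}\.\d{2}\.\d{2};")

def merge_records(raw: str) -> list[str]:
    """Join each record line with the text fragments that follow it."""
    records = []  # each entry: [record_line, [text fragments]]
    for line in raw.splitlines():  # handles \r, \n and \r\n
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if DATE_LINE.match(line):
            records.append([line, []])
        elif records:
            # Continuation row: drop the leading ';;' and the trailing ';' run.
            records[-1][1].append(line.strip(";"))
    merged = []
    for record_line, fragments in records:
        text = "".join(fragments)
        # Records without text get a single trailing ';'.
        merged.append(f"{record_line};{text};" if text else f"{record_line};")
    return merged
```

Calling "\r\n".join(merge_records(stuff)) on the file content would give one line per task, with the employee text as the last column.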

So I made a pattern and used it in the code. I tested it in RegexBuddy, but it doesn't work in QM2, and I don't know why.
Do you see what I am missing here?
 
Quote:
;; What I mean by "first line":
16.04.21;5;AB;123456;Client;2021;12345;Done;12345;Lorem ipsum dolor;0:12

;; How it should look at the end:
16.04.21;5;AB;123456;Client;2021;12345;Done;12345;Lorem ipsum dolor;0:12;blablablablablablablablabla;
04.03.22;5;CD;123456;Client;2022;12345;Done;12345;Lorem ipsum dolor;0:10;blablabla;
09.03.22;328;EF;332200;Client;11.02.22;12345;Done;12345;Lorem ipsum dolor.;0:10;blablabla;
18.01.22;311;GH;331773;Client.;10.2022;12345;Done;12345;Lorem ipsum dolor;1:01;
03.05.22;605;IJ;331230;Client.;01.2022;12345;Done;12345;Lorem ipsum dolor;0:16;blablablablablabla;

 
Code:
str stuff
stuff.getfile("G:\offene Zeiten.txt")

stuff.replacerx("(\b(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.][0-9]{2}\b;.*\d{6};.*;.*\d{5};.*;\d*:\d{2}|\d{2})\s+" "$1\|") ;; catches first line

stuff.replacerx(";{8}(\r[0-3][0-9]\.[0-1][0-9]\.[1-2][0-9];)" "\|$1" 8) ;; <= trying to catch last line but does not work!

stuff.replacerx("(?m)^\s+") ;; deletes empty lines

out stuff
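
One detail of the first pattern that may be worth checking: the "|\d{2}" near the end sits at the top level of group 1, so the whole group means "the full date line OR any two digits". The Python sketch below (not QM2) illustrates this. The GROUPED variant is only a guess at the intent, namely that the time is either h:mm or two digits, and first_match is a hypothetical helper.

```python
import re

# First pattern from the post, unchanged.
ORIGINAL = (r"(\b(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.][0-9]{2}\b;"
            r".*\d{6};.*;.*\d{5};.*;\d*:\d{2}|\d{2})\s+")

# Assumed intent: the alternation applies to the time field only.
GROUPED = (r"(\b(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.][0-9]{2}\b;"
           r".*\d{6};.*;.*\d{5};.*;(?:\d*:\d{2}|\d{2}))\s+")

def first_match(pattern: str, text: str):
    """Return group 1 of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

line = "16.04.21;5;AB;123456;Client;2021;12345;Done;12345;Lorem ipsum dolor;0:12\r\n"
print(first_match(ORIGINAL, line) == line.rstrip())  # full line is captured
print(first_match(ORIGINAL, "note 42 \n"))           # bare two digits also match
print(first_match(GROUPED, "note 42 \n"))            # no match once grouped
```

On the sample data this does not change the result, because only the date lines end in digits, but it could add stray pipes if a text row ever ends with a number.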

Thanks for any help and hints

Achim
#2
In QM strings, [12], [01] etc. are escape sequences, so the [ character itself needs an escape sequence.

The Text dialog can be used to get the correctly escaped string.

Another way is a raw string:
Code:
str rx1=
;(\b(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.][0-9]{2}\b;.*\d{6};.*;.*\d{5};.*;\d*:\d{2}|\d{2})\s+
stuff.replacerx(rx1 ...
#3
No, that works correctly. It is the next line that fails (I changed the code slightly):
Quote:
Code:
stuff.replacerx(";{8}(\r\n[0-3][0-9]\.[0-1][0-9]\.[1-2][0-9];)" "$1@OUTRO@$2")

What causes the issue is the leading ";" in the string. But that ";" has to be there, because in the file the pattern always starts with 8 semicolons: ";;;;;;;;".
#4
Maybe the content of stuff is not what you expect when you call it (after the first replacerx).

The second regex is working, but I don't know whether it is correct. I just tested this code, and it prints 3.

int n=stuff.replacerx(";{8}(\r\n[0-3][0-9]\.[0-1][0-9]\.[1-2][0-9];)" "$1@OUTRO@$2")
out n ;;3
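
A count of 3 is also what the sample data from the first post should give, assuming \r\n line breaks: three text rows are directly followed by a date line. The Python sketch below (not QM2, and Python's re module is not the same engine as the one in QM) counts the matches with the same pattern. It also shows that the pattern has only one capture group, so the "$2" in the replacement string has no group to refer to. The name mark_text_ends is made up for illustration.

```python
import re

# Second pattern from the thread, with \r\n as the line break.
END_OF_TEXT = re.compile(r";{8}(\r\n[0-3][0-9]\.[0-1][0-9]\.[1-2][0-9];)")

def mark_text_ends(data: str) -> tuple[str, int]:
    """Replace the 8 trailing semicolons before a date line with a pipe.

    Returns the new string and the number of replacements made.
    """
    return END_OF_TEXT.subn(r"|\1", data)

print(END_OF_TEXT.groups)  # number of capture groups in the pattern
```

The last text row of the file is never matched this way, because no date line follows it; it would need separate handling.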

