Is the IStringMap interface the best way to find a string in a list?
#1
Hi Gintaras, Hi all,

I need to test a string variable against a large number of other strings (more than 50).

SelStr is what I used, but it's not convenient when there are many strings to test.
A loop using foreach is what I used next, but it seems too long.
IStringMap is also a possibility, judging from the QM examples I am using now.

My best bet would be something like SelStr, but with the tested strings in an array filled from a text file.

How to achieve something like:

str test2.getfile("$desktop$\test.txt")
_s="somethingtotest"
int result=SelStr(17 _s test2)
if(result!=0)
action...

Does it exist?

What would you advise as the best way to handle this?

thanks
#2
IStringMap is exactly for this purpose.

Macro Macro1950
Code:
str test2.getfile("$desktop$\test.txt")
IStringMap m=CreateStringMap(1|2)
m.AddList(test2)

_s="somethingtotest"
if m.Get(_s)
,out "found"
#3
QM help says:

"If you want to add simple list of strings (keys without values), you can use "[]" as sep to avoid breaking strings into keys and values."

If I add the above text in test.txt and use sep="[]"

Macro Macro289
Code:
str test2.getfile("$desktop$\test.txt")
IStringMap m=CreateStringMap(1|2)

m.AddList(test2 "[]")

_s="you"
if m.Get(_s)
,out "found"
else out "not found"

With _s="you" the result is "not found".

How can I find any word in a large text?
#4
The file must be multiline, with one string per line:

If
you
want
...

Do you need string index in the list?
What separator is used in your list?
#5
I misunderstood the use of IStringMap.

Anyway, thanks for your prompt reply. Much appreciated.
#6
OK, so I was right to use IStringMap and will stick with it.

Still, I'd like to use sel to act on the matching string.

How do I get the index of the found string in IStringMap? I did not find it in the QM help...
#7
IStringMap sorts strings, because searching in a sorted list is much faster. Indices are lost.

What is the list format?

string1
string2
...

or

string1, value1
string2, value2
...

or

string1 string2...

or...
#8
the first one, only a list of words...

one
two
three
four

etc etc
#9
foreach is fast.

Function SelStrInList
Code:
;/
function# $sList $sFind [flags] ;;flags: 1 case insens

;Returns 1-based index of string in a list, or 0 if not found.

;sList - multiline list of single-line strings.
;sFind - string to find.


int i(1) ins(flags&1)
str s
foreach s sList
,if(!StrCompare(s sFind ins)) ret i
,i+1
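
For readers who do not use QM, the same linear scan can be sketched in Python. This is not QM code and is only an illustration: `sel_str_in_list` is a hypothetical name that mirrors SelStrInList above, keeping the 1-based result (0 when not found) and treating flag 1 as "case insensitive".

```python
def sel_str_in_list(s_list, s_find, flags=0):
    """Return the 1-based index of s_find in a multiline list, or 0 if absent.

    flags & 1 requests a case-insensitive comparison (assumption: this
    mirrors the meaning of flag 1 in the QM function above).
    """
    insensitive = bool(flags & 1)
    target = s_find.lower() if insensitive else s_find
    # Walk the list line by line, exactly like the foreach loop.
    for i, line in enumerate(s_list.splitlines(), start=1):
        candidate = line.lower() if insensitive else line
        if candidate == target:
            return i
    return 0
```

The cost grows with the length of the list, since every lookup may have to read all lines before it can return 0.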
#10
Actually, I tried that way, but ditched it for IStringMap because I thought it would be faster and more lightweight.

Eventually, the list to be matched will contain up to 800-1000 strings.

Which method is most suitable for searching a list of that size?
#11
The average foreach time with 1000 strings is 100 microseconds on a 1.6 GHz CPU.
IStringMap is faster (if created once and searched many times, and if the number of strings is big enough), but it cannot be used if you need the string index.
#12
OK, my i7 laptop with 8 GB of RAM will do fine then.

I'll use foreach if the index is needed,
and IStringMap if only matching is needed, without the index.

Does that seem fair to you?
#13
I was thinking about creating a global array at QM launch, so it would be available for all searches.

For the index in IStringMap, I was thinking about

using a list like this (in test.txt):

"wordone" "1"
"wordtwo" "2"
"wordthree" "3"
and so on.

Not much work.

Would this code work, and/or does it make sense?

str test2.getfile("$desktop$\test.txt")
IStringMap m=CreateStringMap(1|2)
m.AddList(test2)

_s="somethingtotest"
int index=m.Get(_s)
out index
Sel index
case 1: action1
case 2: action2
case else end
Is this code right?
#14
Macro Macro1996
Code:
out

;create list of random strings for testing
str sList sFind
rep 1000
,sFind.RandomString(10 10 "a-z")
,sList.addline(sFind)
;out sList

;create map. Add indices now. Time 1100 us.
IStringMap m=CreateStringMap(1|2)
int i=1; str s
foreach(s sList) m.IntAdd(s i); i+1

;find string and get its index. Time 2 us.
if(!m.IntGet(sFind i)) i=0
out i
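
The idea in the macro above (pay once to build a map from string to index, then look up quickly) can be sketched in Python with a plain dictionary. This is not QM code. The names `build_index` and `find_index` are hypothetical, the comparison here is case-sensitive, and keeping the first index for duplicate strings is an assumption of this sketch, not a statement about how IStringMap behaves.

```python
def build_index(s_list):
    """Map each line of a multiline list to its 1-based position."""
    index = {}
    for i, line in enumerate(s_list.splitlines(), start=1):
        # Assumption: on duplicates, keep the position of the first occurrence.
        index.setdefault(line, i)
    return index


def find_index(index, s_find):
    """Return the stored 1-based position of s_find, or 0 if it is absent."""
    return index.get(s_find, 0)
```

Building the dictionary reads every line once; each later lookup is a single hash probe, which is why this pays off when the list is created once and searched many times.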
#15
Yep, much better, as usual. Thanks Gintaras
#16
OK, I cleaned up some code using the advice given in the last answers.

Now a new question arises:

I have _s containing some paths
_s=
c:\program files\myexe51.exe
c:\program files\myexe2.exe
c:\program files\myexe3.exe
c:\program files\myexe4.exe
c:\program files\myexe5.exe


and I have an array or another string variable containing words:

ARRAY(str) a1 or str s1="myexe1 mystuff anotherone somethingelse"

All the tricks so far (IStringMap, SelStr, StrCompare) work on whole strings only, not on substrings the way a find function does.

I'd like to build one string/array with the words from s1 or a1 that are matched in _s, and another with the unmatched words.

I tried many ways and found one that works using an intermediate array, but surely there must be a simpler way.

What I did:
foreach str'r _s
foreach str'rr s1
if(find(rr r)>-1) doaction
but I can't work out where to collect the unmatched words.

Thanks for help.
#17
Macro Macro1990
Code:
out

str s=
;c:\program files\myexe1.exe
;c:\program files\myexe2.exe
;c:\program files\myexe3.exe
;c:\program files\myexe4.exe
;c:\program files\anotherone.exe

str sw=
;myexe1
;mystuff
;anotherone
;somethingelse

ARRAY(str) am au

foreach _s sw
,if(findw(s _s)<0) au[]=_s
,else am[]=_s

out "<><Z 0x80c080>Matched</Z>"
out am
out "<><Z 0x80c080>Unmatched</Z>"
out au
#18
Damn, I knew it would be simple, but I deliberately did not try findw because the Help file says "find whole word" and I was searching for a substring. Fooled again.

The output color trick is wonderful, I will use it extensively...

Thanks Gintaras
#19
Would it also be possible to split s in your example into matched/unmatched lists?
#20
Macro Macro1989
Code:
out

str s=
;c:\program files\myexe1.exe
;c:\program files\myexe2.exe
;c:\program files\myexe3.exe
;c:\program files\myexe4.exe
;c:\program files\anotherone.exe

str sw=
;myexe1
;mystuff
;anotherone
;somethingelse

ARRAY(str) am au AM AU

foreach _s sw
,if(findw(s _s)<0) au[]=_s
,else am[]=_s

foreach _s s
,for(_i 0 am.len) if(findw(_s am[_i])>=0) break
,if(_i<am.len) AM[]=_s; else AU[]=_s

out "<><Z 0x80c080>Matched</Z>"
out am
out "<><Z 0x80c080>Unmatched</Z>"
out au
out "<><Z 0xc0c0>Matched</Z>"
out AM
out "<><Z 0xc0c0>Unmatched</Z>"
out AU
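
The two-pass split above (first the words, then the path lines) can also be sketched outside QM. The Python below is only an illustration: `has_word`, `split_words` and `split_lines` are hypothetical names, and the regular-expression word boundary `\b` is used as an approximation of findw's whole-word matching, which is an assumption rather than a description of how findw works internally.

```python
import re


def has_word(text, word):
    """True if word occurs in text bounded by non-word characters."""
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None


def split_words(paths, words):
    """Split words into those found somewhere in paths and those not found."""
    matched, unmatched = [], []
    for word in words:
        (matched if has_word(paths, word) else unmatched).append(word)
    return matched, unmatched


def split_lines(paths, matched_words):
    """Split path lines into those containing any matched word and the rest."""
    hit, miss = [], []
    for line in paths.splitlines():
        found = any(has_word(line, w) for w in matched_words)
        (hit if found else miss).append(line)
    return hit, miss


# Sample data copied from the macro above.
PATHS = "\n".join([
    r"c:\program files\myexe1.exe",
    r"c:\program files\myexe2.exe",
    r"c:\program files\myexe3.exe",
    r"c:\program files\myexe4.exe",
    r"c:\program files\anotherone.exe",
])
WORDS = ["myexe1", "mystuff", "anotherone", "somethingelse"]
```

Calling `split_words(PATHS, WORDS)` and then `split_lines(PATHS, matched)` gives the four lists that the macro stores in am, au, AM and AU.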
#21
Well, your code works of course, but not with my data.

In fact, the generic strings I gave as examples are not formatted the way I actually use them, so I think the problem comes from there.

I'll try to solve it on my own, and will post a new thread tomorrow with real code if I can't.

I'll sleep on it now...
#22
OK, back from work.

Indeed, I managed to adapt the code to work for me.

I had misused the tok function: I used a parsing string of the form
"c:\thepathto\myoprogram"*"somearguments", and it failed until I used
flag 4 in tok, removed the *, and used a space instead.

Thread closed.

