WoW:HOWTO: Speed up string match lookups
When you have a large number of patterns (dozens) to scan to find out which one matches a given string, there are a few things you can do to speed up the job. If the patterns are hard-coded, there are of course any number of ways to be clever. But often you do not know what the patterns look like beforehand, which is the case when you are matching input strings against patterns in [{{wdnlink|FrameXML/GlobalStrings.lua}} GlobalStrings.lua] using a formatstring-to-regex utility like [http://www.wowinterface.com/downloads/fileinfo.php?id=4809 BabbleLib's] Deformat() function.

The approach below works by building lists of the words used by the patterns, and then looking at the words in an input string to determine which list(s) to search for matches. The process is actually two-pass: the first pass counts how often each word occurs across all patterns, and the second pass files each pattern under its LEAST commonly used word only.

* Note: The example contains a ''very'' simplistic "MyDeformatterFunc()" for converting "%s" to "(.*)". It will not work for locales other than English. Please do not use it in the real world.

<div style="max-width: 80em; margin-right: 2em; height: 30em; overflow: scroll;">
 -- Written for Lua 5.1+: uses pairs() and string.gmatch
 -- (string.gmatch was called string.gfind in Lua 5.0).
 
 -- Functions that we want called for different string matches
 function RoughPokeFunc(v1,v2) print("RoughPokeFunc "..v1.." "..v2); end
 function SoftPokeFunc(v1,v2) print("SoftPokeFunc "..v1.." "..v2); end
 function SoftNudgeFunc(v1,v2) print("SoftNudgeFunc "..v1.." "..v2); end
 function ChickenFunc(v1,v2) print("ChickenFunc "..v1.." "..v2); end
 
 -- Strings to match mapped to functions that we want called
 MatchStrings = {
   ["%s roughly pokes %s"] = RoughPokeFunc,
   ["%s softly pokes %s"] = SoftPokeFunc,
   ["%s softly nudges %s"] = SoftNudgeFunc,
   ["%s gets nudged by %s and runs away screaming"] = ChickenFunc,
 }
 
 -- VERY simplistic deformatter function.
 -- You probably want a real deformatting library for this.
 function MyDeformatterFunc(str)
   return (string.gsub(str, "%%s", "(.*)"));
 end
 
 -- First run: count how many occurrences there are of each word
 WordCounts = {}
 for str,func in pairs(MatchStrings) do
   for word in string.gmatch(str, "[^ ]+") do
     if(string.find(word, "^%%")) then
       -- ignore format strings
     else
       WordCounts[word] = (WordCounts[word] or 0) + 1;
     end
   end
 end
 
 -- Second run: for each string, pick the least common word and place string in that hash bucket
 MatchStringsHash = {}
 for str,func in pairs(MatchStrings) do
   local bestword, num;
   for word in string.gmatch(str, "[^ ]+") do
     if(string.find(word, "^%%")) then
       -- ignore format strings
     else
       if(not num or WordCounts[word] < num) then
         num = WordCounts[word];
         bestword = word;
       end
     end
   end
   assert(bestword);
   if(not MatchStringsHash[bestword]) then
     MatchStringsHash[bestword] = {};
   end
   MatchStringsHash[bestword][MyDeformatterFunc(str)] = func;
 end
 
 WordCounts = nil; -- now we don't need the counts anymore
 
 -- Dump our MatchStringsHash on-screen so we can see what it looks like!
 print "Examining hash buckets"
 print "----------------------"
 for word,strings in pairs(MatchStringsHash) do
   print(" "..word..":");
   for str,func in pairs(strings) do
     print(" \""..str.."\"");
   end
 end
 
 -- Function that scans for matches and calls the resulting function
 function ScanForMatch(str)
   local bDone = false;
   local nCompares = 0;
   for word in string.gmatch(str, "[^ ]+") do
     if(MatchStringsHash[word]) then
       -- Only the patterns filed under this word are tried
       for pattern,func in pairs(MatchStringsHash[word]) do
         nCompares = nCompares + 1;
         local success,_,v1,v2,v3,v4 = string.find(str, pattern);
         if(success) then
           func(v1,v2,v3,v4);
           bDone=true;
           break;
         end
       end
     end
     if(bDone) then
       break;
     end
   end
   print(" \""..str.."\": "..nCompares.." string.finds actually executed\n");
 end
 
 print("");
 print("Executing!");
 print("----------");
 ScanForMatch("Alice roughly pokes Bob");
 ScanForMatch("Bob softly pokes Charles");
 ScanForMatch("Charles softly nudges Denise");
 ScanForMatch("Denise gets nudged by Eve and runs away screaming");
 ScanForMatch("This string does not exist");
</div>

Running the above produces the following output (the order of the hash buckets may vary, since table iteration order is not defined):

<div style="margin-right: 2em;">
 Examining hash buckets
 ----------------------
  roughly:
  "(.*) roughly pokes (.*)"
  nudges:
  "(.*) softly nudges (.*)"
  gets:
  "(.*) gets nudged by (.*) and runs away screaming"
  softly:
  "(.*) softly pokes (.*)"
 
 Executing!
 ----------
 RoughPokeFunc Alice Bob
  "Alice roughly pokes Bob": 1 string.finds actually executed
 
 SoftPokeFunc Bob Charles
  "Bob softly pokes Charles": 1 string.finds actually executed
 
 SoftNudgeFunc Charles Denise
  "Charles softly nudges Denise": 2 string.finds actually executed
 
 ChickenFunc Denise Eve
  "Denise gets nudged by Eve and runs away screaming": 1 string.finds actually executed
 
  "This string does not exist": 0 string.finds actually executed
</div>

== Problems with this approach ==

There is no guarantee as to the order in which the string matches will be attempted. For example, assume these two patterns:
#"%s hits %s."
#"%s hits %s hard."
Given the input string "Alice hits Bob.", only #1 will match, and all is good. But with the input string "Alice hits Bob hard.", there is NO guarantee which pattern will match. You can get #1 with the arguments "Alice", "Bob hard", or you can get #2 with the arguments "Alice", "Bob".

[[Category:HOWTOs|Speed up string match lookups]]
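
The ambiguity described under "Problems with this approach" can be reproduced in isolation. The following is only a minimal sketch: it reuses the same simplistic "%s" to "(.*)" conversion as the example (here in a hypothetical local helper named deformat), and the table name AmbiguityResults is made up for this illustration. It tries both format strings against the longer input and records what each one captures.

```lua
-- Hypothetical helper: same simplistic conversion as MyDeformatterFunc.
-- Note that the "." in the format string is left unescaped, so it matches any character.
local function deformat(fmt)
  return (string.gsub(fmt, "%%s", "(.*)"))
end

local input = "Alice hits Bob hard."
local formats = { "%s hits %s.", "%s hits %s hard." }

-- Illustrative global table: what each format string captures from the input
AmbiguityResults = {}
for i, fmt in ipairs(formats) do
  local start, _, v1, v2 = string.find(input, deformat(fmt))
  AmbiguityResults[i] = { matched = (start ~= nil), v1 = v1, v2 = v2 }
  print(fmt, tostring(v1), tostring(v2))
end
```

Both format strings match the input, but with different second arguments ("Bob hard" for the first, "Bob" for the second), so whichever one happens to be tried first decides what the handler function receives.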