Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
CSqlite search.
#1
I've avoided having to ask this question by trying to figure out as much as I could on my own, but I'm just not seeing the answer.

Some examples would be super helpful to help my education of working with a database.

Mainly, I need to search records...how do I get searches to return results even if there is a mispelling...or just partial. Is there any way to gauge the exactness of a match?

For Example:
The word in the database is "Gumption"
A user would be able to find that by searching "Gumpshun" "Gumpshen" or "Gumshen" then get the data for "Gumption"

Just some simple examples to point me in the right direction...I've done some research and I figure it has to do with "LIKE patterns" using wildcards and all that jazz.

I've never worked with any of this before, so some stuff to reverse engineer would be sweet.

Thanks,
Jimmy VIg
#2
Macro
Code:
Copy      Help
out
#compile CSqlite

str dbfile="$desktop$\test.db3"
str sql

;create database for testing
CSqlite db1.Open(dbfile)
db1.Exec("DROP TABLE table1")
db1.Exec("CREATE TABLE table1 (A,B)")
db1.Exec("INSERT INTO table1 VALUES ('one','two')")
db1.Exec("INSERT INTO table1 VALUES ('three','four')")
db1.Exec("INSERT INTO table1 VALUES ('te','five')")

;search
CSqlite db2.Open(dbfile)
ARRAY(str) ar; int r c
db2.Exec("SELECT * FROM table1 WHERE A LIKE 't%e'" ar) ;;in SQL, % is used instead of * (0 or more characters), _ instead of ? (single character)
for r 0 ar.len(2)
,out "row %i" r+1
,for c 0 ar.len(1)
,,out ar[c r]
#3
Thanks for the example...

Changed it a bit to make it a little more versitle...Do you see anything wrong with this idea or have any other thoughts?
Function Function8
Code:
Copy      Help
out
#compile CSqlite
str SearchWord="tre"
;;formats SearchWord "t%r%e%
ARRAY(str) a.create(SearchWord.len)
for(_i 0 a.len)
,a[_i].get(SearchWord _i 1);    _s.format("%s%s%%" _s a[_i])
SearchWord=_s

str sqlSearchWord.format("SELECT * FROM 'table1' WHERE A LIKE '%s'" SearchWord)

str dbfile="$desktop$\searchDB.db3"
str sql

;create database for testing
CSqlite db1.Open(dbfile)
db1.Exec("DROP TABLE table1")
db1.Exec("CREATE TABLE 'table1' (A,B)")
db1.Exec("INSERT INTO table1 VALUES ('one','two')")
db1.Exec("INSERT INTO table1 VALUES ('three','four')")
db1.Exec("INSERT INTO table1 VALUES ('te','five')")

;search
CSqlite db2.Open(dbfile)
ARRAY(str) ar; int r c
db2.Exec(sqlSearchWord ar) ;;in SQL, % is used instead of * (0 or more characters), _ instead of ? (single character)
for r 0 ar.len(2)
,out "row %i" r+1
,for c 0 ar.len(1)
,,out ar[c r]
#4
Just thinking a bit more...

The format should be more like "%t%r%e%" so that in the case there are letters before. This seem to work out a lot better!
With the way I am formatting it comes out : "%t%%r%%e%" (having the extra % character doesn't seem to make a difference though.

What if a SearchWord contains the wrong letters completely? How would something like that work?

Would it have something to do with [backets]?
#5
Don't know how with "Gumpshun" "Gumpshen" or "Gumshen". Maybe you want to find like "work" when given "work" or "working" or "worked" etc? Then use stem() function. Put words into database also stemmed.

For what you want to do I don't know how to.
#6
Given this some thought...

How about removing letters one by one
Entry desired to find: "Rock and Roll"
other entries: Blues, Jazz, easy listening, and Country
SearchWord is: Rockx
Rockx returns 0 results
ockx returns 0 results
Rokx returns 0 results
Rocx returns 0 results
Rock returns 1 result

The next step would be if there are 2, 3, or more letters that don't match.
This seems to go on and on...

What about a way to get a percentage match...or something?
like Rock and Roll matches is like a 50% of the letters of SearchWord...which would be the highest match
Blues would be a 0% match
Jazz would be a 0% match
easy listening would be 0% match
Country would be a 28% match

What about some sort of spelling corrector?
Code:
Copy      Help
http://www.norvig.com/spell-correct.html

Thanks again for everything!
Jimmy Vig
#7
It's by far from being perfect

Function Function21
Code:
Copy      Help
out
str SearchWord="Rock"
int Hit
str s=
;Rock and Roll
;Jazz
;Funk
;Folk
;Electronica
;Rap
;Country
;Indie
;Hip Hop
double HighestPercent
ARRAY(str) arr = s
for(int'i 0 arr.len)
,double dHit=0
,int iSearchWord = len(SearchWord)
,for(_i 0 iSearchWord 1)
,,_s.get(SearchWord _i 1)
,,double Hit1=findrx(arr[i] _s 0 4)>-1
,,dHit=dHit+Hit1
,double Percentage=dHit/iSearchWord
,if Percentage>=HighestPercent
,,HighestPercent=Percentage
,,int Index=i
out arr[Index]

It's a basic idea...any other suggestions?
#8
This example uses spelling checker with aspell.
See forum post http://www.quickmacros.com/forum/showthr...p?tid=2962 for spell checker instructions
Code:
Copy      Help
out
#compile CSqlite

str dbfile="$desktop$\searchDB.db3"
str sql

;create database for testing
CSqlite db1.Open(dbfile)
db1.Exec("DROP TABLE table1")
db1.Exec("CREATE TABLE 'table1' (A,B)")

db1.Exec("INSERT INTO table1 VALUES ('Rock and Roll','Data for Rock and Roll')")
db1.Exec("INSERT INTO table1 VALUES ('Rock','Data for Rock')")
db1.Exec("INSERT INTO table1 VALUES ('Jazz','Data for Jazz')")
db1.Exec("INSERT INTO table1 VALUES ('Funk','five')")
db1.Exec("INSERT INTO table1 VALUES ('Folk','five')")
db1.Exec("INSERT INTO table1 VALUES ('Electronica','five')")
db1.Exec("INSERT INTO table1 VALUES ('Rap','five')")
db1.Exec("INSERT INTO table1 VALUES ('Country','five')")
db1.Exec("INSERT INTO table1 VALUES ('Indie','five')")
db1.Exec("INSERT INTO table1 VALUES ('Hip Hop','five')")

CSqlite db2.Open(dbfile)

out
str w
w="Hip Hop"
int lw=len(w)
;out lw

#compile Caspell
Caspell k.Init
ARRAY(str) a
k.Suggest(w a)
str Suggestions=a
;out a
ARRAY(str) arr
int i nt
nt = tok(Suggestions arr -1 "[]")
for(i 0 nt)
,;out arr[i]
,;out "---"
,_s=""
,ARRAY(str) b.create(arr[i].len)
,for(_i 0 a.len)
,,a[_i].get(arr[i] _i 1)
,,_s.format("%s%s%%" _s a[_i])
,,_s.findreplace("'")
,,;out _s
,,str SearchWord=_s
,str sqlSearchWord.format("SELECT * FROM 'table1' WHERE A LIKE '%s'" SearchWord)
,ARRAY(str) ar; int r c
,db2.Exec(sqlSearchWord ar)
;,out sqlSearchWord
,int x=-1
,for r 0 ar.len(2)
,,for c 0 ar.len(1)
,,,out ar[c r]

This is starting to work pretty good.
I feel it might be a little to lenient when it comes to matches...but its a start.

Two problems I haven't figured out that maybe someone will see the solutions.
1. Duplicate matches...using wild cards the way I am it allows for suggestions in the spell checker to hit shorter versions of longer words which outs the entry for each match. I can't figure out how to flag an entry as being matched and to skip that on future suggestions.
2. Obviously too long match. Like Cop will match country in the example. The lengths are so obviously different.

I'm gonna code around a bit more and try to develop other bugs that will need solutions. If any one has other ideas of how to maximize searching of databases in any form, I would love to hear what you have to suggest!

I was thinking of taking words that have matches then evaluating them with a match rating program...then run the top word through the data base looking for an exact match...and so forth. If an exact match is not found...then "No Matches"

I'm really curious to see what other people might have to add to this. Could be very useful to a lot of programming...I think?

Thanks,
Jimmy Vig


Forum Jump:


Users browsing this thread: 1 Guest(s)