We have OCRed thousands of pages of newspaper articles. The newspaper, issue, date, page number and OCRed text of each page have been put into a MySQL database.
We now want to build a Google-like search engine in PHP to find the pages that match a query. It has to be fast: no search should take more than a second.
How should we do it?
You can also try out Sphinx (SphinxSearch). Craigslist uses Sphinx, and it can connect to both MySQL and PostgreSQL.
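
As a rough sketch of what that could look like, here is a minimal sphinx.conf that pulls the pages straight out of MySQL. The table name (pages), the column names, the credentials and the file paths are all assumptions made up for illustration, since the question does not give the schema; adjust them to match your database. The first column returned by sql_query must be a unique unsigned integer document ID.

```
# Hypothetical source: the database, table and column names are assumptions.
source pages
{
    type      = mysql
    sql_host  = localhost
    sql_user  = ocr_user      # placeholder credentials
    sql_pass  = secret
    sql_db    = newspapers    # assumed database name
    sql_port  = 3306

    # First column is the document ID; text columns become full-text fields.
    sql_query = \
        SELECT id, newspaper, issue, UNIX_TIMESTAMP(issue_date) AS issue_ts, \
               page_number, ocr_text \
        FROM pages

    # Numeric attributes that can be used for filtering and sorting.
    sql_attr_uint = issue_ts
    sql_attr_uint = page_number
}

index pages
{
    source = pages
    path   = /var/lib/sphinx/pages    # assumed location for the index files
}

searchd
{
    listen   = 9306:mysql41           # speak the MySQL wire protocol
    log      = /var/log/sphinx/searchd.log
    pid_file = /var/run/sphinx/searchd.pid
}
```

With something like this in place, you would build the index with the indexer tool (for example indexer --all) and start searchd. Because searchd listens with the MySQL protocol here, PHP should be able to connect to port 9306 with its usual MySQL client and run a SphinxQL query such as SELECT id FROM pages WHERE MATCH('some words') LIMIT 20, then fetch the matching rows from MySQL by ID. Whether this meets the one-second target depends on your data size and hardware, so measure it on your own corpus.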