Friday, July 18, 2008

I have earlier described pattern matching and "smart" information retrieval: first look at broad groupings of information to create a candidate set, then search that resulting set with finer-grained search terms.

Take processing in the neocortex as an example: information is detected by our sensory organs and processed at a lower level, and only a fraction of it is actually processed at a higher level. If we were to process information this way, we could do the following. Starting with key words as the lowest level, we assign each document a probability of relevance, then search the resulting set with bigrams. That result set is then searched with trigrams, and the documents that remain are again assigned a probability of relevance. The finest search, using complex patterns, is only run on the final set.
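As a rough illustration of this cascade, here is a minimal Python sketch. Everything in it is an assumption made for the example and not a finished design: the names `ngrams`, `relevance` and `cascade_search` are made up, the "probability of relevance" is simply the fraction of the query's n-grams found in the document, the cut-off `threshold` of 0.5 is arbitrary, and a regular expression stands in for the "complex pattern" applied at the end.

```python
import re


def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def relevance(query, doc, n):
    """Fraction of the query's n-grams that occur in doc (a crude stand-in score)."""
    wanted = ngrams(query, n)
    if not wanted:
        return 0.0
    return len(wanted & ngrams(doc, n)) / len(wanted)


def cascade_search(query, docs, pattern, threshold=0.5):
    """Narrow docs with key words, then bigrams, then trigrams.

    The regular expression (the most expensive test in this sketch)
    is only run on the documents that survive every earlier level.
    Returns {doc_id: score from the last n-gram level applied}.
    """
    candidates = dict(docs)
    scores = {doc_id: 0.0 for doc_id in candidates}
    for n in (1, 2, 3):
        if not ngrams(query, n):
            break  # query too short for this level; keep the previous result
        scores = {d: relevance(query, text, n) for d, text in candidates.items()}
        candidates = {d: text for d, text in candidates.items()
                      if scores[d] >= threshold}
    regex = re.compile(pattern, re.IGNORECASE)
    return {d: scores[d] for d, text in candidates.items() if regex.search(text)}


docs = {
    "a": "Smart information retrieval uses pattern matching",
    "b": "Information about cooking and retrieval of recipes",
    "c": "The weather is sunny",
}
hits = cascade_search("smart information retrieval", docs, r"pattern\s+matching")
print(hits)
```

In this toy run, document "c" drops out at the key word level, "b" drops out at the bigram level, and only "a" ever reaches the regular expression. The point is the shape of the pipeline: each level is cheaper than the next and shrinks the set that the next level has to look at.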

For each of these searches, a registry (database) would serve as the index of this information, and it should correlate to a taxonomy. This taxonomy would then be used to create metadata that would be assigned to the document. With this in place, it becomes possible to search for hidden patterns via data mining techniques.
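To make the registry and taxonomy idea a little more concrete, here is a small Python sketch under heavy assumptions. The `TAXONOMY` categories and terms are invented for the example, the registry is just an in-memory inverted index instead of a real database, and `build_registry` and `assign_metadata` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical taxonomy: category -> terms that signal it.
TAXONOMY = {
    "search": {"retrieval", "index", "query"},
    "neuroscience": {"cortex", "sensory"},
}


def build_registry(docs):
    """Inverted index: term -> set of ids of the documents containing it."""
    registry = defaultdict(set)
    for doc_id, text in docs.items():
        for token in set(re.findall(r"[a-z0-9]+", text.lower())):
            registry[token].add(doc_id)
    return registry


def assign_metadata(registry, taxonomy):
    """Tag each document with every taxonomy category whose terms it contains."""
    metadata = defaultdict(set)
    for category, terms in taxonomy.items():
        for term in terms:
            for doc_id in registry.get(term, ()):
                metadata[doc_id].add(category)
    return metadata


docs = {
    "d1": "The cortex filters sensory input",
    "d2": "Query the index for retrieval",
    "d3": "Sensory retrieval",
}
metadata = assign_metadata(build_registry(docs), TAXONOMY)
for doc_id in sorted(metadata):
    print(doc_id, sorted(metadata[doc_id]))
```

Once every document carries category tags like these, a data mining pass could, for instance, count which categories tend to show up together. Here "d3" is tagged with both categories, which is the kind of overlap such a pass would pick up.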
