MMDB-3 J. Teuhola 2012 39
3. Text and document databases
Normal databases: formatted records;
document databases: free-form or semi-structured data (e.g. XML).
Application areas:
- Office automation, document archives
- Digital libraries
- Electronic dictionaries / encyclopedias
- Electronic newspapers
- Source program libraries
- Automated law and patent offices
What is a ‘document’? E.g. a book, chapter, paragraph, article,
letter, web page, source program, etc.
General problem setting: Searching documents by contents;
often also called associative search.
Usually based on keywords or terms occurring in documents. Search terms may be combined with Boolean connectives
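
Keyword search with Boolean connectives can be illustrated with a minimal sketch. It assumes a tiny in-memory collection and a simple term-to-document-set index; the sample documents and the names build_index and lookup are made up for illustration and do not come from the course material. AND, OR and NOT then map to set intersection, union and difference.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index

# Hypothetical sample documents (illustration only).
docs = {
    1: "Patent law for digital libraries",
    2: "Electronic newspapers and digital archives",
    3: "Source program libraries",
}
index = build_index(docs)

def lookup(term):
    """Return the ids of documents containing the term (empty set if none)."""
    return index.get(term.lower(), set())

# Boolean connectives expressed as set operations on the result sets.
digital_and_libraries = lookup("digital") & lookup("libraries")   # AND
digital_or_source = lookup("digital") | lookup("source")          # OR
libraries_not_digital = lookup("libraries") - lookup("digital")   # AND NOT

print(digital_and_libraries, digital_or_source, libraries_not_digital)
```

Real document databases add term normalization, ranking and compressed index structures on top of this basic idea.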