This blog has been set up primarily to record the Digital Information Technologies and Architecture MSc module at CITY.

Wednesday 2 December 2009

DITA module 08

Information Retrieval

Many databases rely on an indexing technique. I've included a small example of such an index in my web space; on this occasion it only accesses two documents, finding repeated key words relevant to the historical subject matter.

http://www.student.city.ac.uk/~abhj012/dita-8-exercise.html
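To make the idea of an index concrete, here is a minimal sketch of an inverted index in Python. It is not the code behind the exercise page; the two documents, the names `tokenize` and `build_index`, and the sample text are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


# Two invented documents standing in for the ones used in the exercise.
documents = {
    "doc1": "The castle was built by the king in 1066.",
    "doc2": "The king left the castle to his son.",
}

index = build_index(documents)

# Words that occur in every document, i.e. the repeated key words.
shared = sorted(w for w, names in index.items() if len(names) == len(documents))
print(shared)
```

Looking a word up is then a single dictionary access, for example `index["castle"]`, which is why an index is so much quicker than rereading every document for each query.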

But sites like Google build up massive indexes covering millions of key words, and appear to rank the most frequently or recently visited sites near the top of the results.
To find an HTML tutorial site I typed in 'HTML TUTOR', and the one mentioned on my web space appeared on the first page of suggested sites. I chose a site that went straight to HTML rather than through lists of different programming languages.
To get a Google image of the Dubai World Trade Centre, I only had to type 'Dubai W' and Google Images offered me the full name straight away.
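The 'Dubai W' suggestion can be imitated with simple prefix matching over a sorted list of phrases. This is only a sketch of the general idea, not how Google does it (a real service would presumably also weigh how popular each phrase is); the function `suggest` and the phrase list are invented for the example.

```python
import bisect


def suggest(prefix, phrases, limit=5):
    """Return up to `limit` phrases that start with `prefix`, ignoring case."""
    ordered = sorted(p.lower() for p in phrases)
    prefix = prefix.lower()
    # Binary search for the first phrase that could start with the prefix.
    start = bisect.bisect_left(ordered, prefix)
    matches = []
    for phrase in ordered[start:]:
        if not phrase.startswith(prefix):
            break  # sorted order means no later phrase can match either
        matches.append(phrase)
        if len(matches) == limit:
            break
    return matches


# Hypothetical list of previously seen search phrases.
phrases = [
    "Dubai World Trade Centre",
    "Dubai weather",
    "Dublin Castle",
    "Durham Cathedral",
]

suggestions = suggest("Dubai W", phrases)
print(suggestions)
```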
In many ways the search for Kohl, carried out for DITA module 10, sums up the pros and cons of these web searches better. While I was searching the Waitrose website for the vegetable Kohl, an over-enthusiastic, over-stemmed search came up with two phonetic equivalents: bread 'rolls' and vegetable 'oil'. Joking apart, if this happened often you would choose one of the other supermarket sites, which at least admit when they can't find what you are looking for.
Drifting away from searching for my site, the Amazon one seems to be strong on removing small words when indexing. For example, if you type 'Information Architecture WWW', Morville and Rosenfeld's book is first on the list, though its full name is 'Information Architecture for the World Wide Web'.
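Removing small words (usually called stop words) before indexing can be sketched as follows. This is a guess at the general technique rather than Amazon's actual method, and the stop-word list and the name `index_terms` are assumptions made for the example.

```python
import re

# A tiny, hand-picked stop-word list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def index_terms(text):
    """Lowercase the text, split it into words and drop the stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


title_terms = index_terms("Information Architecture for the World Wide Web")
query_terms = index_terms("Information Architecture WWW")

# Query terms that also appear among the title's index terms.
matched = [t for t in query_terms if t in title_terms]
print(title_terms)
print(matched)
```

With 'for' and 'the' gone, the query matches on 'information' and 'architecture'. Matching 'WWW' to 'World Wide Web' would need some extra handling of abbreviations, which this sketch does not attempt.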
