My personal blog, for life outside work

Wednesday 16 December 2009

DITA module 11

References and URLs

References
General Reference
P. Morville & L. Rosenfeld (2007) Information Architecture for the World Wide Web, 3rd edition. O'Reilly Media Inc.

Posting 1
oxford - http://www.askoxford.com/concise_oed/technology?view=uk (accessed Sept. 2009)
microsoft - http://office.microsoft.com/en-gb/clipart/HP030900871033.aspx (accessed Sept. 2009)
dictionary - http://dictionary.reference.com/browse/Information Information Technology, ref no. 2 (part of the Ask.com service) (accessed Dec. 2009)

Posting 4
Full location for image - http://images.google.co.uk/imgres?imgurl=http://www.asiagrace.com/photos/h/valley-small.jpg
(accessed Jan. 2010)

Posting 5
figure 1. http://code.google.com/apis/kml/documentation/kmlreference.html#geometry (accessed Dec. 2009)
figure 2. http://code.google.com/apis/kml/documentation/kmlreference.html#region (accessed Dec. 2009)

Posting 6
http://freespace.virgin.net/sizzling.jalfrezi/frames/fstyles.htm compatibility tables / danger list (accessed Jan. 2010)

Posting 10
Galip Aydin (2007) http://grids.ucs.indiana.edu/ptliupages/publications/GalipAydin-Thesis.pdf (accessed Jan. 2010)

URLs
Blog - http://www.annie-lizie.blogspot.com/
Website - http://www.student.city.ac.uk/~abhj012/index.html
JavaScript - http://www.student.city.ac.uk/~abhj012/java-bbc-text.txt

DITA module 10

Information Architecture

This illustration, taken from Galip Aydin's (2007) PhD thesis, sums up how Information Architecture for Geographical Information works.

By its nature, the incoming data comes from many different databases; it then needs to be merged to produce a web-based product suitable for the client.

This merging is undertaken by adding HTML scripts to the incoming data so that only the required parts are retrieved from each database. SQL is also used to access what is needed from Oracle and other SQL-accessed databases.

In the past only small projects were undertaken, where almost entirely manual input of data was possible. Now, with ever-growing volumes of data being available and required, Information Architecture is fundamental to the progression of the GIS industry. Standardisation of GML and KML will help to produce a more uniform and easily manipulated product.

Wednesday 2 December 2009

DITA module 09

Applications Development

JavaScript, like Flash and SVG, is a client-side approach to dealing with information: the information is sent en masse to the user, rather than being processed centrally with only the result viewed in a client browser. Adding a JavaScript script to the HTML, containing only a few questions and options, can allow an end user to get quickly to the part of a website which is of interest to them.
I built the script below in small sections, checked that each one worked and then knitted them together. The biggest problem I had was getting all the 'if' and 'else' selection statements to close correctly. The script accesses the BBC News website and then directs you to the relevant subject pages of your choice.
The script has four layers, which I have shown in this diagram. The first is news in general; the second is the option of either regional news or sport; the third is an option within either the news or sport section; and the final layer is the relevant web page for the chosen news region or sport type.

I have added text lines within the JavaScript to describe what it is doing.

http://www.student.city.ac.uk/~abhj012/java-bbc-text.txt
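
The linked script is JavaScript built from nested 'if' and 'else' statements. As a rough sketch of the same four-layer idea, here it is in Python using a nested table instead, which avoids the problem of closing each branch correctly. The option names and page labels are placeholders assumed for illustration; they are not taken from the original script or from the BBC site.

```python
# Hypothetical four-layer menu. Layer 1 is news in general, layer 2 is
# regional news or sport, layer 3 is an option within that section, and
# layer 4 is the page reached. All names and page labels are placeholders.
MENU = {
    "news": {
        "regional": {
            "england": "regional-england-page",
            "scotland": "regional-scotland-page",
        },
        "sport": {
            "football": "sport-football-page",
            "cricket": "sport-cricket-page",
        },
    }
}


def resolve(menu, choices):
    """Walk the nested menu one layer per choice and return the final page."""
    node = menu
    for choice in choices:
        if not isinstance(node, dict) or choice not in node:
            raise KeyError(f"no option {choice!r} at this layer")
        node = node[choice]
    if isinstance(node, dict):
        # The user stopped before reaching a page, so list what is left.
        raise ValueError("more choices needed: " + ", ".join(sorted(node)))
    return node


print(resolve(MENU, ["news", "sport", "cricket"]))
```

Each extra option becomes one more entry in the table rather than one more 'if' branch, so there is nothing to close.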

DITA module 08

Information Retrieval

Many databases work with an indexing technique. I've included a small example of such indexing in my web space; on this occasion it accesses only two documents, finding key repeated words relevant to the historic subject matter.

http://www.student.city.ac.uk/~abhj012/dita-8-exercise.html
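
As a minimal sketch of this kind of indexing, the Python below builds a small inverted index over two documents. The document texts and the list of small words to ignore are invented for illustration; they are not the ones used in the exercise.

```python
import re
from collections import defaultdict

# Two made-up documents standing in for the pair used in the exercise.
DOCS = {
    "doc1": "The castle was built by the king. The king held the castle for years.",
    "doc2": "A siege of the castle ended when the queen arrived.",
}

# Small words left out of the index (an assumed list).
STOP_WORDS = {"the", "a", "of", "by", "was", "for", "when"}


def build_index(docs):
    """Map each key word to the documents containing it, with a count per document."""
    index = defaultdict(lambda: defaultdict(int))
    for name, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word][name] += 1
    return index


def search(index, word):
    """Return (document, count) pairs for a word, most frequent first."""
    postings = index.get(word.lower(), {})
    return sorted(postings.items(), key=lambda item: (-item[1], item[0]))


index = build_index(DOCS)
print(search(index, "castle"))
print(search(index, "king"))
```

A search then only has to look the word up in the table, rather than reading every document again.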

Sites like Google, however, build up massive tables of millions of key words, with the most frequently or recently visited sites at the top of the list.
To find an HTML tutor site I typed in 'HTML TUTOR', and the one mentioned on my web space was on the first page of suggested sites. I chose a site which went straight to HTML rather than through lists of different programming languages.
To get a Google image of the Dubai World Trade Centre, I only had to type in 'Dubai W' and Google Images directly offered me the full name.
In many ways the search for Kohl, carried out for DITA module 10, sums up better the pros and cons of these web searches. While I was searching the Waitrose website for the vegetable Kohl, an over-enthusiastic, over-stemmed search came up with two phonetic equivalents: bread 'rolls' and vegetable 'oil'. Joking apart, if this happened often you would choose one of the other supermarket sites, which at least admit they can't find what you are looking for.
Drifting away from searching for my site, the Amazon one seems to be strong on removing small words when indexing. For example, if you type 'Information Architecture WWW', Morville and Rosenfeld's book is the first on the list, though its full name is 'Information Architecture for the World Wide Web'.

DITA module 07

Databases

The database approach to holding data allows data to be stored centrally once, with access being given where needed.
This central approach, unlike the file one, also means that if a single piece of information needs updating, e.g. a person's address, it need only be done once. This reduces not only the time for the update, but also possible inconsistencies between copies of the data.
With the database approach back-up and recovery is also undertaken centrally.
Relational databases are sets of tables which have a unique identifier called a 'primary key' for each entry.
The biblio library database can be accessed using Structured Query Language (SQL).
The display below shows an SQL query accessing just the 'titles' table so as to print out the ISBN and title of a specific book.
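
A query of that kind can be sketched in Python with the built-in sqlite3 module. This is only an illustration: the column names, the sample rows and the ISBNs are made up, and the real biblio database may be laid out differently.

```python
import sqlite3

# An in-memory stand-in for the 'titles' table. Column names and rows are
# assumptions for illustration, not the real biblio data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE titles ("
    "isbn TEXT PRIMARY KEY, "  # the primary key uniquely identifies each entry
    "title TEXT NOT NULL, "
    "year_published INTEGER)"
)
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0-0000000-0-1", "An Example Book About Databases", 1995),
        ("0-0000000-0-2", "Another Example Book", 1998),
    ],
)


def find_title(connection, isbn):
    """Return (isbn, title) for one book, or None if the ISBN is not held."""
    cursor = connection.execute(
        "SELECT isbn, title FROM titles WHERE isbn = ?", (isbn,)
    )
    return cursor.fetchone()


print(find_title(conn, "0-0000000-0-1"))
```

Only the two columns asked for are returned, even though the table holds more about each book.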

The relational tables contain the total information held on a book, split into smaller, more quickly accessed, bite-sized pieces.
The following page, also on my web space, contains 10 additional SQL queries of multiple tables within the biblio book database.
http://www.student.city.ac.uk/~abhj012/lesson_7_all_sql.mht

DITA module 06

CSS

Style, as an HTML tag, can sit within each individual web page. Cascading style sheets (CSS), however, allow a standard format to be applied to all or some of your web pages. This not only gives a clearer message to the end user that these are all your site's pages, but also allows the HTML scripts to be much less cluttered and easier to read or update.

Within my web space I have used two different basic styles. Neither was hard to generate, and the second in particular has left the HTML script page it is applied to much easier to read; if it had been applied when the page was originally generated, it would have made the page build much quicker.

http://www.student.city.ac.uk/~abhj012/styleas1.css
used for index, first and GIS pages, sets the size and colour of the body text and the background colour for the screens.

http://www.student.city.ac.uk/~abhj012/styleas2.css
used for GI-Greenwich page only, this also contains table style information and two forms of headers.

The CSS also allows for just a single change if for example you change the header font colour for all your pages, rather than having to update each individual line of header tags or each page's embedded style tags.

The only thing really against CSS is that different browsers handle the styling requests in slightly different ways. These differences are well documented on sites like the following, so that any dodgy tags can be kept at arm's length:

http://freespace.virgin.net/sizzling.jalfrezi/frames/fstyles.htm compatibility tables, danger list.

Monday 26 October 2009

DITA module 05

XML languages

My initial efforts with XML were to create a DTD and XML using the downloadable XML Marker from http://www.symbolclick.com/. This was to generate a database of Seismic Surveys and Lines.

I do attach it here, but afterwards realised I wasn't using a standard DTD.
http://www.student.city.ac.uk/~abhj012/dita-5-xml-example.mht

The GML/KML languages for GIS have been well developed by the Ordnance Survey and more recently by Google.

The language has been built so as to allow for the entering of geographical features, their location and annotation.

I found the Google website very helpful in its explanation of how, for example, the geometry is recorded.

http://code.google.com/apis/kml/documentation/kmlreference.html

figure 1.

This allows for the entry of points, lines and polygons (as we had been discussing in GIS).

Likewise, a region is given by latitude, longitude and altitude and is then fixed to a given number of pixels.

figure 2.

Each element is given its relative minimum and maximum values, which give the boundaries for the region. For example, a region could have a min of 5 degrees west and a max of 3 degrees east, with a min of 55 degrees north and a max of 59 degrees north.
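
The bounding-box idea in that example can be sketched in a few lines of Python. The class and method names are my own for illustration and are not part of KML; the sketch assumes west is written as a negative longitude and ignores altitude and boxes that cross the 180 degree line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A latitude/longitude box; west and south are the minimum values."""
    west: float
    east: float
    south: float
    north: float

    def contains(self, lon, lat):
        """True when the point lies inside or on the edge of the box."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north


# The example region: 5 degrees west to 3 degrees east, 55 to 59 degrees north.
region = Region(west=-5.0, east=3.0, south=55.0, north=59.0)

print(region.contains(-3.2, 56.0))  # a point inside the box
print(region.contains(-3.2, 51.5))  # same longitude, but too far south
```

Holding only the four edge values is enough to decide whether any feature falls in the region.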

DITA module 04

Images and Graphics

Within my web space I have a page of photos from a trip to Greenwich. All these files are in JPEG format and have individual tags.

http://www.student.city.ac.uk/~abhj012/GI_Greenwich.html

Image formats differ greatly in the resulting image size, visual quality and compression. An image needs to be large enough to give enough detail, but not so large that it causes a site to open slowly. The quality of the image is very important for GIS, but it is possible to lose some detail for the sake of space. For example, a road map can have one colour for the background as there is no need for elevation information, but a walking map needs elevation contours and colour differences to show whether the area is flat or a steep gorge. Sites like Google, with its map and satellite images, need to use the minimum file size so that they don't slow down the user's use of the site, but still give enough detail. For the maps only a small number of colours are used, while for the satellite images there are many more. As you zoom out on an image the number of colours also reduces.

To the right is a picture of a valley, taken from www.asiagrace.com/photos and found through Google Images.
The picture is in JPEG format with a size of 84KB.
The image is clear and all colours are well defined.
The image uses JPEG compression, which is lossy and gives you the choice of the size and quality of the image stored.

This is the same picture saved in PNG format and is 135KB.
The PNG format uses DEFLATE compression (the algorithm also used in ZIP files), which is lossless.
Between the saved JPEG image and the PNG one you can see a slight difference in the sampling, giving a smoother image but one with less contrast.

This picture is in GIF format with a size of 38KB.
This file is compressed using lossless LZW.
The result is a smaller file but a grainy image, because GIF is limited to a palette of 256 colours.

The final image is a 256-colour bitmap.
The size of the file is 135KB, but the quality isn't good due to the reduced number of colours. The format is uncompressed and stored pixel by pixel.

DITA module 03

Internet and the WWW

The World Wide Web (WWW) is a service which is able to work because of the infrastructure of the Internet. Its history goes back to the early 1990s, when it was developed for the sharing of information between universities.

Many establishments have an intranet, which is only accessible from within the organisation or remotely with restricted access. The Internet-facing part of the organisation's computer system, often a single server, holds the information that is accessible to the public.

The difference between an intranet and the Internet becomes very relevant when producing information for a website which is going to be made public. From within the intranet it is possible to drag in images and files from many locations, and they will still be seen. But if you were to publish the site as is, these links would show as broken and the files would not be seen by the public. All the files need to be copied to the public site server, or the pages which include the scattered images need to be saved as single MHT files and then accessed via anchor tags.

I have generated the following web site in HyperText Markup Language (HTML). Some editing was undertaken in Microsoft Notepad and some in Unix using vi. It includes further links to first.html, GIS.html and GI_Greenwich.html.

http://www.student.city.ac.uk/~abhj012/index.html

I used the HTML tutor material, mentioned on my site, to add tables so that I could generate the four columns of information and the two pictures located side by side.

The moving of the different pages for the site was undertaken using telnet (internally) and Secure Shell FTP, SSH (externally).

Monday 5 October 2009

DITA module 02

Digital Representation and Organisation, bits, binary, files and documents. Text and HTML

The lecture showed the history of the computer binary system we use today, covering binary, ASCII, Unicode and Metadata. The exercise in the class took this information on to the practicalities of how these formats are seen by the end user, using different Microsoft programs.

Data formats as an end user.

The binary format was generated to allow for a switch to be in either the off position (0) or the on position (1).

ASCII - this is seven-bit, allowing for 128 different characters (mainly Latin letters, digits and punctuation) to be stored in binary. This format was produced prior to 8-bit computers. A version of this is used by Microsoft Notepad.

Unicode - a more modern standard which allows for over 107,000 international characters.
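
As a small illustration of these formats, the Python sketch below prints the bits stored for a character. The function name is my own, and UTF-8 is assumed here as the way of storing Unicode; it is one of several Unicode encodings and is not one the lecture necessarily covered.

```python
def to_bits(text, encoding):
    """Return each stored byte of the text as a string of eight 0s and 1s."""
    return [format(byte, "08b") for byte in text.encode(encoding)]


# 'A' is character number 65 in ASCII. It fits in seven bits, so the first
# of the eight bits shown is 0.
print(ord("A"), to_bits("A", "ascii"))

# An accented e is outside the 128 ASCII characters. Stored as UTF-8
# (an assumed choice of Unicode encoding) it takes two bytes.
print(ord("\u00e9"), to_bits("\u00e9", "utf-8"))
```

Every plain ASCII character has a number below 128, which is why seven bits are enough for it.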

The link below shows the work undertaken in the class exercise. It was built in Word but saved as a single web page. I hope it shows that, though embedded within the HTML formatting, the original text message and the location of an imported file can still be seen.

http://www.student.city.ac.uk/~abhj012/weather-bells-whistles-now.mht

Definitions gained during the lecture

Metadata - this is the information that the computer needs to process the data it is given. The tags control how the data is split under different headings, e.g. Bowden D. will mean Bowden is stored as the name of an author.

Data - this usually refers to unprocessed text

Information - this is processed text

File - this is usually a single piece of information

Document - this can contain more than one file, often of different file types.

Extra information found out re Blog posting building.

To copy and paste within the posting, as well as using the Edit menu of the web browser you can use the keyboard ctrl commands.

Conclusion

The conclusion is that the computer is only a non-thinking group of parts, which is able to process text into information only when it is told how by man ... no artificial intelligence (yet).

Monday 28 September 2009

DITA module 01

Web logs and Introduction,

DITA by definitions,

Digital - This is data in a numeric form which can then be processed by a computer.

Information - 'Knowledge gained through study, communication, research, instruction etc.' is how information is defined in dictionary.com. (ref: dictionary)

Technology - This is defined well in the Oxford English Dictionary as 'the application of scientific knowledge for practical purposes'. (ref: oxford)

Architecture - This is the space which allows for the use of technology to pass on information, whether in the real world, e.g. the university building, or in a blog on the web.

This introductory module primarily shows us there is more than one way to skin a cat. Different options will work best in different situations and on different platforms.

Choice of Weblog,

The following were my initial reactions to three versions of web logs:
Blogger - This appeared to have the most professional looking web page and is well known. The instructions for setup were clear and the setup page was not over complicated and did not try to sell itself.
LiveJournal - This also looked professional, but the initial page was not as clearly arranged as Blogger's.
Diaryland - This looked more like a work in progress. The welcome page, as well as giving you the chance to set up an account, was also trying to sell the site with references to very specific groups you could belong to. This made it look messy.

Setup and user experience with Blogger.com

The setup was easy, but when it came to altering the layout, I managed to keep an unsaved version open, which resulted in me having to remove the cookies and history.
I went for a layout with a white background, which is easier for printing. I checked whether Microsoft Clipart was copyrighted, decided it was and hence mentioned it below the picture. (ref: microsoft) All postings are given the same label, DITA-2009.
I found that editing and adding new blogs was easy.
The blog site allows the recording of data in real time and sharing of the information with others. It also allows for the easy removing of unwanted postings.