3. Organising Data
Manual Organising Methods
Manual information systems have been in use for millennia. Many methods have been used for organising and processing information, such as the telephone book, address books, appointment books and recipe books. Manual systems are still commonly used because they do not require a computer, are highly portable and need no special skills or training. Their disadvantages are that the layout is not very flexible, access is usually by one method only, and only one person can use them at a time. They are also usually slow when locating a particular data item.
Hypermedia
Hypertext and hypermedia are unstructured. They don't have set hierarchical structures like records, fields and attributes, but they do have relationships between and within files. Hypermedia is the electronic linking of media, giving users an easy way to navigate between information stored in different sections of one document or in different documents in different locations. Links are built around objects that appear on the screen and move the current display to another.
The World Wide Web is a means of accessing information over the medium of the Internet. It is the application of hypermedia and hypertext that has helped to make the Web user friendly. The address or location of each resource on the Web is known as a URL. There are conventions or standards in the operation of the Web:
- Protocols for the transfer of resources on the Web, e.g. HTTP.
- Domain name addresses: the written or alphabetic address, e.g. www.smh.com
- Internet protocol addresses: the IP address is the numeric equivalent of the domain name address.
Hypermedia searching is when data is retrieved in hypermedia by using search engines. It is different to data retrieval in databases. A search engine is a database of indexed websites. The search engine regularly scans the Internet, identifying and collecting information such as keywords and titles, which is then entered into its database in index form.
Storyboards represent each screen design and the links between the screens. They are graphic representations of navigation paths and other information, and provide an overall summary or model of how the data will be presented. They are also used in video and filmmaking to design individual shots before shooting. The types of storyboard layout are linear, hierarchical, non-linear and combination.
Web pages are created using hypertext markup language, known as HTML. It is a markup language rather than a scripting language. HTML files are text files containing tags, a form of metadata, that define the format of the page when viewed through a browser. There are many software packages on the market that allow the user to create web pages without any knowledge of HTML.
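The idea of a search engine storing keywords "in index form" can be sketched in a few lines. This is only a minimal illustration, not how any particular search engine works: the page addresses and page text are made up, and the function names `build_index` and `search` are assumptions introduced for the example.

```python
import re
from collections import defaultdict

# Hypothetical page contents; a real engine would collect these by scanning the Web.
PAGES = {
    "news.example/home": "Sydney news and weather",
    "recipes.example/soup": "Pumpkin soup recipe",
    "news.example/food": "Sydney food news and recipe reviews",
}

def build_index(pages):
    """Map each keyword to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(sorted(search(index, "sydney news")))
```

The search looks up keywords in the prepared index instead of reading every page again, which is why it differs from retrieving records from a database with a query.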
Data Storage
Data is stored for future retrieval, either for operational purposes or to replace data that may be lost. Commercial databases are large and usually require large-capacity storage media such as hard disks, optical media, removable cartridges and tapes. Optical media are a popular choice because of their low cost and high capacity. CD-ROM and CD-R media can be read by the CD drives on nearly all multimedia-capable PCs, but CD-RW and DVD media need more advanced drives to read them. Tapes are still popular for data backup and off-site secure storage.
Accessing Data
There are three ways of classifying data access:
- Direct or sequential data access: magnetic disks, optical discs, removable cartridges and memory cards are direct access devices, which allow almost instant access to data located anywhere on the storage medium. This is an advantage over tapes, which are sequential access devices: the tape must move from its beginning through to the data needed.
- Online or offline storage: online storage is always directly available for use and is fast. Hard disks are the only media that fully qualify as true online storage, because they are part of the computer system and can't be removed. Offline storage includes optical CDs, cartridge disks, memory cards and tapes.
- Shared or distributed databases: a shared database is held on one computer but is available to all other computers on a network, so there is only a single online copy of the database. A distributed database is split between several computers in the network, with its sections located in more than one place. No single computer stores the entire database, but every record in the database is available to all the users in the network.
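The difference between sequential and direct access can be pictured by counting how many records must be passed over before the wanted one is reached. The sketch below is a simplified model, not a simulation of real hardware; the record names and both function names are hypothetical.

```python
# Hypothetical storage holding ten records, numbered 0 to 9.
RECORDS = [f"record-{n}" for n in range(10)]

def sequential_read(records, position):
    """Tape-style access: pass over every earlier record first."""
    steps = 0
    for current, value in enumerate(records):
        steps += 1
        if current == position:
            return value, steps
    raise IndexError(position)

def direct_read(records, position):
    """Disk-style access: jump straight to the wanted record."""
    return records[position], 1

print(sequential_read(RECORDS, 7))  # the record plus the number of steps taken
print(direct_read(RECORDS, 7))
```

In this model the sequential read needs more steps the further along the record sits, while the direct read always takes one step.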
Data Retrieval
In a flat file management system, data can be extracted from a database by entering a query. A query is an instruction to locate certain records in a database. The instruction is usually written according to a set of rules called a query language. The query languages used with most flat file DBMS programs usually have a very simple way of writing a query. The link between the name of the field to be searched and the sample data is called a relational operator, such as =, < or >. A relational operator describes the test that is to be applied to a particular field in the database. Simple queries can be joined by logical operators to make a more complicated question. A logical operator allows two or more individual queries to be combined. Logical operators come from mathematical (Boolean) logic and use words such as AND, OR and NOT.
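A simple query and its combination with logical operators might look like the following sketch. It is not the syntax of any real flat file DBMS: the records, field names and the helper names `simple_query`, `q_and`, `q_or` and `run` are all assumptions made for illustration.

```python
import operator

# Hypothetical flat file records; the field names are assumptions.
RECORDS = [
    {"name": "Ann", "suburb": "Ryde", "age": 17},
    {"name": "Ben", "suburb": "Manly", "age": 19},
    {"name": "Cho", "suburb": "Ryde", "age": 21},
]

# Relational operators: each one names the test applied to a field.
RELATIONAL = {
    "=": operator.eq, "<>": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}

def simple_query(field, op, sample):
    """Build a test such as: age > 18."""
    return lambda record: RELATIONAL[op](record[field], sample)

def q_and(*queries):
    """Logical AND: every simple query must be true."""
    return lambda record: all(q(record) for q in queries)

def q_or(*queries):
    """Logical OR: at least one simple query must be true."""
    return lambda record: any(q(record) for q in queries)

def run(query, records):
    """Return the names of the records that satisfy the query."""
    return [r["name"] for r in records if query(r)]

adults_in_ryde = q_and(simple_query("suburb", "=", "Ryde"),
                       simple_query("age", ">", 18))
print(run(adults_in_ryde, RECORDS))
```

Each simple query pairs a field name with sample data through a relational operator, and the logical operators join those tests into one larger question.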
Data Sorting
Sorting is the process of placing data records into a set order based on the contents of one or more fields. It can be a useful method of getting information from a database. A sort operation begins by selecting a particular field, the primary sort field, which will be used to sort all the database records. When several records hold the same value in the primary sort field, a secondary sort field is chosen to order those records in a second-level sort.
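A primary and secondary sort can be shown with a small example. The records and field names here are hypothetical; the surname is assumed to be the primary sort field and the first name the secondary one.

```python
# Hypothetical records with a duplicate value in the primary sort field.
RECORDS = [
    {"surname": "Smith", "first_name": "Zoe"},
    {"surname": "Jones", "first_name": "Mia"},
    {"surname": "Smith", "first_name": "Adam"},
]

# The key is a tuple: surname is compared first, and first_name
# only decides the order when two surnames are the same.
ordered = sorted(RECORDS, key=lambda r: (r["surname"], r["first_name"]))

names = [f'{r["surname"]}, {r["first_name"]}' for r in ordered]
print(names)  # ['Jones, Mia', 'Smith, Adam', 'Smith, Zoe']
```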
Structured Queries
Relational DBMS packages and those used in large computer systems will usually have a much more sophisticated query language, such as SQL (Structured Query Language). It is a standard language used to access and manipulate data in databases. It allows much more complicated operations than a simple query. SQL instructions can be written that will load a particular database, find certain data records, sort them into order on a selected field, and create a properly formatted report. The user does not have to have the DBMS interface open to use such a query: the SQL can be issued from another program, which passes it to the DBMS to run. SQL can perform many more tasks than just searching and sorting. It can add new data records and delete old ones. Naturally, using SQL is a lot more complicated than using a simple query language on a flat file information system.
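The tasks mentioned above (finding records, sorting them, adding and deleting records) can be tried with the SQLite engine that ships with Python. This is a small sketch only: the `students` table, its columns and its rows are invented for the example, and other DBMS packages may use slightly different SQL.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, course TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ann", "IPT", 82), ("Ben", "IPT", 67), ("Cho", "Maths", 91)],
)

# Add a new record and delete an old one.
conn.execute("INSERT INTO students VALUES (?, ?, ?)", ("Dev", "IPT", 74))
conn.execute("DELETE FROM students WHERE name = ?", ("Ben",))

# Find certain records and sort them on a selected field.
rows = conn.execute(
    "SELECT name, mark FROM students WHERE course = ? ORDER BY mark DESC",
    ("IPT",),
).fetchall()
print(rows)  # [('Ann', 82), ('Dev', 74)]
conn.close()
```

Here the SQL is issued from a separate program and handed to the database engine, which does the searching and sorting.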
Encryption
Encryption is a way of masking the true value of data by changing its nature according to an algorithm. It is a way of encoding data. When the data must be read, it is decoded and changed back to its original state by applying the algorithm in reverse with the correct key. Encryption is an effective way to achieve security during data transmission and storage. There are two main types of encryption:
- Asymmetric encryption requires two different keys: one key for encryption and a separate key for decryption.
- Symmetric encryption requires the same key for encryption and decryption.
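Symmetric encryption can be illustrated with a toy cipher in which the same key both scrambles and restores the data. This is a teaching sketch only and offers no real security; the shifting scheme, the key and the message are assumptions chosen for simplicity, and real systems use vetted algorithms instead. The key is assumed to be non-empty.

```python
def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Shift each byte forward by the matching key byte (toy algorithm, not secure)."""
    return bytes((b + key[i % len(key)]) % 256 for i, b in enumerate(plaintext))

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Apply the same algorithm in reverse: shift each byte back."""
    return bytes((b - key[i % len(key)]) % 256 for i, b in enumerate(ciphertext))

key = b"secret"            # hypothetical shared key
message = b"Meet at noon"  # hypothetical data to protect

scrambled = encrypt(message, key)
print(scrambled != message)                # True: the data is masked
print(decrypt(scrambled, key) == message)  # True: the same key restores it
```

An asymmetric scheme would differ in that the key used in `decrypt` would not be the key used in `encrypt`.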
Database Reports
The data must be presented in a way that the person receiving the report can understand. Most DBMS programs can create neatly presented reports from the database. Formatted reports are an easy way to present printed output from a database. Report generators allow you to create your own custom database reports in a variety of output layouts. Reports may also contain graphic representations of data that would otherwise be presented in tabular form, because a report's contents are easier to understand when the information is presented graphically. Reports may be regarded as formatted query results: a query searches the database for the data the user is requesting, using the criteria the user has specified, and the report presents what it finds. Reports may be written or may take the form of a multimedia presentation using specialist software and electronic hardware such as a data projector.
Data Accuracy
Data accuracy refers to the accuracy and integrity of the data held in the database. Errors in entering the data can limit the integrity and reliability of the database. Some sources of data may have greater reliability than others, so data sources need to be checked and verified. Data is more reliable if it is kept up to date. Data validation procedures are also used to minimise data entry errors:
- Range check: database fields are set to accept only a particular range of values or formats.
- List check: there is a listed set of acceptable values, and only those values are accepted.
- Data type check: the data type of each field is specified when the data dictionary is created, and entries of another type are rejected.
- Check sum or check digits: a check digit is calculated from a lengthy numeric value and used to test the value's validity when it is entered.
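The four validation checks listed above can each be written as a small test. The sketch below is illustrative: the function names and sample values are assumptions, and the check digit test uses the Luhn algorithm, which is one common scheme and not necessarily the one a given database uses.

```python
def range_check(value, low, high):
    """Accept only values inside the permitted range."""
    return low <= value <= high

def list_check(value, allowed):
    """Accept only values that appear in the list of acceptable values."""
    return value in allowed

def type_check(value, expected_type):
    """Accept only values of the data type defined for the field."""
    return isinstance(value, expected_type)

def luhn_check(number: str) -> bool:
    """Check digit test using the Luhn algorithm (an assumed scheme)."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Work from the rightmost digit, doubling every second digit.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

print(range_check(17, 0, 120))             # True
print(list_check("NSW", {"NSW", "VIC"}))   # True
print(type_check("seventeen", int))        # False
print(luhn_check("79927398713"))           # True
```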
Data Bias
Data in a database must be factual. It must not be emotion-driven, opinionated or biased in any way. Bias can be minimised by the way the data is collected, manipulated and interpreted. If interviews or surveys are conducted, the questions must be structured so as not to influence the answers given. If bias remains in the answers, it should be statistically removed during the data manipulation stage.
Backup Procedures
Sufficient backup procedures allow the organisation to continue functioning after any loss of current data by restoring the system to its status at the time of its last backup.
- Manual backup systems: may be used in organisations involved in real-time transactions.
- Full periodic system backups: a full backup of the system is made at regular intervals.
- Daily transaction backups: a backup of each day's transactions is made.
- Off-site backup systems: all backups are stored off-site to protect the organisation in case of fire or other catastrophic events.
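One way to picture how full backups and daily transaction backups work together is sketched below. It assumes, as an illustration only, that restoring means loading the last full backup and then replaying the transactions saved since; the account data and the `restore` function are hypothetical.

```python
import copy

# Hypothetical last full backup: account balances at the time it was taken.
full_backup = {"acct-1": 100, "acct-2": 250}

# Hypothetical daily transaction backups made since then: (account, change).
transaction_log = [("acct-1", -30), ("acct-2", 45), ("acct-3", 10)]

def restore(backup, log):
    """Rebuild current data from the full backup plus later transactions."""
    data = copy.deepcopy(backup)  # leave the backup itself untouched
    for account, change in log:
        data[account] = data.get(account, 0) + change
    return data

restored = restore(full_backup, transaction_log)
print(restored)  # {'acct-1': 70, 'acct-2': 295, 'acct-3': 10}
```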
Security
Restricting unauthorised access to data reinforces privacy rights and reduces the chances of data being corrupted, tampered with or destroyed with malicious intent. Appropriate security measures may consist of logical restrictions:
- Password-restricted access
- Restricted levels of access
- Limited time access
- Restricted workstation access, so that access is only from approved workstations
- A logical firewall that only allows incoming traffic if it has been requested
- Encrypted data that can only be unlocked at the receiving end with a private key
There are also physical restrictions:
- Restricted physical access to the workstations
- Restricted access to the building and locations
- Transmissions restricted to physically secure communication lines
- Restrictions on the use of wireless networks
- Physical or hardware-based firewall devices that only allow incoming traffic if it has been requested, blocking any that is unauthorised
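Several of the logical restrictions can be combined into a single access decision. The sketch below covers access levels, approved workstations and limited time access only; the role names, workstation names, permitted hours and the `may_access` function are all assumptions, and a real system would also verify a password.

```python
from datetime import time

# Hypothetical settings for one organisation.
ACCESS_LEVELS = {"clerk": 1, "manager": 2, "administrator": 3}
APPROVED_WORKSTATIONS = {"WS-01", "WS-02"}
ALLOWED_HOURS = (time(8, 0), time(18, 0))

def may_access(role, workstation, now, required_level):
    """Allow access only when every logical restriction is satisfied."""
    has_level = ACCESS_LEVELS.get(role, 0) >= required_level
    approved = workstation in APPROVED_WORKSTATIONS
    in_hours = ALLOWED_HOURS[0] <= now <= ALLOWED_HOURS[1]
    return has_level and approved and in_hours

print(may_access("manager", "WS-01", time(9, 30), required_level=2))  # True
print(may_access("clerk", "WS-01", time(9, 30), required_level=2))    # False
```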
Privacy
Databases are used to keep information on individuals. They hold information on many different aspects of people's lives: school and university students have academic records stored on databases, people have their work records stored on employers' databases, medical patients have their medical histories stored on hospital databases, and vehicle owners and licensed drivers have their details stored on the RTA databases. These databases can be cross-linked, giving anyone with access to them a great deal of private information about individuals. Many people are unaware of how much personal data about them is stored, and how much of it is being used by organisations for advertising. They have no control over who has access to the information or how it is used. Most organisations have been forced by consumers and privacy groups to adopt various codes of practice to protect the privacy of individuals whose data they hold. These codes of practice may include:
- no information system using personal data may be kept secret from the public
- people have the right to inspect and correct personal data concerning them
- personal information mustn't be used for any purpose without prior consent
- only authorised people with a genuine need should be able to access and use any personal data
Data Warehousing
A data warehouse is a database of cleaned data and the metadata that describes it. The data is collected electronically from a variety of sources for the purpose of analysis. The process of creating a data warehouse is:
- data collection: raw data is collected from a wide variety of sources.
- data fusion and cleaning: raw data files are combined into a single consistent format and checked for accuracy and completeness.
- building metadata: the data is described by metadata, which tells where each data entity came from and how it was altered to its present form, and gives a summary of it.
- storage: data and metadata are sent to the data warehouse for storage.
The data warehouse collects data 24/7, and the data may be available for sale to interested parties. Ownership and control of this data are in the hands of the parties who collect it or directly purchase it.
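The four steps of building a data warehouse can be followed in miniature. The sketch below is a simplified assumption of what each step might involve: the two sources, their field names and the functions `clean` and `build_warehouse` are invented for the example.

```python
# Data collection: hypothetical raw data from two sources with different formats.
SOURCE_A = [{"Name": " ann lee ", "Age": "34"}, {"Name": "Bob Ng", "Age": ""}]
SOURCE_B = [{"full_name": "Cara Diaz", "age": 41}]

def clean(record, name_field, age_field, source):
    """Convert one raw record to the common format, or reject it if incomplete."""
    name = str(record.get(name_field, "")).strip().title()
    age_text = str(record.get(age_field, "")).strip()
    if not name or not age_text.isdigit():
        return None
    return {"name": name, "age": int(age_text), "source": source}

def build_warehouse():
    """Fuse and clean the sources, build the metadata, and store both together."""
    cleaned, rejected = [], 0
    for source, records, name_field, age_field in [
        ("A", SOURCE_A, "Name", "Age"),
        ("B", SOURCE_B, "full_name", "age"),
    ]:
        for record in records:
            row = clean(record, name_field, age_field, source)
            if row is None:
                rejected += 1
            else:
                cleaned.append(row)
    # Metadata: where the data came from, how it was altered, and a summary.
    metadata = {
        "sources": ["A", "B"],
        "changes": "names trimmed and title-cased; ages converted to integers",
        "summary": {"rows": len(cleaned), "rejected": rejected},
    }
    return {"data": cleaned, "metadata": metadata}

warehouse = build_warehouse()
print(warehouse["metadata"]["summary"])  # {'rows': 2, 'rejected': 1}
```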
Data Mining
Data mining is the process of searching through stored data, looking for patterns or relationships within it. Data mining has risks. It is important to set clear goals for data mining operations and to look for other evidence to support the results it produces. The information collected from data warehousing and data mining may intrude on people's privacy, as the potential is there for gathering data and uncovering data patterns that may be linked to individuals. Many of the patterns found with data mining have no commercial value and don't assist with decision-making, as the relationships detected may be coincidental or irrelevant.
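A very simple form of pattern searching is to count which items tend to appear together. The sketch below is one basic approach under stated assumptions, not a full data mining method: the shopping baskets are made up, and `pair_counts`, `frequent_pairs` and the `min_support` threshold are names introduced for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stored data: the items bought together in four sales.
BASKETS = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

def pair_counts(baskets):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def frequent_pairs(baskets, min_support):
    """Keep only pairs found in at least the given fraction of baskets."""
    total = len(baskets)
    return {pair: n for pair, n in pair_counts(baskets).items()
            if n / total >= min_support}

print(frequent_pairs(BASKETS, min_support=0.5))
```

A pair that passes the threshold is only a candidate pattern; as noted above, it may still be coincidental and needs other evidence to support it.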