
UNIT I

BBA 208: Computer Application II (Web Technology, HTTP and HTML concepts)

Table of Contents
LAN, WAN and MAN
Difference between Internet, Intranet & Extranet
Server & Client
Web / World Wide Web (WWW)
Web browser
IP Address
Domain Name System (DNS), How DNS Works
URL
Difference between WWW and Internet
Web Caching
Proxy Server
Firewall
Web Portal
Cookies
Search Engine
Web Server
Internet Information Services (IIS)
Apache Web server
ISP (Internet Service Provider)
HTML Definition
Internet Protocols
Doing Business on Web
Making a Website plan
Forming a project team
Goals and objectives of website
Developing a right business strategy
Difference between Apache and IIS
Difference between IE and Netscape Navigator

6/26/2012

Web Technology

What is a Computer Network


A computer network is a group of computers that are connected to each other for the purpose of communication.

A computer network allows sharing of resources and information among devices connected to the network.


Types of Networking

LAN (Local Area Network)

A network used to interconnect computers in a single room, in rooms within a building, or in nearby buildings is called a Local Area Network (LAN). A LAN transmits data at a speed of several megabytes per second (10^6 bytes per second). The transmission medium is normally coaxial or twisted-pair cable. A LAN usually spans up to about 5 km and is generally a private network owned by a single organization. For example: office LAN, hospital LAN, campus-wide LAN, etc.

Major Characteristics of LAN

1. Each computer has the potential to communicate with any other computer of the network.
2. High degree of interconnection between computers.
3. Easy physical connection of computers in a network.
4. Inexpensive medium of data transmission.
5. High data transmission rate.

Advantages
1. The reliability of the network is high because the failure of one computer in the network does not affect the functioning of the other computers.
2. Adding a new computer to the network is easy.
3. A high rate of data transmission is possible.
4. Peripheral devices such as magnetic disks and printers can be shared by other computers.

Disadvantages
1. If the communication line fails, the entire network system breaks down.

Use of LAN
The major areas where a LAN is normally used are:
1. File transfer and access
2. Word and text processing
3. Electronic message handling
4. Remote database access
5. Personal computing
6. Digital voice transmission and storage
7. Basic computing skills

MAN (Metropolitan Area Network)

The term MAN describes a network of computers spanning a metropolitan city, usually with a range of 5-50 km. A company with multiple offices in various parts of a city generally uses this type of network. An example is the cellular (mobile phone) network.

Characteristics of a MAN
1. The network size falls between that of LANs and WANs. A MAN typically covers an area of between 5 and 50 km in diameter. Many MANs cover an area the size of a city, although in some cases MANs may be as small as a group of buildings or as large as the North of Scotland.
2. A MAN (like a WAN) is not generally owned by a single organisation. The MAN, its communications links and equipment are generally owned by either a consortium of users or by a single network provider who sells the service to the users. The level of service provided to each user must therefore be negotiated with the MAN operator, and some performance guarantees are normally specified.
3. A MAN often acts as a high-speed network to allow sharing of regional resources (similar to a large LAN). It is also frequently used to provide a shared connection to other networks using a link to a WAN.

Wide Area Network

The term Wide Area Network (WAN) describes a computer network spanning a regional, national or global area. For example, a large company might have its headquarters in Delhi and regional branches in Bombay, Madras, Bangalore and Calcutta. Here the regional centers are connected to the headquarters through a WAN.

The following are the major characteristics of a WAN.

1. Communication facility: For a big company spread over different parts of the country, employees can communicate over the WAN instead of making long-distance phone calls. Computer conferencing is another use of a WAN, where users communicate with each other through their computer systems.
2. Remote data entry: Remote data entry is possible in a WAN. Sitting at any location, you can enter data, update data and query information on any computer attached to the WAN, even if it is located in another city or country.
3. Centralized information: An organization spread over many cities can keep its important business data in a single place. As the data are generated in different cities, the WAN permits this data to be collected from the different sites and saved at a single site.

Internet
The Internet is a network that is open to anyone with access to an Internet Service Provider (ISP). By connecting to the Internet, a user has access to other networked computers all over the world. If a computer that is connected to the Internet is not secured using hardware or software security methods, data on that computer is potentially accessible to anyone on the Internet. The Internet uses the TCP/IP protocol suite to transmit data.

Intranet
An intranet, on the other hand, is a private network that is set up and controlled by an organization to encourage interaction among its members, to improve efficiency and to share information, among other things. Information and resources that are shared on an intranet might include organizational policies and procedures, announcements, information about new products, and confidential data of strategic value. A web page in an intranet may look and act just like any other web page on the Internet, but access is restricted to authorized persons and devices. The difference between an intranet and the Internet is defined in terms of accessibility, size and control.

Extranet
An extranet is an extended intranet. In addition to allowing access to members of an organization, an extranet uses firewalls, access profiles, and privacy protocols to allow access to users from outside the organization. In essence, an extranet is a private network that uses Internet protocols and public networks to securely share resources with customers, suppliers, vendors, partners, or other businesses. Both intranets and extranets are owned, operated and controlled by one organization. However, the difference between intranets and extranets is defined in terms of who has access to the private network and the geographical reach of that network. Intranets allow only members of the organization to access the network, while an extranet allows persons from outside the organization (i.e. business partners and customers) to access the network.


Internet
The Internet is a global system of interconnected computer networks that use the standard Internet Protocol Suite (TCP/IP: Transmission Control Protocol / Internet Protocol) to serve billions of users worldwide.

It is a network of networks that consists of millions of private and public, academic, business, and government networks of local to global scope that are linked by a broad array of electronic and optical networking technologies. Each computer on the Internet is called a host computer or host. The Internet carries a vast array of information resources and services, for example the inter-linked hypertext documents of the World Wide Web (WWW) and the infrastructure to support electronic mail.

Servers & Clients


Many of the hosts on the Internet offer services to other computers on the Internet. Computers that provide services for other computers are called servers. The software run by server computers to provide these services is called server software.

A client is an application or system that accesses a remote service on another computer system (the server) by way of a network.

Types of servers and clients:
i) Mail servers: handle incoming and outgoing e-mail. Mail clients get incoming messages from, and send outgoing messages to, a mail server, and let the user read, write, save and print messages.
ii) Web servers: store web pages and transmit them in response to requests from web clients, which are usually called browsers. Any computer can be turned into a web server by installing server software such as IIS or Apache.
iii) FTP servers: store files that can be transferred to or from a computer that has an FTP client. FTP servers are used to transfer files between computers.
iv) Proxy servers: a proxy server connects to different web servers and requests services on behalf of its clients. It may optionally alter the client's request or the server's response.

Web / WWW
World wide web (WWW) or Web in short, is a system of Internet servers that support specially formatted documents. The documents are formatted in a markup language called HTML (HyperText Markup Language) that supports links to other documents, as well as graphics, audio, and video files. This means one can jump from one document to another simply by clicking on hot spots.

It is a way of accessing information over the medium of the Internet. It is an information-sharing model that is built on top of the Internet. The Web uses the Hyper Text Transfer Protocol (HTTP) to transmit data.


Web Browser

A web browser is a software application used to locate, retrieve and display content on the World Wide Web, including Web pages, images, video and other files. In the client/server model, the browser is the client: it runs on the user's computer, contacts the Web server and requests information. The Web server sends the information back to the browser, which displays the results on the computer or other Internet-enabled device that supports a browser. Browsers offer plug-ins that extend their capabilities, so that a browser can display multimedia information (including sound and video), perform tasks such as videoconferencing or web page design, or gain anti-phishing filters and other security features.

Examples of Web browsers: Netscape, Firefox, etc.

Internet Protocol (IP) Address

The Internet is a global network of computers, and each computer connected to the Internet must have a unique address.
Internet addresses are in the form nnn.nnn.nnn.nnn, where each nnn must be a number from 0 to 255. This address is known as an IP address (where IP stands for Internet Protocol).

[Figure: Computer A and Computer B]

Internet Protocol (IP) Address

IP addresses identify the host computers, so that packets of information (chunks of data to be transmitted) reach the correct computer. IP addresses are 32-bit numbers normally expressed as four "octets" in "dotted decimal" notation, e.g. 70.42.251.42. The four numbers in an IP address are called octets because each is 8 bits long and can therefore take values between 0 and 255 (2^8 = 256 possibilities per octet).
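The relationship between the four octets and the single 32-bit number can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names octets_of and to_32bit are made up for this example, and the result is cross-checked against Python's standard ipaddress module.

```python
import ipaddress

def octets_of(dotted: str) -> list[int]:
    """Split a dotted-decimal IPv4 address into its four octets."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    return parts

def to_32bit(dotted: str) -> int:
    """Pack the four octets into one 32-bit number (first octet is most significant)."""
    value = 0
    for octet in octets_of(dotted):
        value = (value << 8) | octet   # shift left by 8 bits, then add the next octet
    return value

address = "70.42.251.42"
print(octets_of(address))
print(to_32bit(address))
# The standard library should agree with the hand-rolled conversion.
print(to_32bit(address) == int(ipaddress.IPv4Address(address)))
```

Each octet contributes 8 bits, which is why four of them make up exactly 32 bits.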


Internet Protocol (IP) Address

When a computer is configured to use the same IP address each time it powers up, this is known as a static IP address. In contrast, when the computer's IP address is assigned automatically, it is known as a dynamic IP address. If a computer is connected to the Internet using a dial-up account, the Internet Service Provider (ISP) assigns the computer an IP address each time it connects. If a high-speed DSL (Digital Subscriber Line) or cable Internet account is used, the ISP may use a static (unchanging) IP address or may assign an address each time the computer connects.

Domain Name System


Every computer that hosts data on the Internet has a unique numerical address called the IP (Internet Protocol) address. So that people don't have to remember strings of numbers, host computers also have names, called domain names. A domain name is a unique, case-insensitive name consisting of a string of alphanumeric characters and dashes separated by periods, which the Domain Name System maps to IP numbers and other information.

Examples of domain names: google.com, hindustantimes.com

Domain Name System


A domain name usually consists of two or more parts (technically, labels), which are conventionally written separated by dots, such as google.com. DNS is an abbreviation for Domain Name System (or Service, or Server). It is an Internet service that translates domain names into IP addresses. The ISP (Internet Service Provider) provides a DNS server to handle domain name translations. For example, the machine that we refer to as "www.google.com" has the IP address 74.125.45.100. Every time a domain name is used, the Internet's domain name servers translate the human-readable domain name into the machine-readable IP address.

Domain Name System

DNS names are organized into a tree.

[Figure: DNS tree]
DNS root
Top-level domains: com, net, us, jp
Second-level domains: example.net (under net), ny (under us)
Third-level domains: www.example.net
Example of a deeper name under ny: www.trumansburg.ny.us

Domain Name System


The tree can extend to any number of levels, but in practice it is rarely more than four or five levels deep. All names start at the root, above the set of top-level domains. The root is considered to be at the right end of the name. Below the top-level domains in the domain name hierarchy are the second-level domain (SLD) names. These are the names directly to the left of .com, .net, and the other top-level domains. As an example, in the domain en.wikipedia.org, wikipedia is the second-level domain. Next are third-level domains, which are written immediately to the left of a second-level domain. There can be fourth- and fifth-level domains, and so on.
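Reading a name from right to left, level by level, can be sketched as follows. The function name domain_levels is hypothetical and exists only for this illustration.

```python
def domain_levels(name: str) -> list[str]:
    """Return the labels of a domain name ordered from the top level down."""
    # Domain names are case-insensitive; a trailing dot stands for the root.
    labels = name.lower().rstrip(".").split(".")
    # The root is at the right end of the name, so reverse the labels.
    return list(reversed(labels))

for level, label in enumerate(domain_levels("en.wikipedia.org"), start=1):
    print(level, label)
```

For en.wikipedia.org this lists org as level 1 (top-level domain), wikipedia as level 2 and en as level 3.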

Domain Name System


The top-level domains (TLDs) are divided into three major categories:
Generic
Country
Specialized

Generic top-level domains (gTLDs): generic domains permit anyone from any part of the world to register. Some of these are:
com: originally for commercial organizations, but now used by individuals, government agencies and non-profits as well.
net: Internet service providers and other network-related companies.
org: non-commercial organizations.
biz: an alternative to com that opened in 2001. In theory it is for businesses, but in practice anyone can register.
info: this TLD opened in 2001. Anyone can register.
name: for people's names, e.g. john.smith.name. It opened in early 2002.

Country-code top-level domains (ccTLDs): these are two-letter domains, each reserved for a country. Some of the ccTLDs are:
us: U.S.
ca: Canada
in: India
uk: United Kingdom
fr: France

Specialized domains: the DNS has always had a few domains that are restricted to particular kinds of organization. An organization that does not qualify cannot register a name in them. Examples:
edu: normally for four-year degree colleges and universities.
gov: originally for any kind of government in the U.S. New registrations are limited to the federal government.
mil: for the U.S. military.
int: for international treaty organizations like the Red Cross.

How DNS works


DNS implements a distributed database to store the name and address information for all public hosts on the Internet. The DNS database resides on a hierarchy of special database servers. The following diagram explains the steps involved when clients like Web browsers issue requests for Internet host names.
[Figure: An application (HTTP) passes the hostname neon.tcpip-lab.edu to the resolver. The resolver sends the hostname to the name server, which returns the IP address 128.143.71.21. The resolver passes the IP address back to the application.]

1. An application program on a host accesses the domain system through a DNS client, called the resolver.
2. The resolver contacts a DNS server, called the name server.

How DNS works

3. The DNS server returns the IP address to the resolver, which passes the IP address to the application.

4. If the DNS server does not contain the needed mapping, it will in turn forward the request to a different DNS server at the next higher level in the hierarchy.
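The four steps can be sketched as a toy simulation. This is a simplified model under stated assumptions, not how real DNS software is implemented: the NameServer class, the resolve function and the printer.office.example entry are invented for illustration, while neon.tcpip-lab.edu and 128.143.71.21 come from the diagram.

```python
from typing import Optional

class NameServer:
    """A toy name server: a local table plus an optional higher-level server."""

    def __init__(self, table: dict[str, str], parent: Optional["NameServer"] = None):
        self.table = table
        self.parent = parent

    def lookup(self, hostname: str) -> Optional[str]:
        if hostname in self.table:
            return self.table[hostname]          # step 3: mapping found, return it
        if self.parent is not None:
            return self.parent.lookup(hostname)  # step 4: forward to the next higher level
        return None                              # nobody in the hierarchy knows the name

def resolve(hostname: str, server: NameServer) -> Optional[str]:
    """The resolver: the DNS client that applications call (steps 1 and 2)."""
    return server.lookup(hostname.lower())

higher = NameServer({"neon.tcpip-lab.edu": "128.143.71.21"})
local = NameServer({"printer.office.example": "192.0.2.10"}, parent=higher)

print(resolve("printer.office.example", local))  # answered by the local server
print(resolve("neon.tcpip-lab.edu", local))      # forwarded to the higher server
```

The application only ever talks to the resolver; whether the answer came from the local server or from one higher up is hidden from it.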


URL
A URL (Uniform Resource Locator) is the unique address of a file that is accessible on the Internet. A URL is a URI (Uniform Resource Identifier) that, in addition to identifying a resource, provides a means of acting upon or obtaining a representation of the resource by describing its primary access mechanism or network "location". To find and view a file, we enter its URL in the web browser's address line. Such a file can be a web page, an image file or a program such as a Java applet.

URL
A URL consists of three parts:
How: the first part is the protocol identifier; it indicates which protocol to use to access the resource.
Where: the second part is the domain name; it identifies the specific computer on the Internet.
What: the third part specifies the complete path to the file, including the name of the file being requested by the client.

URL
E.g. a URL for a particular image on a website might be:

http://www.ietf.org/rfc/rfc4534.jpg
Protocol: http
Domain name: www.ietf.org
Path name: /rfc/rfc4534.jpg

URL
Format of URL:
scheme://host.domain:port/path/filename
scheme - defines the type of Internet service. The most common types are HTTP and FTP
host - defines the domain host (by convention, the host for http is usually www)
domain - defines the Internet domain name, like w3schools.com
port - defines the port number at the host (the default port number for http is 80)
path - defines a path at the server (if omitted, the document must be stored in the root directory of the web site)
filename - defines the name of a document/resource
Example: http://www.abc.com:8080/index.html
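As a quick check of the format, Python's standard urllib.parse module can split the example URL into these components. This is a small sketch; note that the library reports host and domain together as one hostname.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.abc.com:8080/index.html")

print(parts.scheme)    # http
print(parts.hostname)  # www.abc.com
print(parts.port)      # 8080
print(parts.path)      # /index.html

# When the port is left out, the parser reports None and the client
# falls back to the default port of the scheme (80 for http).
print(urlsplit("http://www.ietf.org/rfc/rfc4534.jpg").port)  # None
```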


Difference between WWW & Internet


The Internet is a massive network of networks. It connects millions of computers together globally, forming a network in which any computer can communicate with any other computer as long as they are both connected to the Internet. Information travels over the Internet from computer to computer via protocols. The World Wide Web, or simply the Web, is a way of accessing information over the medium of the Internet. It is an information-sharing model that is built on top of the Internet. The Web uses the HTTP protocol, only one of the languages spoken over the Internet, to transmit data. Web services, which use HTTP to allow applications to communicate in order to exchange business logic, use the Web to share information.

Difference between WWW & Internet


The Web also utilizes browsers, such as Internet Explorer or Firefox, to access Web documents called Web pages that are linked to each other via hyperlinks. Web documents also contain graphics, sounds, text and video. The Web is just one of the ways that information can be spread over the Internet. The Internet is also used for the following services:
E-mail, which uses the Simple Mail Transfer Protocol (SMTP)
File transfer, which uses the File Transfer Protocol (FTP)
Usenet, which uses the Network News Transfer Protocol (NNTP)
The WWW is therefore just a portion of the Internet.

Web Caching
Web caching is the temporary storage of web objects (such as HTML documents) for later retrieval. A Web cache sits between Web servers and a client and watches requests come by, saving copies of the responses like HTML pages, images and files (collectively known as representations) for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again.
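The idea can be sketched with a dictionary keyed by URL. The WebCache class and the origin function below are hypothetical stand-ins written for this example; a real cache would also expire entries and respect HTTP caching headers, which this sketch leaves out.

```python
class WebCache:
    """Toy web cache: keeps a copy of each response, keyed by URL."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1                      # served from the saved copy
            return self.store[url]
        self.misses += 1                        # first request: ask the origin server
        response = self.fetch_from_origin(url)
        self.store[url] = response
        return response

def origin(url):
    """Stand-in for the origin server (no real network access)."""
    return f"<html>content of {url}</html>"

cache = WebCache(origin)
cache.get("http://www.example.com/")   # miss: fetched from the origin
cache.get("http://www.example.com/")   # hit: answered from the cache
print(cache.hits, cache.misses)
```

The second request never reaches the origin server.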


Web Caching
There are three main reasons that Web caches are used:
To reduce latency: because the request is satisfied from the cache (which is closer to the client) instead of the origin server, it takes less time to get the document and display it.
To reduce network traffic (reduced bandwidth consumption): because documents are reused, the amount of bandwidth used by a client is reduced. This saves money if the client is paying for traffic, and keeps their bandwidth requirements lower and more manageable.
To reduce server load: because the request is satisfied from the cache, there are fewer requests for a server to handle.
Caching makes the Web less expensive and better performing.

Web Caching
Kinds of Web Caches

Browser cache
Proxy cache
Gateway cache

Browser cache
The browser cache works at the browser level of the individual client. The browser stores frequently visited web pages inside cache memory allocated to the browser, so that the next time the user visits the same page, the browser can present it from the cache. This cache is especially useful when users hit the back button or click a link to see a page they've just looked at. Also, if you use the same navigation images throughout your site, they'll be served from browsers' caches almost instantaneously.

The preferences dialog of any modern Web browser (like Internet Explorer, Safari or Mozilla) includes cache settings.

Proxy cache
Web proxy caches work on the same principle, but a much larger scale. Proxies serve hundreds or thousands of users in the same way. Large corporations and ISPs often set them up on their firewalls, or as standalone devices (also known as intermediaries).

Proxy caches are a type of shared cache; rather than just having one person using them, they usually have a large number of users, and because of this they are very good at reducing latency and network traffic.

Gateway cache

Also known as reverse proxy caches or surrogate caches, gateway caches are also intermediaries, but instead of being deployed by network administrators to save bandwidth, they're deployed by Webmasters themselves, to make their sites more scalable, reliable and better performing.

Proxy Server
Proxy server is a server that sits between a client application, such as a Web browser, and a real server to ensure security, administrative control, and caching service. A proxy server is associated with or part of a gateway server that separates the enterprise network from the outside network and a firewall server that protects the enterprise network from outside intrusion.


Proxy Server

Schematic representation of a proxy server, where the computer in the middle acts as the proxy server between the other two.

Proxy servers have two main purposes:


Improve Performance: Proxy servers can improve performance for groups of users, because a proxy server saves the results of all requests for a certain amount of time. Consider the case where both user X and user Y access the World Wide Web through a proxy server. First, user X requests a certain Web page, say Page 1. Some time later, user Y requests the same page. Instead of forwarding the request to the Web server where Page 1 resides, which can be a time-consuming operation, the proxy server simply returns the Page 1 that it already fetched for user X. Since the proxy server is often on the same network as the user, this is a much faster operation.


Filter Requests: Proxy servers can also be used to filter requests. For example, a company might use a proxy server to prevent its employees from accessing a specific set of Web sites.


Proxy Server
A proxy server receives a request for an Internet service (such as a Web page request) from a user. If it passes filtering requirements, the proxy server, assuming it is also a cache server , looks in its local cache of previously downloaded Web pages. If it finds the page, it returns it to the user without needing to forward the request to the Internet. If the page is not in the cache, the proxy server, acting as a client on behalf of the user, uses one of its own IP addresses to request the page from the server out on the Internet. When the page is returned, the proxy server relates it to the original request and forwards it on to the user. To the user, the proxy server is invisible; all Internet requests and returned responses appear to be directly with the addressed Internet server.
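The request flow described above (filter first, then look in the cache, then forward) can be sketched as a toy model. The ProxyServer class, the fake_origin function and the host name games.example are assumptions made for this illustration, and no real network traffic is involved.

```python
from urllib.parse import urlsplit

class ProxyServer:
    """Toy proxy: filter the request, then answer from cache or forward it."""

    def __init__(self, fetch, blocked_hosts):
        self.fetch = fetch                      # function that contacts the real server
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}
        self.forwarded = 0                      # how many requests reached the Internet

    def handle(self, url):
        host = urlsplit(url).hostname
        if host in self.blocked_hosts:          # filtering requirement not met
            return "BLOCKED"
        if url not in self.cache:               # not cached: act as a client for the user
            self.forwarded += 1
            self.cache[url] = self.fetch(url)
        return self.cache[url]                  # relay the page to the user

def fake_origin(url):
    """Stand-in for the server out on the Internet."""
    return f"page at {url}"

proxy = ProxyServer(fake_origin, blocked_hosts=["games.example"])
proxy.handle("http://www.example.com/page1")    # user X: forwarded to the server
proxy.handle("http://www.example.com/page1")    # user Y: served from the cache
print(proxy.forwarded)
print(proxy.handle("http://games.example/play"))
```

Only the first request for Page 1 is forwarded; the blocked site is never contacted at all.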

Firewall
A firewall is a part of a computer system or network that is designed to block unauthorized access while permitting authorized communications. It is a device or set of devices configured to permit, deny, encrypt, decrypt, or proxy all (in and out) computer traffic between different security domains based upon a set of rules and other criteria. Firewalls can be implemented in either hardware or software, or a combination of both. Firewalls are frequently used to prevent unauthorized Internet users from accessing private networks connected to the Internet, especially intranets. All messages entering or leaving the intranet pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria.
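Rule-based filtering can be sketched as a list of rules checked in order, with the first matching rule deciding the outcome. This is a deliberately simplified model: the Rule fields, the decide function and the sample addresses are assumptions for illustration, and real firewalls inspect many more properties of the traffic.

```python
from dataclasses import dataclass
from typing import Optional
import ipaddress

@dataclass(frozen=True)
class Rule:
    action: str            # "permit" or "deny"
    source: str            # source network in CIDR notation, e.g. "10.0.0.0/8"
    port: Optional[int]    # destination port, or None to match any port

def decide(rules, source_ip, port, default="deny"):
    """Return the action of the first rule that matches, else the default."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.source)
        port_matches = rule.port is None or rule.port == port
        if in_network and port_matches:
            return rule.action
    return default

rules = [
    Rule("permit", "10.0.0.0/8", None),   # internal hosts may use any port
    Rule("permit", "0.0.0.0/0", 80),      # anyone may reach the web server port
]

print(decide(rules, "10.1.2.3", 22))      # internal host
print(decide(rules, "203.0.113.5", 80))   # outside host, web port
print(decide(rules, "203.0.113.5", 22))   # outside host, other port
```

Traffic that matches no rule falls through to the default, which here is to deny it.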

Web Portal
A portal is a web site that offers a broad array of resources and services, such as e-mail, forums, search engines and on-line shopping malls. Portal is a term for a World Wide Web site that aims to be a major starting site for users when they get connected to the Web, or that users tend to visit as an anchor site. Some major general portals include Yahoo, Excite and Netscape.

Cookies
A cookie is a piece of text that a Web server can store on a user's hard disk. Cookies allow a Web site to store information on a user's machine and later retrieve it. The pieces of information are stored as name-value pairs. For example, a Web site might generate a unique ID number for each visitor and store the ID number on each user's machine using a cookie file. The main purpose of a cookie is to identify users and possibly prepare customized Web pages for them.


Cookies
When we enter a Web site that uses cookies, we may be asked to fill out a form providing personal information such as name, e-mail address and interests. This information is packaged into a cookie and sent to the Web browser, which then stores the information for later use. The next time we go to the same Web site, the browser will send the cookie to the Web server. The server can use this information to present us with custom Web pages. So, for example, instead of seeing just a generic welcome page we might see a welcome page with our name on it.

Types of Cookies
Session cookie: also called a transient cookie, a cookie that is erased when you close the Web browser. The session cookie is stored in temporary memory and is not retained after the browser is closed. Session cookies do not collect information from our computer. They typically store information in the form of a session identification that does not personally identify the user.
Persistent cookie: also called a permanent cookie or a stored cookie, a cookie that is stored on our hard drive until it expires (persistent cookies are set with expiration dates) or until we delete the cookie. Persistent cookies are used to collect identifying information about the user, such as Web surfing behavior or user preferences for a specific Web site.
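The name-value pairs and the difference between the two cookie types can be sketched with Python's standard http.cookies module. The cookie names visitor_id and language, their values and the helper is_persistent are made up for this example.

```python
from http.cookies import SimpleCookie

# Server side: build the cookies to send to the browser.
cookie = SimpleCookie()
cookie["visitor_id"] = "12345"                   # no expiry set: a session cookie
cookie["language"] = "en"
cookie["language"]["max-age"] = 30 * 24 * 3600   # lifetime in seconds: a persistent cookie

def is_persistent(morsel):
    """A cookie with an expiry or maximum age outlives the browser session."""
    return bool(morsel["max-age"] or morsel["expires"])

for name, morsel in cookie.items():
    kind = "persistent" if is_persistent(morsel) else "session"
    print(name, morsel.value, kind)

# Next visit: the browser sends the stored name-value pairs back,
# and the server parses them again.
returned = SimpleCookie("visitor_id=12345; language=en")
print(returned["visitor_id"].value)
```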


Search Engine
A search engine is a program that searches documents for specified keywords and returns a list of documents where the keywords were found.

A web search engine is designed to search for information on the World Wide Web. The search results are usually presented in a list of results and are commonly called hits. The information may consist of web pages, images, information and other types of files.


Search Engine
A search engine is a key to finding specific information on the vast expanse of WWW. Without search engines it would be virtually impossible to locate anything on the Web without knowing the specific URL.

Examples: Google, Yahoo, MSN, etc.

Types of Search Engines:


There are three types of search engines:
1. Spider-based (crawler-based) search engines
2. Human-powered directories
3. Hybrid search engines

Crawler-based search engines:


Spider-based search engines use automated software programs to survey and categorize web pages. The programs used by the search engines to access your web pages are called spiders, crawlers, robots or bots.
A spider visits a web site, reads the information on the actual site, reads the site's meta tags and also follows the links that the site connects to, indexing all the linked web sites as well. The spider returns all that information to a central repository, where the data is indexed.
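The visit-read-follow cycle can be sketched over a small in-memory "web" instead of real sites. TOY_WEB, its addresses and the crawl function are invented for this illustration; a real spider would download pages over HTTP and parse their HTML.

```python
from collections import deque

# Toy web: each address maps to (page text, list of linked addresses).
TOY_WEB = {
    "a.example/": ("home page about networks", ["a.example/lan", "b.example/"]),
    "a.example/lan": ("local area network notes", ["a.example/"]),
    "b.example/": ("wide area network notes", []),
}

def crawl(start, web):
    """Visit pages breadth-first, following links, and collect their text."""
    repository = {}                 # the central repository of visited pages
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in repository or url not in web:
            continue                # already visited, or a dead link
        text, links = web[url]
        repository[url] = text      # read the information on the page
        queue.extend(links)         # follow the links the page connects to
    return repository

for url, text in crawl("a.example/", TOY_WEB).items():
    print(url, "->", text)
```

Keeping track of visited pages stops the spider from looping forever when pages link back to each other.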

Crawler-based search engines:

The spider periodically returns to the sites to check for any information that has changed. The frequency with which this happens is determined by the administrators of the search engine. Examples of crawler-based search engines are:
Google (www.google.com)
Ask Jeeves (www.ask.com)

Human Powered Directories


A directory uses human editors who decide which category a site belongs to; they place websites within specific categories in the directory's database. The human editors comprehensively check the website and rank it, based on the information they find, using a pre-defined set of rules. Examples:
Open Directory (www.dmoz.org)
Yahoo Directory (dir.yahoo.com)

Hybrid Search Engines


Hybrid search engines use a combination of both spider-based results and directory results. More and more search engines these days are moving to a hybrid-based model. Examples of hybrid search engines are:
MSN Search

Working of a Basic search engine


A search engine operates in the following order:
1. Web spider
2. Indexing
3. Searching

Web spider
A Web spider is a computer program that browses the World Wide Web in a methodical, automated manner. This process is called Web crawling or spidering. Other terms for Web spiders are ants, automatic indexers, bots, Web crawlers and Web robots.

To find information on the hundreds of millions of Web pages that exist, a search engine employs spiders to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling.

"Spiders" take a Web page's content and create key search words that enable online users to find pages they're looking for.

1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book: it tells which pages contain the words that match the query.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.

Indexing
Once the spiders have completed the task of finding information on Web pages, the contents of each page are analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. The purpose of an index is to allow information to be found as quickly as possible.
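
A common way to build such an index is an inverted index, which maps each word to the set of pages containing it. The sketch below is only an illustration: the three sample documents and the function names tokenize and build_index are assumptions, not part of any real search engine.

```python
import re
from collections import defaultdict

# Hypothetical page texts keyed by a document number (assumption).
DOCS = {
    1: "Web servers deliver web pages",
    2: "Search engines index web pages",
    3: "Spiders crawl the web",
}


def tokenize(text):
    """Split text into lower-case words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return a mapping: word -> set of document numbers containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


index = build_index(DOCS)
print(sorted(index["pages"]))
```

Looking up a word is then a single dictionary access instead of a scan through every page, which is why queries can be answered quickly.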


Searching
Searching through an index involves a user building a query and submitting it through the search engine. The query can be quite simple, a single word at minimum. Building a more complex query requires the use of Boolean operators that allow you to refine and extend the terms of the search.


Searching Techniques
With a search engine, keywords related to a topic are typed into a search "box." The search engine scans its database and returns a file with links to websites containing the word or words specified. Because these databases are very large, search engines often return thousands of results. To use search engines effectively, it is essential to apply techniques that narrow results and push the most relevant pages to the top of the results list. Below are a number of strategies for boosting search engine performance.


Searching Techniques
IDENTIFY KEYWORDS

When conducting a search, break down the topic into key concepts. For example, to find information on what Vodafone has said about the wireless communications industry, the keywords might be:
Vodafone wireless communication

Searching Techniques
BOOLEAN AND
Connecting search terms with AND tells the search engine to retrieve web pages containing ALL the keywords.
FCC and wireless and communication
The search engine will only return pages where the words FCC, wireless, and communication all appear somewhere on the page. Thus, AND helps to narrow search results, as it limits results to pages where all the keywords appear.


Searching Techniques
BOOLEAN OR
Linking search terms with OR tells the search engine to retrieve web pages containing ANY of the keywords.
(FCC or wireless or communication)
When OR is used, the search engine returns pages with a single keyword, several keywords, or all keywords. Thus, OR expands search results. We should use OR when there are common synonyms for a keyword. OR statements should be surrounded with parentheses for best results. To narrow results as much as possible, we should combine OR statements with AND statements.

For example, the following search statement locates information on purchasing a used car:
(car or automobile or vehicle) and (buy or purchase) and used

Searching Techniques
BOOLEAN AND NOT
AND NOT tells the search engine to retrieve web pages containing one keyword but not the other.
dolphins and not Miami
The above example instructs the search engine to return web pages about dolphins but not web pages about the "Miami Dolphins" football team. We should use AND NOT when we have a keyword that has multiple meanings.
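
The three Boolean operators correspond closely to set operations on an index: AND is intersection, OR is union, and AND NOT is difference. The following is a minimal sketch under that assumption; the INDEX contents (page numbers per word) are invented for illustration.

```python
# Hypothetical index: word -> set of page numbers containing it (assumption).
INDEX = {
    "fcc": {1, 2},
    "wireless": {1, 3},
    "communication": {1, 2, 3},
    "dolphins": {4, 5},
    "miami": {5},
}


def docs_for(word):
    """Pages containing the word; an unknown word matches no pages."""
    return INDEX.get(word.lower(), set())


def search_and(*words):
    """Pages containing ALL the words (set intersection)."""
    if not words:
        return set()
    result = set(docs_for(words[0]))
    for word in words[1:]:
        result &= docs_for(word)
    return result


def search_or(*words):
    """Pages containing ANY of the words (set union)."""
    result = set()
    for word in words:
        result |= docs_for(word)
    return result


def search_and_not(include, exclude):
    """Pages containing one word but not the other (set difference)."""
    return docs_for(include) - docs_for(exclude)


print(search_and("FCC", "wireless", "communication"))
print(search_or("FCC", "wireless", "communication"))
print(search_and_not("dolphins", "Miami"))
```

With this sample data, AND returns fewer pages than any single keyword and OR returns more, which mirrors the narrowing and expanding behaviour described above.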


Searching Techniques
IMPLIED BOOLEAN: PLUS & MINUS

In many search engines, the plus and minus symbols can be used as alternatives to full Boolean AND and AND NOT. The plus sign (+) is the equivalent of AND, and the minus sign (-) is the equivalent of AND NOT. There is no space between the plus or minus sign and the keyword.


Searching Techniques
PHRASE SEARCHING
Surrounding a group of words with double quotes tells the search engine to only retrieve documents in which those words appear side-by-side. Phrase searching is a powerful search technique for significantly narrowing search results, and it should be used as often as possible.
"John F. Kennedy"
"Walt Disney World"
"global warming"


Searching Techniques
For best results, combine phrase searching with implied Boolean (+/-) or full Boolean (AND, OR, and AND NOT) logic.

+"heart disease" +cause
"heart disease" and cause
The above examples tell the search engine to retrieve pages where the words heart disease appear side-by-side and the word cause appears somewhere else on the page.
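
The difference between a phrase match and a plain keyword match can be sketched as follows. This is an illustrative assumption about how matching might be done (words compared in order, ignoring case and punctuation); the function names are made up for the example.

```python
import re


def tokenize(text):
    """Split text into lower-case words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(text, phrase):
    """True if the words of the phrase appear side-by-side, in order."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )


def matches(text, phrase, keyword):
    """Phrase search combined with AND: phrase present and keyword present."""
    return contains_phrase(text, phrase) and keyword.lower() in tokenize(text)


page_a = "Smoking is a major cause of heart disease."
page_b = "A disease of the heart may have more than one cause."

print(matches(page_a, "heart disease", "cause"))
print(matches(page_b, "heart disease", "cause"))
```

Both sample pages contain all three words, but only the first contains heart and disease next to each other, so only the first satisfies the phrase search.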


Searching Techniques
PLURAL FORMS, CAPITAL LETTERS, AND ALTERNATE SPELLINGS

Most search engines interpret lower case letters as either upper or lower case. Thus, if we want both upper and lower case occurrences returned, we should type the keywords in all lower case letters. However, if we want to limit results to initial capital letters (e.g., "George Washington") or all upper case letters, we must type keywords that way.


Searching Techniques
As with capitalization, most search engines interpret singular keywords as either singular or plural. If we want plural forms only, we should make the keywords plural. A few search engines support truncation or wildcard features that allow variations in spelling or word forms. The asterisk (*) symbol tells the search engine to return alternate endings for a word from the point where the asterisk appears. For example, capital* returns web pages with capital, capitals, capitalize, and capitalization.
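
Truncation with a trailing asterisk can be thought of as a prefix match against the words in the index. The sketch below assumes exactly that; the vocabulary list and the function name expand_truncation are hypothetical.

```python
def expand_truncation(pattern, vocabulary):
    """Return the vocabulary words matched by a keyword with an optional trailing *."""
    if pattern.endswith("*"):
        stem = pattern[:-1].lower()
        return sorted(w for w in vocabulary if w.lower().startswith(stem))
    # No asterisk: only the exact word matches.
    return sorted(w for w in vocabulary if w.lower() == pattern.lower())


# Hypothetical list of indexed words (assumption for illustration).
VOCABULARY = ["capital", "capitals", "capitalize", "capitalization", "capitol", "cap"]

print(expand_truncation("capital*", VOCABULARY))
print(expand_truncation("capital", VOCABULARY))
```

Note that capitol and cap are not matched by capital*, because the match starts from the letters before the asterisk.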


Searching Techniques
TITLE SEARCH

Field searching is one of the most effective techniques for narrowing results and getting the most relevant websites listed at the top of the results page. A web page is composed of a number of fields, such as title, domain, host, URL, and link. Searching effectiveness increases as we combine field searches with phrase searches and Boolean logic. For example, to find information about George Washington and his wife Martha, we could try the following search:


Searching Techniques
+title:"George Washington" +President +Martha
title:"George Washington" and President and Martha
The above TITLE SEARCH examples instruct the search engine to return web pages where the phrase George Washington appears in the title and the words President and Martha appear somewhere on the page. Like plus and minus, there is no space between the colon (:) and the keyword.
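
Field searching can be pictured as checking a condition against one named part of each page instead of the whole page. The sketch below is an assumption-laden illustration: the PAGES records, their field names (url, title, body) and the functions title_search and domain_search are invented for this example and do not describe any real engine's syntax handling.

```python
import re
from urllib.parse import urlparse

# Hypothetical indexed pages, each split into fields (assumption).
PAGES = [
    {"url": "http://history.example.edu/washington",
     "title": "George Washington: A Biography",
     "body": "The first President married Martha in 1759."},
    {"url": "http://travel.example.com/dc",
     "title": "Visiting Washington",
     "body": "George Washington University is near the President's house."},
    {"url": "http://royal.example.uk/queen",
     "title": "Queen Elizabeth",
     "body": "A short history of the monarchy."},
]


def words(text):
    """Split text into lower-case words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_phrase(text, phrase):
    """True if the words of the phrase appear side-by-side in the text."""
    tokens, target = words(text), words(phrase)
    n = len(target)
    return n > 0 and any(
        tokens[i:i + n] == target for i in range(len(tokens) - n + 1)
    )


def title_search(pages, phrase, required=()):
    """Phrase must be in the title; every required word must be in the body."""
    hits = []
    for page in pages:
        body_words = set(words(page["body"]))
        if has_phrase(page["title"], phrase) and all(
            w.lower() in body_words for w in required
        ):
            hits.append(page["url"])
    return hits


def domain_search(pages, domain):
    """Keep pages whose host name ends with the given domain, e.g. 'uk'."""
    suffix = "." + domain.lower()
    return [
        p["url"] for p in pages
        if (urlparse(p["url"]).hostname or "").endswith(suffix)
    ]


print(title_search(PAGES, "George Washington", ["President", "Martha"]))
print(domain_search(PAGES, "uk"))
```

The second sample page mentions George Washington only in its body, so a title search skips it even though a plain keyword search would return it.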

Searching Techniques
DOMAIN SEARCH
The DOMAIN SEARCH allows you to limit results to certain domains, such as websites from the United Kingdom (.uk), educational institutions (.edu), or government sites (.gov).

+domain:uk +title:"Queen Elizabeth"
domain:uk and title:"Queen Elizabeth"

+domain:edu +"lung cancer" +smok*
domain:edu and "lung cancer" and smok*


Searching Techniques
HOST SEARCH

The HOST SEARCH comes in handy when you need to find something located at a large site that does not have an internal search engine. With this search technique, you can search all the pages at a website (contained in the engine's database) for keywords or phrases of interest.


Searching Techniques
+host:www.disney.com +"special offer"
host:www.disney.com and "special offer"


Searching Techniques
URL SEARCH
The URL SEARCH limits search results to web pages where the keyword appears in the URL or website address. A URL search can narrow very broad results to web pages devoted to the keyword topic.
+url:halloween +title:stories
url:halloween and title:stories


Searching Techniques
LINK SEARCH
Use the LINK SEARCH when you want to know which websites link to a particular site of interest. For example, if you have a home page and you are wondering whether anyone has put a link to your page on their website, use the LINK SEARCH. Researchers use link searches for backward citation searching.

link:www.pepsi.com
link:www.ipl.org/ref/


Web Server
A web server is a computer running application software that listens for and responds to a client computer's request made through a web browser. It hosts web pages and other web documents. Web servers provide web documents and other services using the Hypertext Transfer Protocol (HTTP). Any computer can be turned into a web server by installing server software and connecting the machine to the Internet. Web server software often comes as a large package of Internet and intranet-related programs for serving e-mail, handling File Transfer Protocol (FTP) download requests, and building and publishing web pages.


Web Server
Two leading web servers are Apache and Microsoft's Internet Information Services (IIS). The primary function of a web server is to deliver web pages to clients. This includes delivery of HTML documents and any additional content that a document may include, such as images, style sheets and JavaScript files. A client, commonly a web browser or a web crawler, initiates communication by making a request for a specific resource (e.g. a file) using HTTP, and the server responds with the content of that resource, or with an error message if it is unable to do so.


Web Server
While the primary function is to serve content, a full implementation of HTTP also includes a way of receiving content from clients. This feature is used for submitting web forms, including uploading of files.


Common features of Web servers

Virtual hosting: serves many web sites using one IP address.
Large file support: serves files larger than 2 GB on a 32-bit OS.
Bandwidth throttling: limits the speed of responses so as not to saturate the network and to be able to serve more clients.
Server-side scripting: generates dynamic web pages.
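
Name-based virtual hosting usually relies on the Host header that the browser sends with each HTTP request. The sketch below shows the idea only; the SITES table, the directory paths and the function pick_site are assumptions made up for this example, not the configuration of Apache or IIS.

```python
# Hypothetical mapping of host names to document folders on ONE server (assumption).
SITES = {
    "www.shop.example": "/var/www/shop",
    "blog.example": "/var/www/blog",
}


def pick_site(request_text, sites, default="/var/www/default"):
    """Choose the document folder from the Host header of a raw HTTP request."""
    for line in request_text.split("\r\n")[1:]:   # skip the request line
        if not line:                               # blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]   # drop an optional port
            return sites.get(host, default)
    return default


request = "GET /index.html HTTP/1.1\r\nHost: blog.example\r\n\r\n"
print(pick_site(request, SITES))
```

Both sites share the same IP address; only the Host header tells the server which one the visitor asked for.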


How Web server works


The steps followed when we type a URL into a web browser are:
1) If the URL contains a domain name, the browser first connects to a domain name server and retrieves the corresponding IP address for the web server.

2) The web browser connects to the web server and sends an HTTP request (via the protocol stack) for the desired web page.


How Web server works


3) The web server receives the request and checks for the desired page. If the page exists, the web server sends it. If the server cannot find the requested page, it sends an HTTP 404 error message. (404 means 'Not Found'.)


How Web server works


4) The web browser receives the page.
5) The browser then parses the page and looks for other page elements it needs to complete the web page. These usually include images, applets, etc.


How Web server works


6) For each element needed, the browser makes additional connections and HTTP requests to the server.
7) When the browser has finished loading all images, applets, etc., the page will be completely loaded in the browser window.
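
Steps 2 and 3 above can be sketched without touching a real network. In the example below the DNS lookup of step 1 is deliberately left out, the SITE dictionary stands in for the files on a server, and the function names build_get_request and respond are assumptions chosen for illustration.

```python
from urllib.parse import urlparse

# Hypothetical files held by the server: path -> page content (assumption).
SITE = {"/index.html": "<html><body>Welcome</body></html>"}


def build_get_request(url):
    """Browser side: turn a URL into (host, port, raw HTTP request text)."""
    parts = urlparse(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Connection: close",
        "",
        "",
    ]
    return parts.hostname, parts.port or 80, "\r\n".join(lines)


def respond(request_text, documents):
    """Server side: return the page, or a 404 response if it does not exist."""
    request_line = request_text.split("\r\n", 1)[0]
    _method, path, _version = request_line.split(" ")
    body = documents.get(path)
    if body is None:
        status = "404 Not Found"
        body = "<html><body>Page not found</body></html>"
    else:
        status = "200 OK"
    return (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
        f"{body}"
    )


host, port, request = build_get_request("http://www.example.com/index.html")
print(host, port)
print(respond(request, SITE).split("\r\n")[0])
```

A real browser would repeat the same request and response cycle for every image or other element the page refers to, as described in steps 5 to 7.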

Internet Information Service


Internet Information Services (IIS), formerly called Internet Information Server, is a set of Internet-based services for servers created by Microsoft for use with Microsoft Windows. As of these slides (2012), it is the world's second most popular web server in terms of overall websites, behind the industry leader, Apache HTTP Server.


Internet Information Service


The services provided currently include FTP, FTPS, SMTP, NNTP, and HTTP/HTTPS. The following table displays the versions and the respective operating systems on which each can run.

Version    Operating system
IIS 1.0    Windows NT 3.51 (available as a free add-on)
IIS 2.0    Windows NT 4.0
IIS 3.0    Windows NT 4.0 Service Pack 3
IIS 4.0    Windows NT 4.0 Option Pack
IIS 5.0    Windows 2000
IIS 5.1    Windows XP Professional
IIS 6.0    Windows Server 2003 and Windows XP Professional x64 Edition
IIS 7.0    Windows Server 2008 and Windows Vista (Home Premium, Business, Enterprise, Ultimate Editions)
IIS 7.5    Windows Server 2008 R2 and Windows 7


Internet Information Service


A company that buys IIS can create pages for web sites using Microsoft's FrontPage product (with its WYSIWYG user interface). Web developers can use Microsoft's Active Server Pages (ASP) technology, which means that applications, including ActiveX controls, can be embedded in web pages that modify the content sent back to users. Developers can also write programs that filter requests and get the correct web pages for different users by using Microsoft's Internet Server Application Program Interface (ISAPI).

Internet Information Service


Microsoft includes special capabilities for server administrators, designed to appeal to Internet service providers (ISPs). IIS includes a single window (or "console") from which all services and users can be administered. It is designed to make it easy to add components as snap-ins. The administrative windows can be customized for access by individual customers. Older versions of IIS were susceptible to worm attacks such as Code Red and Nimda.

Apache Web Server


The Apache HTTP Server, commonly referred to as Apache, is the most popular web server software. It is free software distributed by the Apache Software Foundation.


Apache Web Server


The original version of Apache was written for UNIX and was developed in 1995. It is now available for a wide variety of operating systems, including Linux, Solaris, Novell NetWare, Mac OS X, Microsoft Windows, and OS/2. The majority of web servers using Apache run a Linux operating system. Released under the Apache License, Apache is open source software.


Apache Web Server


Open source software is software whose source code is available to the general public for use and/or modification from its original design, free of cost. Core development of the Apache web server is performed by a group of about 20 volunteer programmers, called the Apache Group. However, because the source code is freely available, anyone can adapt the server for specific needs, and there is a large public library of Apache add-ons.


Apache Web Server


Apache is primarily used to serve both static content and dynamic Web pages on the World Wide Web. Many web applications are designed expecting the environment and features that Apache provides.

Apache is used for many other tasks where content needs to be made available in a secure and reliable way. One example is sharing files from a personal computer over the Internet. A user who has Apache installed on their desktop can put arbitrary files in Apache's document root which can then be shared.

ISP
Short for Internet Service Provider, it refers to a company that provides Internet services, including personal and business access to the Internet. For a monthly fee, the service provider usually provides a software package, username, password and access phone number. Equipped with a modem, you can then log on to the Internet and browse the World Wide Web and send and receive e-mail. For broadband access you typically receive the broadband modem hardware or pay a monthly fee for this equipment that is added to your ISP account billing.


How does ISP Work


Let's start with the procedure right from the local computer. Home computers connect to the ISP using telephone cables or broadband Internet connections. Large networks, like those of educational institutes, connect to the ISP using a T1 line. The way of logging in to the ISP is the same for both. To connect to the Internet, you need a modem and an ISP subscription. Let's go through the procedure step by step:
1. Log in to the ISP using the user information provided to you by your ISP. Here, you enter the username, password and telephone number of the ISP.
2. Once the ISP receives your information in its modem pool, it verifies whether you are an authentic user.
3. Once the user authentication process is done, the ISP provides you with a dynamic IP address using DHCP.


How does ISP Work


If you have bought a static IP from your ISP, then this step is not required. However, buying a static IP will cost you a lot. Now you are allowed to browse any web page through your web browser. When you type a URL in the address bar, you are actually requesting the IP address of the server machine that holds those web pages. The information is received at the modem pool. Once this information is received, the ISP connects the subscriber to the modem pool. The requested server machine is reached through an array of dedicated lines and routers. Once the ISP finds the required IP address, it transfers the requested web pages to the source IP address.

HTML
HyperText: Hypertext is text displayed on a computer or other electronic device with references (hyperlinks) to other text that the reader can immediately access, usually by a mouse click or keypress sequence.
Markup Languages: Markup languages are designed for the processing, definition and presentation of text. The language specifies code for formatting, both the layout and style, within a text file. The codes used to specify the formatting are called tags. A well-known example of a markup language in widespread use today is HyperText Markup Language (HTML), one of the document formats of the World Wide Web.
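
To make tags and hyperlinks concrete, the sketch below holds a very small HTML document in a Python string and lists the tags and links it contains. The document text, the class TagCollector and the function summarize_markup are assumptions written for this example only.

```python
from html.parser import HTMLParser

# A tiny, made-up HTML document: tags mark up the text, <a> creates a hyperlink.
DOCUMENT = """<html>
<head><title>Web Technology</title></head>
<body>
<h1>Unit I</h1>
<p>Read about <a href="http://www.example.com/dns.html">DNS</a> next.</p>
</body>
</html>"""


class TagCollector(HTMLParser):
    """Records every opening tag and the target of every hyperlink."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def summarize_markup(html):
    """Return (list of tags in order, list of hyperlink targets)."""
    collector = TagCollector()
    collector.feed(html)
    return collector.tags, collector.links


tags, links = summarize_markup(DOCUMENT)
print(tags)
print(links)
```

The tags describe structure (title, heading, paragraph), while the href attribute of the a tag is what makes the text hypertext.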

Internet Protocols
HTTP:
Hypertext Transfer Protocol (HTTP) is the set of rules for transferring files (text, graphics, images, multimedia files, etc.) on the WWW. HTTP is an application layer protocol and is used to transmit resources, not just files. A web browser is an HTTP client, sending requests to server machines. When the user types a URL in the browser, the browser builds an HTTP request and sends it to the server, and the server sends the response back to the web browser.


Methods used in HTTP


a. GET: The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI. The GET method can also be used to submit forms; the form data is URL-encoded and appended to the request URI.
b. POST: The POST method sends data to the server for processing. A POST request differs from a GET request in the following ways:
There's a block of data sent with the request, in the message body. There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
The HTTP response is normally program output, not a static file.
The most common use of POST, by far, is to submit HTML form data to CGI scripts.
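
The difference in where the form data travels can be seen by building both kinds of request as text. This is a sketch only: the path /register, the host name and the form fields are invented, and the functions build_get and build_post are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_get(path, host, form):
    """GET: the URL-encoded form data is appended to the request URI."""
    query = urlencode(form)
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(path, host, form):
    """POST: the same data travels in the message body, described by headers."""
    body = urlencode(form)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode('ascii'))}\r\n"
        "\r\n"
        f"{body}"
    )


# Hypothetical form data (assumption for illustration).
form = {"name": "Asha Rao", "city": "Delhi"}

print(build_get("/register", "www.example.com", form))
print(build_post("/register", "www.example.com", form))
```

In the GET request the data is visible in the first line, while in the POST request it follows the blank line that ends the headers.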

Methods used in HTTP

c. HEAD: A HEAD request is just like a GET request, except that it asks the server to return the response headers only, and not the actual resource (i.e. no message body). This is useful for checking characteristics of a resource without actually downloading it, thus saving bandwidth. Use HEAD when you don't actually need a file's contents.
d. PUT: The PUT method requests that the enclosed entity be stored under the supplied URL. If the URL refers to an already existing resource, the enclosed entity should be considered a modified version of the one residing on the origin server.
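
A toy request handler makes the behaviour of HEAD and PUT easier to compare with GET. Everything here is a simplified assumption: the store dictionary plays the role of the server's files, the function name handle is made up, and the choice of 201 for a newly created resource and 200 for a replaced one follows common HTTP practice, not anything stated in these slides.

```python
def handle(method, path, store, body=None):
    """Return (status code, headers, body) for a simplified HTTP request."""
    if method == "PUT":
        status = 200 if path in store else 201   # replaced vs. newly created
        store[path] = body or ""
        return status, {"Content-Length": "0"}, ""
    if path not in store:
        return 404, {"Content-Length": "0"}, ""
    content = store[path]
    headers = {"Content-Length": str(len(content.encode("utf-8")))}
    if method == "HEAD":
        return 200, headers, ""                  # same headers as GET, no body
    if method == "GET":
        return 200, headers, content
    return 405, {"Content-Length": "0"}, ""      # method not supported here


# Hypothetical stored resources (assumption for illustration).
store = {"/notes.txt": "hello"}

print(handle("GET", "/notes.txt", store))
print(handle("HEAD", "/notes.txt", store))
print(handle("PUT", "/new.txt", store, "data"))
```

HEAD reports the size of the resource through Content-Length without sending the content itself, which is what saves bandwidth.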


SMTP and MIME Protocol


SMTP / MIME: Simple Mail Transfer Protocol (SMTP) is a relatively simple, text-based protocol for sending e-mail messages between servers across the Internet. Most e-mail systems that send mail over the Internet use SMTP to send messages from one server to another; the messages can then be retrieved with an e-mail client using either POP or IMAP. In addition, SMTP is generally used to send messages from a mail client to a mail server.
Multipurpose Internet Mail Extensions (MIME) is a specification for formatting non-ASCII messages so that they can be sent over the Internet. Many e-mail clients now support MIME, which enables them to send and receive graphics, audio, and video files via the Internet mail system. In addition, MIME supports messages in character sets other than ASCII. There are many predefined MIME types, such as GIF graphics files and PostScript files.
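
How MIME lets one message carry both text and a graphics file can be sketched with Python's standard email package. The addresses, subject and file name below are made up, and the attachment is a few placeholder bytes, not a real image; the message is only built in memory and nothing is sent over SMTP.

```python
from email.message import EmailMessage


def build_message():
    """Build a MIME message with a text part and a GIF attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"        # hypothetical addresses
    msg["To"] = "receiver@example.com"
    msg["Subject"] = "Report with image"
    msg.set_content("Please find the chart attached.")

    fake_gif = b"GIF89a" + bytes(10)          # placeholder bytes, not a real image
    msg.add_attachment(
        fake_gif, maintype="image", subtype="gif", filename="chart.gif"
    )
    return msg


message = build_message()
print(message.get_content_type())
for part in message.iter_parts():
    print(part.get_content_type())
```

Each part is labelled with its own MIME type (text/plain, image/gif), so the receiving e-mail client knows how to display or open it.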


Doing Business on the Web


The Internet has made our lives easier. But despite the initial prediction that it would change the way we shop, it has not done so: consumers still go to markets and malls. The reason the Internet has failed in the area of retail shopping is that most products require face-to-face selling, which cannot be done over the Internet.
Drawbacks of doing business on the web:
1. Lack of trust: Consumers don't trust websites selling non-branded products.
2. Delivery expense: The consumer has to pay an extra delivery charge.
3. The consumer cannot see the product before purchasing it.


Making a web site plan


A web site plan includes the following points:
1. Project Scope: What will be included in the project and what will not.
2. Audience: Identification of visitors and their characteristics.
3. Competition: Who the competitors are, and what their websites do and look like.
4. Overall goal: What the business wants to achieve through the website.
5. Marketing and branding strategies: Creating strategies to promote the website.
6. Technology: Which technology should be used to create the website.

Making a web site plan


7. Budget: What is the budget for the project? What is the budget for maintenance and updates?
8. Plan the list of web pages that will appear in the website:
a. Home page
b. Products / Services
c. Contact us
d. Pricing
e. FAQ
f. About Us
g. Policy
h. Links

Forming a Project Team


A website project team may consist of the following persons:
1. Project Manager: Determines project needs, oversees the process and maintains client communication. Handles budget, updates and other issues.
2. Art Designer: Oversees the visual design process and designs the look and feel of the website.
3. Information Designer: Defines the navigational and functional perspective of the website. Determines the overall content strategy and site structure.
4. Developer: Creates the actual web pages using HTML and scripting languages.


Forming a Project Team


5. Content Manager: Gathers the information for the website from different sources.
6. Database Programmer: Maintains the website's database, which stores information about clients, purchases and sales.
7. Web Promotional Specialist: Responsible for website promotion and search engine optimization.


Setting Goals and objectives of website


Deciding the Purpose
The first important step is to determine the purpose of the website. This includes a clear understanding of why the customer wants the website. For example, the website may provide services, try to sell a product, present information, etc.
Goals
The purpose decides what the website displays or what it is meant for; the goal decides what the site wants to accomplish in the future. The goal should be realistic, specific, challenging and measurable.


Objective of the website


The following can be objectives of the website:
1. E-commerce: The website sells products directly to consumers. The consumer can see the products, features, prices and comparisons between products, and finally purchase the product.
2. Lead Generation: The website provides information about the products and educates the customer, so that the customer can contact you by phone, email, form, fax, etc. to purchase the product.
3. Content: Research shows that users look for information about products, prices and the company so that they can purchase the product.

Objective of the website


4. Self Service : Website can provide services to the consumers like purchase of products , registration of complaints, checking of status etc.


Developing the Right Business Strategy


1. Define your target audience: The design and content of a site should attract the target audience. For example, a website for teenagers should use more graphics and animation.
2. Develop appropriate content for your website: The content on the website should be interesting enough to keep the viewer on the website.
3. Align your site with online communities and website partners: The website should be advertised on online communities like Facebook, forums, etc., and links to the website should be present on similar websites.


Developing the Right Business Strategy

4. Develop a support network: Form a support team to reply to emails and maintain the website.
5. Refine your marketing and promotion materials: Promote your website through advertisements, press releases, trade shows, online discounts, etc.
6. Obtain appropriate feedback: Get feedback from customers about the website: what they want, any changes required, and how the website can be improved.


Difference between IIS and Apache


1. Apache is free, while IIS is packaged with Windows.
2. IIS only runs on Windows, while Apache can run on almost any OS, including UNIX, Apple's OS X, and most Linux distributions.
3. ASPX runs only in IIS.
4. IIS has a dedicated staff to answer most problems, while support for Apache comes from the community itself.
5. IIS is optimized for Windows because they are from the same company.
6. The Windows OS is prone to security risks.
7. Apache integrates with open-source technologies, such as Perl and Python, while IIS was specifically designed for Microsoft's ASP.


Difference between Netscape Navigator and IE


1. IE comes preinstalled with Windows, whereas Netscape does not.
2. In Netscape Navigator, you can create a single bookmark for many websites.
3. Netscape Composer has a 'Publish' button. You can use this feature to publish all the web pages you have created at once.

Exclusive Tags to IE
1. <bgsound>: inserts an audio file in an HTML page
2. <table>: properties rules and frame
3. <marquee>
4. <span>
5. <style>
6. <iframe>
7. <object>: inserts Java applets, OLE controls, and other objects into a page

Exclusive tags of Netscape Navigator


1. <img> tag attribute lowsrc=url: Provides a low-resolution source for faster image loading.
2. <multicol>: Produces a multicolumn format.
3. <noscript>: Provides alternative information to non-JavaScript-enabled browsers.
4. <spacer>: Provides whitespace objects to use in page design.
5. <textarea> wrap=style: Specifies line-wrapping options for textareas in forms.
6. <ul> type=bullet: Specifies the bullet style for unordered lists.
