SLIDE 1
Gholamhossein Tavasoli ZNU – Fall 2013
SLIDE 2 Definitions
A Web server is either:
- A computer responsible for accepting HTTP requests
from clients and serving them Web pages
- A computer program that provides the
above-mentioned functionality
Common features
- Accepting HTTP requests from the network
- Providing an HTTP response to the requester
▪ Typically consists of an HTML document
- Usually capable of logging
▪ Client requests/Server responses
SLIDE 3 Returned content
▪ Comes from an existing file
▪ Dynamically generated by some other program/script called by the Web server.
Path translation
- Translate the path component of a URL into a
local file system resource
▪ Path specified by the client is relative to the server’s root dir
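The path translation described above can be sketched in Python as follows. This is a minimal illustration, not how any particular server implements it: the function name `translate_path` and the root directory `/var/www/html` are assumptions made for the example. Segments such as ".." are collapsed before joining, so the result always stays under the root directory.

```python
import posixpath
from urllib.parse import unquote, urlsplit

def translate_path(url: str, root: str = "/var/www/html") -> str:
    """Map the path component of a URL to a file under the root directory.

    The root directory is a hypothetical value chosen for this sketch.
    """
    # Keep only the path component and decode %xx escapes.
    path = unquote(urlsplit(url).path)
    # Collapse "." and ".." segments; ".." can never climb above "/".
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return root.rstrip("/") + normalized

print(translate_path("http://www.example.com/docs/index.html"))
```

A request that tries to escape the root, such as `/a/../../etc/passwd`, is clamped to a location inside the root rather than resolved against the real file system root.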
SLIDE 4
HTTP (Hypertext Transfer Protocol) was created to define
the communication between a web server and a client
It's the network protocol used to deliver virtually
all files and other data (collectively called resources) on the World Wide Web
A browser is an HTTP client because it sends
requests to an HTTP server (Web server), which then sends responses back to the client.
The standard (and default) port for HTTP servers
to listen on is 80, though they can use any port.
SLIDE 5 Like most network protocols, HTTP uses the
client‐server model: An HTTP client opens a connection and sends a request message to an HTTP server; the server then returns a response message, usually containing the resource that was requested
[Diagram: client-server exchange]
Client:
- Open connection
- GET <file location>
- Display response
- Close connection
Server:
- OK
- Send page or error message
- OK
SLIDE 6 Format of an HTTP message:
<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents
or query data; it can be many lines long, or even
binary data>
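To make the message layout concrete, here is a small Python sketch that splits a raw message into its initial line, headers, and body. It assumes CRLF line endings and a blank line before the body; the function name `parse_http_message` and the sample message are made up for illustration.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body)."""
    # The blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body

# Hypothetical sample response used only for this sketch.
raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<p>Hello</p>\n")
initial_line, headers, body = parse_http_message(raw)
print(initial_line, headers, body)
```

The body is kept as bytes because, as noted above, it may be binary data.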
SLIDE 7
A typical initial request line:
- GET /path/to/file/index.html HTTP/1.0
Initial response line:
- HTTP/1.0 200 OK
- HTTP/1.0 404 Not Found
Status code:
- 1xx indicates an informational message only
- 2xx indicates success of some kind
- 3xx redirects the client to another URL
- 4xx indicates an error on the client's part
- 5xx indicates an error on the server's part
Common status codes:
- 200 OK
- 404 Not Found
- 301 Moved Permanently
- 302 Moved Temporarily
- 303 See Other (HTTP 1.1 only)
- 500 Internal Server Error
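The initial lines and status classes above can be illustrated with a short Python sketch. The function names and the wording of the class labels are assumptions for this example; only the line formats and the 1xx to 5xx grouping come from the slide.

```python
# Labels for the first digit of the status code (wording is illustrative).
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_request_line(line: str):
    """Split 'GET /path HTTP/1.0' into (method, path, version)."""
    method, path, version = line.split(" ")
    return method, path, version

def parse_status_line(line: str):
    """Split 'HTTP/1.0 404 Not Found' into (version, code, reason, class)."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    code = int(code)
    return version, code, reason, STATUS_CLASSES.get(code // 100, "unknown")

print(parse_request_line("GET /path/to/file/index.html HTTP/1.0"))
print(parse_status_line("HTTP/1.0 404 Not Found"))
```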
SLIDE 8 Typical request headers:
- From: email address of requester
- User-Agent: for example User-Agent: Mozilla/3.0Gold
Typical response headers:
- Server: for example Server: Apache/1.2b3‐dev
- Last-Modified: for example Last-Modified: Sun, 19 Feb 2006 23:59:59 GMT
SLIDE 9
SLIDE 10
In a response, the message body is where the requested resource is
returned to the client (its most common use), or perhaps explanatory text if there's an error.
In a request, this is where user‐entered data or
uploaded files are sent to the server.
If an HTTP message includes a body, there are usually
header lines in the message that describe the body. In particular,
The Content‐Type: header gives the MIME‐type of the
data in the body, such as text/html or image/gif.
The Content‐Length: header gives the number of
bytes in the body.
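As a rough illustration of these two headers, the Python sketch below builds a response whose Content-Type is guessed from the file name and whose Content-Length is the byte count of the body. The function name `build_response` and the fallback type `application/octet-stream` are choices made for this example.

```python
import mimetypes

def build_response(filename: str, data: bytes) -> bytes:
    """Build an HTTP/1.0 response that describes its own body."""
    # Guess the MIME type from the file extension.
    content_type, _ = mimetypes.guess_type(filename)
    if content_type is None:
        content_type = "application/octet-stream"  # assumed fallback
    head = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(data)}\r\n"  # number of bytes in the body
        "\r\n"
    )
    return head.encode("ascii") + data

print(build_response("index.html", b"<p>Hi</p>"))
```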
SLIDE 11
To retrieve the file at the URL
http://www.sourceforge.net
Request:
SLIDE 12
To retrieve the file at the URL
http://www.sourceforge.net
Response:
SLIDE 13 GET: request a resource by URL
HEAD
- Is just like a GET request, except it asks the server
to return the response headers only, and not the actual resource (i.e. no message body).
- This is useful to check characteristics of a resource
without actually downloading it, thus saving bandwidth.
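The difference between GET and HEAD can be observed with a small self-contained Python experiment. It starts a throwaway server on the local machine (the handler class `DemoHandler`, the page content, and the helper `fetch` are all invented for this sketch) and sends both methods to it using the standard `http.server` and `http.client` modules.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"  # hypothetical resource

class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)   # headers followed by the resource

    def do_HEAD(self):
        self._send_headers()     # same headers, no message body

    def log_message(self, format, *args):
        pass                     # keep the demo quiet

def fetch(method: str, port: int):
    """Send one request and return (status, Content-Length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    response = conn.getresponse()
    body = response.read()
    conn.close()
    return response.status, response.getheader("Content-Length"), body

# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
head_result = fetch("HEAD", port)
get_result = fetch("GET", port)
server.shutdown()
server.server_close()
print(head_result, get_result)
```

Both responses carry the same status and Content-Length header, but only the GET response has a body.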
SLIDE 14 POST
- A POST request is used to send data to the server to
be processed in some way, like by a PHP script.
- There's a block of data sent with the request, in the
message body. There are usually extra headers to describe this message body, like Content‐Type: and Content‐Length:.
- The request URI is not a resource to retrieve; it's
usually a program to handle the data you're sending.
- The HTTP response is normally program output, not a
static file.
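A POST request of the kind described above might be assembled and decoded as in the following Python sketch. The form fields, the path `/cgi-bin/register.php`, and both function names are hypothetical; the encoding `application/x-www-form-urlencoded` is one common choice for form data, assumed here for illustration.

```python
from urllib.parse import parse_qs, urlencode

def build_post_request(path: str, form: dict) -> bytes:
    """Build a POST request whose body carries URL-encoded form data."""
    body = urlencode(form).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body

def handle_post(raw: bytes) -> dict:
    """Server side: decode the form data from the message body."""
    _, _, body = raw.partition(b"\r\n\r\n")
    return parse_qs(body.decode("ascii"))

request = build_post_request("/cgi-bin/register.php", {"name": "Ada", "lang": "fr"})
print(request)
print(handle_post(request))
```

Note that the request URI names the program that handles the data, not a file to be returned.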
Other methods: PUT, DELETE, TRACE, CONNECT, OPTIONS
SLIDE 15
A multithreaded Web server with a front end
and processing modules.
SLIDE 16 Each processing module performs a series of
steps:
- Resolve the name of the Web page requested.
- Authenticate the client.
- Perform access control on the client.
- Perform access control on the Web page.
- Check the cache.
- Fetch the requested page from disk.
- Determine the MIME type to include in the response.
- Take care of miscellaneous odds and ends.
- Return the reply to the client.
- Make an entry in the server log.
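The front end and processing modules can be sketched in Python with a thread pool. This is a heavily simplified model under stated assumptions: the "disk", cache, access-control list, and log are in-memory stand-ins invented for the example, authentication is omitted, and access control is reduced to a blocklist check.

```python
import mimetypes
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real server state.
DISK = {"/index.html": b"<p>home</p>"}   # plays the role of the file system
CACHE = {}                                # in-memory page cache
BLOCKED_CLIENTS = {"10.0.0.66"}           # toy access-control list
LOG = []                                  # plays the role of the server log

def process(path, client):
    """One processing module: run the per-request steps in order."""
    name = "/index.html" if path == "/" else path        # resolve the name
    if client in BLOCKED_CLIENTS:                         # access control
        status, mime, data = 403, "text/plain", b"Forbidden"
    else:
        data = CACHE.get(name)                            # check the cache
        if data is None and name in DISK:
            data = CACHE[name] = DISK[name]               # fetch from disk
        if data is None:
            status, mime, data = 404, "text/plain", b"Not Found"
        else:
            status = 200                                  # determine MIME type
            mime = mimetypes.guess_type(name)[0] or "application/octet-stream"
    LOG.append((client, path, status))                    # server log entry
    return status, mime, data                             # reply to the client

def front_end(requests, workers=4):
    """Front end: hand each (path, client) request to a worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda request: process(*request), requests))

replies = front_end([("/", "10.0.0.1"),
                     ("/missing.html", "10.0.0.2"),
                     ("/", "10.0.0.66")])
print(replies)
```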
SLIDE 17 Berners‐Lee wrote two programs
- A browser called WorldWideWeb
- The world’s first Web server, which ran on
NeXTSTEP
▪ The machine is on exhibition at CERN’s public museum
SLIDE 18
Apache HTTP Server, Apache Software Foundation
Internet Information Services (IIS), Microsoft
NGINX, Nginx, Inc.
LiteSpeed, LiteSpeed Technologies, Inc.
Lighttpd
SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23
SLIDE 24
SLIDE 25
SLIDE 26
Apache is used by 61.9% of all the websites
whose web server we know.
SLIDE 27
Microsoft‐IIS is used by 16.5% of all the
websites whose web server we know.
SLIDE 28
Nginx is used by 16.1% of all the websites
whose web server we know.
SLIDE 29
LiteSpeed is used by 1.9% of all the websites
whose web server we know.
SLIDE 30
Apache 1
SLIDE 31
Apache 2
SLIDE 32
NGINX
SLIDE 33 Caching
Content negotiation
- A resource may be available in several different
representations.
- For example, it might be available in different languages
or different media types, or a combination.
- One way of selecting the most appropriate choice is to
give the user an index page, and let them select.
- However, it is often possible for the server to choose
automatically with the help of request headers:
▪ Accept-Language: fr; q=1.0, en; q=0.5
▪ Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
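A server-side choice based on the q values could look like the Python sketch below. It is a simplified illustration only: it handles exact matches, ignores wildcards such as `text/*`, and the function names and the list of available languages are assumptions for this example.

```python
def parse_accept(header: str):
    """Parse an Accept-style header into (value, q) pairs, best first."""
    choices = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip() == "q":
                q = float(value)
        choices.append((parts[0], q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)

def choose_language(header: str, available: list, default: str) -> str:
    """Pick the client's most preferred language that the server has."""
    for language, q in parse_accept(header):
        if q > 0 and language in available:
            return language
    return default

print(parse_accept("fr; q=1.0, en; q=0.5"))
print(choose_language("fr; q=1.0, en; q=0.5", ["en", "de"], "de"))
```

With only English and German available, the client's second choice (English) is served instead of falling back to the default.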
SLIDE 34 Log files
- In order to effectively manage a web server, it is
necessary to get feedback about the activity and performance of the server as well as any problems that may be occurring
SLIDE 35 Error log:
▪ [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
Access log:
▪ 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
▪ 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
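Access log entries like the ones above are regular enough to be parsed mechanically. The Python sketch below extracts the fields of the first entry with a regular expression; the pattern, field names, and function name are written for this example and assume plain ASCII hyphens in the log line.

```python
import re

# Fields: host, identity, user, [time], "request", status, size.
ACCESS_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_access_line(line: str) -> dict:
    """Return the fields of one access log line as a dictionary."""
    match = ACCESS_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized access log line")
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_access_line(line)
print(entry)
```

Because the pattern is matched from the start of the line only, the longer entry with the referrer and user agent appended yields the same fields.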
SLIDE 36 Mapping URLs to file system locations:
- DocumentRoot
- Alias directive:
▪ Alias /docs /var/web
▪ The URL http://www.example.com/docs/dir/file.html will be served from /var/web/dir/file.html.
▪ ScriptAliasMatch ^/~([a-zA-Z0-9]+)/cgi-bin/(.+) /home/$1/cgi-bin/$2
▪ Will map a request to http://example.com/~user/cgi-bin/script.cgi to the path /home/user/cgi-bin/script.cgi and will treat the resulting file as a CGI script
▪ http://www.example.com/~user/file.html
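The effect of these mapping rules can be imitated in a few lines of Python. This is only an approximation of the behaviour described above, not Apache's actual algorithm: the function name `map_url_path` and the document root `/var/www/html` are assumptions, and the Alias and ScriptAliasMatch values are copied from the slide.

```python
import re

DOCUMENT_ROOT = "/var/www/html"        # assumed value for this sketch
ALIASES = {"/docs": "/var/web"}        # Alias /docs /var/web
SCRIPT_ALIAS_MATCH = (                 # ScriptAliasMatch pattern and target
    re.compile(r"^/~([a-zA-Z0-9]+)/cgi-bin/(.+)"),
    r"/home/\1/cgi-bin/\2",
)

def map_url_path(path: str):
    """Return (file_system_path, is_cgi_script) for a URL path."""
    pattern, template = SCRIPT_ALIAS_MATCH
    match = pattern.match(path)
    if match:
        # Substitute the captured groups into the target path.
        return match.expand(template), True
    for prefix, target in ALIASES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return target + path[len(prefix):], False
    # No rule applies: fall back to the document root.
    return DOCUMENT_ROOT + path, False

print(map_url_path("/docs/dir/file.html"))
print(map_url_path("/~user/cgi-bin/script.cgi"))
```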
SLIDE 37 Mapping URLs to file system locations:
▪ Redirect permanent /foo/ http://www.example.com/bar/
▪ Apache also allows you to bring remote documents into the URL space of the local server.
▪ This technique is called reverse proxying because the web server acts like a proxy server by fetching the documents from a remote server and returning them to the client.
▪ ProxyPass /foo/ http://internal.example.com/bar/
- mod_speling: tries to correct misspelled URLs that would otherwise cause file-not-found errors
SLIDE 38 Access control to filesystem
Virtual hosting
- The term Virtual Host refers to the practice of
running more than one web site (such as www.company1.com and www.company2.com) on a single machine.
SLIDE 39 Virtual hosts can be "IP‐based", meaning that
you have a different IP address for every web site
- The server must have a different IP address for
each IP‐based virtual host.
- Most commonly, this is used to serve different
websites on different ports or interfaces.
- This can be achieved by the machine having
several physical network connections, or by use of virtual interfaces which are supported by most modern operating systems
SLIDE 41 Virtual hosts can also be "name-based", meaning that you have
multiple names running on each IP address. The fact that they are running on the same physical server is not apparent to the end user.
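Name-based virtual hosting relies on the host name the client sends with its request. The Python sketch below shows the idea: one server picks a document root from the Host request header. The directory names, the default root, and the function name are hypothetical; only the two site names come from the slides.

```python
# Hypothetical mapping from site name to document root.
VIRTUAL_HOSTS = {
    "www.company1.com": "/var/www/company1",
    "www.company2.com": "/var/www/company2",
}
DEFAULT_ROOT = "/var/www/default"  # used when no site name matches

def document_root_for(headers: dict) -> str:
    """Choose the document root from the Host request header."""
    host = headers.get("Host", "")
    # Drop an optional ":port" suffix and ignore letter case.
    hostname = host.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_ROOT)

print(document_root_for({"Host": "www.company1.com"}))
print(document_root_for({"Host": "WWW.Company2.com:8080"}))
```

Both sites share one IP address; only the Host header differs between their requests.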