Computer Networks (Part 3)

Author: Ethan

What Does the Web Consist Of?

  • Who uses it?
  • Who provides the content?
  • How do they communicate?
  • How do we find the content?
  • How is the content organized?
  • How is it displayed?

Web Components

  • Infrastructure: Clients, Servers (DNS, CDN, Datacenters)

  • Content:

    • URL: naming content
    • HTML: formatting content
  • Protocol for exchanging information: HTTP


Why Is There Nothing About the Network?

  • Clients, servers, and routers operate at multiple layers:

    • Transport
    • Network
    • Datalink
    • Physical
  • Web protocols (e.g., HTTP) exist at the Application layer, abstracting away lower layers.


What We Want

Example:

  • URL: http://123.xyz
  • Request: HTTP Request → GET /index.html
  • Response: HTTP Response with content

URL: Uniform Resource Locator

Format:

protocol://host-name[:port]/directory-path/resource
  • Protocol: http, ftp, https, smtp, rtsp, etc.
  • Host-name: DNS name or IP address
  • Port: defaults to protocol’s standard port (HTTP: 80, HTTPS: 443)
  • Directory path: hierarchical, often reflecting the server's file system
  • Resource: identifies the desired resource

Examples:

  • File system: https://github.com/eecs489staff/slides/blob/main/04-HTTPandWeb.pptx
  • Program execution: https://www.google.com/search?q=eecs489
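
The URL format above can be taken apart with Python's standard `urllib.parse` module. This is a minimal sketch: the helper name `describe_url` and the `DEFAULT_PORTS` table are illustrative assumptions, not part of the original notes.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports for the two schemes named in the notes (illustrative table).
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url):
    """Split a URL into the parts listed in the format above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # An explicit port wins; otherwise fall back to the scheme default.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        # The part after "?" carries the arguments for program execution.
        "query": parse_qs(parts.query),
    }

if __name__ == "__main__":
    print(describe_url("https://www.google.com/search?q=eecs489"))
```

For the search example, the path is `/search` and the query string decodes to `{"q": ["eecs489"]}`; the port falls back to 443 because none is given.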

Hypertext Transfer Protocol (HTTP)

  • Client-server architecture

    • Server is “always on” and “well-known”
    • Clients initiate contact
  • Request/reply protocol (synchronous)

  • Runs over TCP, port 80 by default

  • Stateless

  • ASCII format (before HTTP/2)


Example: HTML

<!DOCTYPE html>
<html>
<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>
</html>

Client-Server Interaction – HTTP/0.9

  1. Establish TCP connection
  2. Client request
  3. Server response
  4. Close connection

Result: return HTML file.
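
The four steps above can be sketched with raw sockets on the loopback interface. This is a toy illustration under stated assumptions: `serve_once`, `fetch`, `demo`, and the `PAGE` bytes are hypothetical names, and the server handles exactly one request. As in HTTP/0.9, the request is a single line and the response has no status line or headers.

```python
import socket
import threading

PAGE = b"<html><body><h1>My First Heading</h1></body></html>"

def serve_once(listener):
    """Toy HTTP/0.9-style server: one request, one response, then close."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)          # step 2: the client request arrives
        if request.startswith(b"GET "):
            conn.sendall(PAGE)             # step 3: response is the bare document
    # Leaving the with-block closes the connection (step 4).

def fetch(host, port, path):
    """Toy client that follows the four steps in order."""
    with socket.create_connection((host, port), timeout=5) as sock:  # step 1
        sock.sendall(f"GET {path}\r\n".encode("ascii"))              # step 2
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:                   # server closed: end of the response
                break
            chunks.append(data)
    return b"".join(chunks)

def demo():
    """Run server and client in one process and return the fetched bytes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    body = fetch("127.0.0.1", port, "/index.html")
    server.join()
    listener.close()
    return body
```

The client learns that the response is complete only because the server closes the connection, which is why each object needs its own connection in this version of the protocol.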


Performance Goals

  • User: fast downloads, high availability
  • Content provider: happy users, cost-effective infrastructure
  • Network: avoid overload

Solutions

  • Improve networking protocols (HTTP, TCP, etc.)
  • Caching and replication
  • Exploit economies of scale (e.g., webhosting, CDNs, datacenters)

HTTP Performance

  • Most web pages = multiple objects (HTML + images, CSS, JS, etc.)
  • Naive retrieval: one item at a time
  • New TCP connection for each small object → inefficient

Object Request Response Time

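  • RTT (Round-Trip Time): time for a small packet to travel from client to server and back
  • Response time = 2 RTT + transmission time (one RTT to set up the TCP connection, one RTT for the request and the start of the response)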

Non-Persistent Connections

  • Default in HTTP/1.0
  • Each object requires 2 RTT + Δ, where Δ is the object's transmission time
  • Very inefficient
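
As a rough worked example of the formula, the sketch below computes the per-object response time and the total for a page fetched over non-persistent connections. The function names and the sample numbers (100 ms RTT, 100,000-bit objects, 1 Mbit/s link) are assumptions chosen only for illustration.

```python
def object_response_time(rtt, size_bits, bandwidth_bps):
    """2 RTT (TCP setup, then request/response) plus transmission time."""
    return 2 * rtt + size_bits / bandwidth_bps

def non_persistent_page_time(n_objects, rtt, size_bits, bandwidth_bps):
    """Fetch n objects one after another, each over a fresh TCP connection."""
    return n_objects * object_response_time(rtt, size_bits, bandwidth_bps)

if __name__ == "__main__":
    # Hypothetical numbers: RTT 0.1 s, 100,000-bit objects, 1 Mbit/s link.
    print(object_response_time(0.1, 100_000, 1_000_000))
    print(non_persistent_page_time(10, 0.1, 100_000, 1_000_000))
```

With these numbers each object costs about 0.3 s, of which 0.2 s is pure round-trip delay, so a 10-object page takes about 3 s.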

The Web: History

  • HTTP/1.1 (1997)

    • Persistent connections (multiple requests/responses per connection)
    • Performance & security improvements
  • HTTP/2 (2015)

    • Multiplexing (multiple requests/responses in flight concurrently on one connection)
    • Binary protocol
    • Server push
  • HTTP/3 (2022)

    • Built on QUIC over UDP
    • Avoids TCP head-of-line blocking: a lost packet delays only the stream it belongs to

Techniques for Improving Performance

Concurrent Requests

  • Multiple parallel connections
  • Client & provider benefit
  • Network load increases

Persistent Connections

  • Maintain TCP connection across multiple requests
  • Avoid setup/teardown overhead
  • Default in HTTP/1.1

Pipelined Requests & Responses

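  • Multiple requests sent back-to-back without waiting for each response
  • Responses return in FIFO order → a slow response blocks those behind it (head-of-line blocking)
  • Fixing this requires priorities and preemption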
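
Head-of-line blocking under FIFO responses can be seen in a tiny simulation. This is a sketch with made-up service times, and `fifo_completion_times` is a hypothetical helper: because responses leave in request order, each one finishes only after every response ahead of it.

```python
from itertools import accumulate

def fifo_completion_times(service_times):
    """Completion time of each response when they are sent strictly in order."""
    return list(accumulate(service_times))

if __name__ == "__main__":
    # One slow object (10 time units) requested before two fast ones.
    print(fifo_completion_times([10, 1, 1]))   # [10, 11, 12]
    # Same objects, slow one last: the fast ones are no longer held up.
    print(fifo_completion_times([1, 1, 10]))   # [1, 2, 12]
```

The total time is the same in both orders; what changes is how long the small objects wait, which is the cost that priorities and preemption aim to remove.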

Scorecard

  • n small objects:

    • One-at-a-time: ~2n RTT
    • Concurrent (m): ~2⌈n/m⌉ RTT
    • Persistent: ~(n+1) RTT
    • Pipelined: ~2 RTT
  • n large objects (size F):

    • Dominated by throughput B_C
    • One-at-a-time: nF/B_C
    • Concurrent (m): nF/(m·B_C), if m·B_C ≤ B_L
    • Pipelined/persistent: nF/B_C
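
The small-object rows of the scorecard can be written as plain functions to compare the strategies for concrete values. This is a sketch of the approximations listed above, not a measurement; the function names are illustrative.

```python
import math

def rtts_one_at_a_time(n):
    """Each object pays TCP setup plus request/response: 2 RTT apiece."""
    return 2 * n

def rtts_concurrent(n, m):
    """m parallel connections fetch the objects in ceil(n/m) rounds of 2 RTT."""
    return 2 * math.ceil(n / m)

def rtts_persistent(n):
    """One RTT for TCP setup, then one RTT per request/response."""
    return n + 1

def rtts_pipelined(n):
    """One RTT for setup, one for all requests sent together (n is unused)."""
    return 2

if __name__ == "__main__":
    n, m = 10, 5
    print(rtts_one_at_a_time(n), rtts_concurrent(n, m),
          rtts_persistent(n), rtts_pipelined(n))   # 20 4 11 2
```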

Caching

Why?

  • Exploits locality of reference
  • Highly effective, though limited by requests for content that is unique or requested only once

How?

  • If-Modified-Since request header (server replies 304 Not Modified if the cached copy is still current)

  • Response headers:

    • Expires
    • Cache-Control: no-cache
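
How these headers drive caching decisions can be sketched in a few lines. This is a simplified model under stated assumptions: `conditional_get` and `is_fresh` are hypothetical helpers, headers are plain dicts, and real caches follow many more rules than shown here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(request_headers, last_modified):
    """Server side: return (status, headers), honoring If-Modified-Since."""
    response_headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True)
    }
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        cached_at = parsedate_to_datetime(since)
        # HTTP dates have one-second resolution, so drop microseconds first.
        if last_modified.replace(microsecond=0) <= cached_at:
            return 304, response_headers   # Not Modified: no body is resent
    return 200, response_headers           # full response with the object

def is_fresh(response_headers, now):
    """Cache side: may the stored copy be reused without asking the server?"""
    if "no-cache" in response_headers.get("Cache-Control", ""):
        return False                       # must revalidate before every reuse
    expires = response_headers.get("Expires")
    return expires is not None and now < parsedate_to_datetime(expires)

if __name__ == "__main__":
    modified = datetime(2006, 6, 22, 10, 0, 0, tzinfo=timezone.utc)
    status, headers = conditional_get({}, modified)
    print(status, headers)
    print(conditional_get({"If-Modified-Since": headers["Last-Modified"]}, modified)[0])
```

A cache first checks freshness locally; only when the copy is stale does it send the conditional request, and a 304 reply saves the transmission time of the object itself.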

Where?

  • Client/browser
  • Forward proxies (near clients)
  • Reverse proxies (near servers)
  • CDNs

HTTP Methods (HTTP/1.1)

  • GET: retrieve a resource; HEAD: same as GET, but the response carries headers only
  • POST: send info (e.g., forms)
  • PUT: upload file
  • DELETE: delete file

Client-to-Server Communication

HTTP Request Message

  • Request line: method, resource, version
  • Headers: metadata (e.g., Host, User-Agent)
  • Body: optional (e.g., form data)

Example:

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
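
The request layout (request line, headers, blank line, optional body) can be made concrete with a small builder and parser. This is a minimal sketch assuming well-formed ASCII input; `build_request` and `parse_request` are illustrative names, not a library API.

```python
def build_request(method, path, headers, body=b""):
    """Serialize a request: request line, header lines, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"   # the blank line ends the headers
    return head.encode("ascii") + body

def parse_request(raw):
    """Split raw request bytes into (method, path, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body

if __name__ == "__main__":
    raw = build_request("GET", "/somedir/page.html", {
        "Host": "www.someschool.edu",
        "Connection": "close",
    })
    print(raw.decode("ascii"))
    print(parse_request(raw))
```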

Server-to-Client Communication

HTTP Response Message

  • Status line: version, status code, phrase
  • Headers: metadata (e.g., Content-Type, Date)
  • Body: data

Example:

HTTP/1.1 200 OK
Connection: close
Date: Fri, 06 Jan 2017 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Thu, 22 Jun 2006 ...
Content-Length: 6821
Content-Type: text/html

<data>
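
On the receiving side, a client reads the status line, then the headers, and uses Content-Length to know where the body ends. The parser below is a sketch that assumes the whole response is already in memory and well formed; `parse_response` and the sample bytes are hypothetical.

```python
def parse_response(raw):
    """Split raw response bytes into (version, code, phrase, headers, body)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    # The phrase may contain spaces ("Not Found"), so split at most twice.
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")   # values may contain ":" too
        headers[name.strip().lower()] = value.strip()
    # Content-Length says how many of the following bytes belong to this body.
    length = int(headers.get("content-length", len(rest)))
    return version, int(code), phrase, headers, rest[:length]

if __name__ == "__main__":
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Length: 5\r\n"
              b"Content-Type: text/html\r\n"
              b"\r\n"
              b"hello")
    print(parse_response(sample))
```

Knowing the body length in advance is what lets a persistent connection carry several responses back to back, instead of relying on the connection closing to mark the end.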

HTTP Is Stateless

  • Each request-response independent
  • Advantages: scalability, easy failure handling, high request rate
  • Disadvantages: some applications need state (e.g., shopping cart)

State in Stateless Protocols: Cookies

  • Client stores small state for server
  • Sent with future requests
  • Can provide authentication

Example:

Set-Cookie: XYZ
Cookie: XYZ
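
In practice the cookie value is a name=value pair. The sketch below uses Python's standard `http.cookies` module to produce the value a server would put in Set-Cookie and to read back what the client later sends in Cookie; the cookie names `session` and `theme` and both helper functions are made up for illustration.

```python
from http.cookies import SimpleCookie

def make_set_cookie(name, value):
    """Server side: value for a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie[name].OutputString()

def read_cookie_header(header_value):
    """Server side, later request: decode the Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}

if __name__ == "__main__":
    print("Set-Cookie:", make_set_cookie("session", "XYZ"))
    print(read_cookie_header("session=XYZ; theme=dark"))
```

The server stays stateless at the protocol level: it recognizes a returning client only because the client echoes the stored value on each request.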

Beyond Cookies

  • Marketing and tracking concerns
  • Example: FLoC (Federated Learning of Cohorts), trialed in Google Chrome and since dropped in favor of the Topics API

Summary

  • Persistence: reuse TCP connections
  • Pipelining: batch requests, ordered responses
  • Concurrent requests: multiple TCP connections
  • Multiplexing: many streams, fully interleaved
  • HTTP/1.1: text-based → followed by HTTP/2 (binary) → HTTP/3 (QUIC over UDP)
  • Performance improvements: pipelining, batching, caching, CDNs, datacenters