wiki:soc/2007/cgi

Version 12 (modified by Darren Garvey, 15 years ago)

About

Below is a partial list of things this project aims to provide:

  • Implement the Controller part of the model-view-controller idiom for any CGI-compatible protocol.
  • Simple access to standard CGI environment variables and input data.
  • Clean access to request meta-data (i.e. 'environment vars') and input data when using alternative protocols, such as FastCGI.
  • Asynchronous read/write support.
  • A clean way to write to clients without knowledge of the underlying protocol in use.
  • Minimal process initialisation time: clients will usually be handled by multiple process images, so a newly started process must not incur a noticeable load time before handling its first request.
  • Basic session support: by default Boost.Interprocess will be used to store session data, although the SessionAdapter concept should allow other storage methods to be added later (such as files, databases or in-process memory).
  • Internationalisation support should be considered, although I'm not yet sure to what extent this can be tackled.
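
The SessionAdapter idea mentioned in the list above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names basic_session and in_process_adapter, and the load/save interface, are hypothetical and not part of the proposed library. It only shows how storage could be swapped out behind a template parameter, using the in-process memory option as the simplest case.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> session_values;

// Hypothetical adapter: keeps all session data in this process's memory.
// A file, database or shared-memory adapter would expose the same two calls.
class in_process_adapter {
public:
    void save(const std::string& session_id, const session_values& data) {
        store_[session_id] = data;
    }
    // Returns the stored values, or an empty map for an unknown session id.
    session_values load(const std::string& session_id) const {
        std::map<std::string, session_values>::const_iterator it = store_.find(session_id);
        return it == store_.end() ? session_values() : it->second;
    }
private:
    std::map<std::string, session_values> store_;
};

// Hypothetical session type: knows nothing about storage beyond load/save.
template <typename Adapter>
class basic_session {
public:
    basic_session(Adapter& adapter, std::string id)
        : adapter_(adapter), id_(std::move(id)), data_(adapter_.load(id_)) {
        assert(!id_.empty());  // a session needs an identifier
    }
    std::string& operator[](const std::string& key) { return data_[key]; }
    // Writes the current values back through the adapter.
    void commit() { adapter_.save(id_, data_); }
private:
    Adapter& adapter_;
    std::string id_;
    session_values data_;
};
```

A session created later with the same id and the same adapter sees whatever was committed earlier; changes that were never committed are lost, which is a deliberate simplification here.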

Usage

First, a standard CGI example:

int main()
{
  cgi::request req; // set up the request
  cgi::reply rep;   // see Design Notes for more about this

  rep << "Hello, " << req.param<cgi::GET>("user_name") << "!";
  rep.send(req);
  
  return 0;
}

Using an alternative protocol (FastCGI in this case) changes the above as follows:

int sub_main(cgi::request req)
{
  cgi::reply rep; // see Design Notes for more about this

  rep << "Hello, " << req.param<cgi::POST>("user_name") << "!";
  rep.send(req);

  return 0;
}

int main()
{
  cgi::fcgi_service service(&sub_main);
  service.run();
  
  return 0;
}
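
To make the control flow of the FastCGI example more concrete, here is a minimal, self-contained sketch of a service that owns a queue of requests and calls a user-supplied handler for each one. It is not the proposed fcgi_service: toy_service, its enqueue() call and the two-field request struct are assumptions made for illustration, and the handler takes the request by reference so that its output can be inspected afterwards.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request: one input value plus the text the handler produced.
struct request {
    std::string user_name;
    std::string output;
};

// Hypothetical service: the user hands over a handler, the service drives it.
class toy_service {
public:
    typedef int (*handler_type)(request&);

    explicit toy_service(handler_type handler) : handler_(handler) {
        assert(handler_ != nullptr);  // a service without a handler is useless
    }

    // Stands in for a request arriving from the web server.
    void enqueue(request r) { pending_.push_back(std::move(r)); }

    // Handles every queued request in arrival order and returns them.
    std::vector<request> run() {
        std::vector<request> done;
        while (!pending_.empty()) {
            request r = std::move(pending_.front());
            pending_.pop_front();
            handler_(r);
            done.push_back(std::move(r));
        }
        return done;
    }

private:
    handler_type handler_;
    std::deque<request> pending_;
};

// Plays the role of sub_main in the example above.
int sub_main(request& req) {
    req.output = "Hello, " + req.user_name + "!";
    return 0;
}
```

The point of the shape is that user code never touches the queue or the protocol; it only supplies sub_main.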

Design Notes

Past discussion starts in these threads:
http://lists.boost.org/Archives/boost/2007/04/120191.php
http://lists.boost.org/Archives/boost/2007/04/119565.php

See Concepts for more.

Separation of cgi::request and cgi::reply:

This separation is a recent change. The main reason is that equivalent meta-data exists for both the request and the reply (i.e. same identifier, different value). Getters and setters on a single object are one option, but in a large program you may set a response header and then need to check it later. If everything were done through the request object, there would be no clean way to do so.

Having two objects has other advantages:

  • Code is clearer, without being too verbose
  • Response caching is easier to implement; code can just cache a cgi::reply since it holds no data relevant to the specific request (note: response caching isn't really part of this project, although cgi::session will probably provide basic facilities)
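
The two points above can be illustrated with a small stand-alone sketch. The request and reply types below are hypothetical stand-ins, not the library's classes, and set_header()/header() are assumed names. The sketch shows the same identifier ("Content-type") holding different values on each object, and one reply being sent unchanged to two requests, which is what makes caching straightforward.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a request: its own meta-data plus a record of
// what would have been sent back to the client.
struct request {
    std::map<std::string, std::string> meta;
    std::string sent;
};

// Hypothetical stand-in for a reply: headers and body, no link to a request.
class reply {
public:
    void set_header(const std::string& name, const std::string& value) {
        assert(!name.empty());
        headers_[name] = value;
    }
    // Returns the reply's own value for a header, or "" if it was never set.
    std::string header(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = headers_.find(name);
        return it == headers_.end() ? std::string() : it->second;
    }
    template <typename T>
    reply& operator<<(const T& value) {
        body_ << value;
        return *this;
    }
    // Only at this point does the reply learn which request it answers.
    void send(request& req) const {
        std::string out;
        for (std::map<std::string, std::string>::const_iterator it = headers_.begin();
             it != headers_.end(); ++it) {
            out += it->first + ": " + it->second + "\r\n";
        }
        out += "\r\n" + body_.str();
        req.sent = out;
    }
private:
    std::map<std::string, std::string> headers_;
    std::ostringstream body_;
};
```

Because send() is const and takes the request as an argument, a built reply can be kept around and reused for any number of requests.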

Having the CommonGatewayService control threading

See Dispatching.

Main Classes

cgi::basic_request<>

This holds the data corresponding to the request. It will be specific to a Protocol type and will be aware of how to receive, send and parse data for that Protocol. There will be typedefs for typical usage.

cgi::request

By default, this provides a general (as opposed to generic) access point to any type of request. If constructed with a service object, it takes a request from the service's queue (or blocks until one is available). Default construction initialises a standard CGI environment.

This generality is achieved by binding to the protocol at runtime, in a similar way to boost::any, although compile-time binding can be forced using a choice of macros which turn cgi::request into a typedef for a particular cgi::basic_request<>.
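
A minimal sketch of that boost::any-style approach follows, assuming two hypothetical protocol-specific request types with a common protocol() member. The wrapper name any_request and the macro SKETCH_STATIC_CGI are invented for illustration; the real macros and class layout are not specified on this page.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical protocol-specific requests, standing in for
// cgi::basic_request<Protocol> instantiations.
struct cgi_request  { std::string protocol() const { return "cgi"; } };
struct fcgi_request { std::string protocol() const { return "fcgi"; } };

// Type-erasing wrapper: holds any request type behind a virtual interface,
// much as boost::any holds any value.
class any_request {
    struct holder_base {
        virtual ~holder_base() {}
        virtual std::string protocol() const = 0;
    };
    template <typename Request>
    struct holder : holder_base {
        explicit holder(Request r) : held(std::move(r)) {}
        std::string protocol() const override { return held.protocol(); }
        Request held;
    };
    std::unique_ptr<holder_base> impl_;

public:
    template <typename Request>
    explicit any_request(Request r) : impl_(new holder<Request>(std::move(r))) {}

    // Forwards to whichever concrete request is held.
    std::string protocol() const {
        assert(impl_ && "any_request used after being moved from");
        return impl_->protocol();
    }
};

// Hypothetical macro switch: defining SKETCH_STATIC_CGI removes the virtual
// call by making 'request' a plain alias for one concrete type.
#ifdef SKETCH_STATIC_CGI
typedef cgi_request request;
#else
typedef any_request request;
#endif
```

With the default alias, code written against request works with either protocol at the cost of one virtual call per operation; with the macro defined, the same code compiles against cgi_request directly.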

cgi::response

This simply holds the headers and content of the response (the usage examples above call it cgi::reply) and provides various ways to write to it. Until it is sent to the user, it is unaware of what it is a response to. This helps keep both library and user code clean and explicit without being overly verbose, and aids significantly with response caching (something this library won't address for now).

cgi::session

This will provide simple session data caching.

cgi::basic_service<>

This is the main class in the library. There should be specializations for each Protocol and the underlying structure should be generic enough to allow for any type of CGI-like protocol to be 'serviced', without sacrificing efficiency, clarity of code or any of the aims listed under About.

Important Internal Classes

cgi::basic_gateway<>

The gateway is the abstraction of the interface with the server. This can vary from just an abstraction of std::cin/cout to a fully multiplexed set of connections, which can themselves be of more than one type.

cgi::basic_request_acceptor<>

Accepts a new, possibly unloaded request. Before the request is used, cgi::basic_request<>::load() or cgi::basic_request<>::async_load() must be called.
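
The accept-then-load split can be sketched as below. This is only an illustration under stated assumptions: the request and request_acceptor classes, the push() call that simulates an incoming connection, and the choice to throw std::logic_error on early access are all hypothetical. The parser handles plain name=value pairs separated by '&' and does no URL decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical request: accepted with raw data only, parsed on load().
class request {
public:
    explicit request(std::string raw_query)
        : raw_(std::move(raw_query)), loaded_(false) {}

    bool loaded() const { return loaded_; }

    // Parses "a=1&b=2" into name/value pairs; safe to call more than once.
    void load() {
        if (loaded_) return;
        std::size_t pos = 0;
        while (pos < raw_.size()) {
            std::size_t amp = raw_.find('&', pos);
            if (amp == std::string::npos) amp = raw_.size();
            const std::string pair = raw_.substr(pos, amp - pos);
            const std::size_t eq = pair.find('=');
            if (eq != std::string::npos)
                params_[pair.substr(0, eq)] = pair.substr(eq + 1);
            else if (!pair.empty())
                params_[pair] = "";
            pos = amp + 1;
        }
        loaded_ = true;
    }

    // Returns the value for name, or "" if absent; refuses to run before load().
    std::string param(const std::string& name) const {
        if (!loaded_) throw std::logic_error("request used before load()");
        std::map<std::string, std::string>::const_iterator it = params_.find(name);
        return it == params_.end() ? std::string() : it->second;
    }

private:
    std::string raw_;
    bool loaded_;
    std::map<std::string, std::string> params_;
};

// Hypothetical acceptor: hands out requests without parsing them.
class request_acceptor {
public:
    void push(std::string raw) { pending_.push_back(std::move(raw)); }

    request accept() {
        assert(!pending_.empty());  // a real acceptor would block instead
        request r(std::move(pending_.front()));
        pending_.pop_front();
        return r;
    }

private:
    std::deque<std::string> pending_;
};
```

Deferring the parse means a request that is rejected or queued for a long time costs nothing beyond holding its raw bytes.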

Random Notes

  • The active requests queue should hold boost::weak_ptr<basic_request<> >s. That means that a request will be properly destroyed unless it's on the pending requests queue or actually being handled by user code. Note: to keep a request alive during asynchronous operations, the Handler the user passes to the async call should be a function object that holds a boost::shared_ptr<request_base>.
  • The standard CGI sub-library should be header-only. Ideally the other parts (e.g. fcgi_service) will be either header-only or compilable, with compilable as the default: a multi-process FastCGI server pool is the most common use, so using a shared '[MVC-]controller' library is likely to be quite effective.
  • Is there a need for a boost::lexical_cast<> wrapper? Something like cgi::param_as<char,cgi::GET>("blah") or cgi::get_as<int,cgi::POST>("user_id"). Consider:
    void a()
    {
      int id = 4096;
      cgi::request req;
      // first test spells the conversion out, second uses the proposed wrapper
      if( boost::lexical_cast<int>(req.param<cgi::POST>("user_id")) > (id / 4) &&
          req.param_as<int,cgi::POST>("user_id") != id )
      {
        // ...
      }
    }
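
As a rough illustration of what such a wrapper could do, the sketch below implements param_as<> on a hypothetical request class. It uses std::istringstream rather than boost::lexical_cast so that it compiles with the standard library alone; the set() member, the method enum and the choice of std::invalid_argument for failed conversions are assumptions, not part of the proposal.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

enum method { GET, POST };

// Hypothetical request holding GET and POST variables as strings.
class request {
public:
    // Test helper standing in for real input parsing.
    void set(method m, const std::string& name, const std::string& value) {
        assert(!name.empty());
        (m == GET ? get_ : post_)[name] = value;
    }

    // Raw string access; returns "" for a missing variable.
    template <method M>
    std::string param(const std::string& name) const {
        const std::map<std::string, std::string>& vars = (M == GET) ? get_ : post_;
        std::map<std::string, std::string>::const_iterator it = vars.find(name);
        return it == vars.end() ? std::string() : it->second;
    }

    // Converts the variable to T, throwing if the text is missing, malformed
    // or followed by trailing non-whitespace characters.
    template <typename T, method M>
    T param_as(const std::string& name) const {
        std::istringstream in(param<M>(name));
        T value;
        char extra;
        if (!(in >> value) || (in >> extra))
            throw std::invalid_argument("cannot convert parameter '" + name + "'");
        return value;
    }

private:
    std::map<std::string, std::string> get_;
    std::map<std::string, std::string> post_;
};
```

With this in place the condition from the example reduces to two calls of req.param_as<int, POST>("user_id"), and a malformed value surfaces as an exception instead of a silent zero. Note that for T = std::string this stream-based version would only accept a single whitespace-free token, so param<>() remains the right call for raw text.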
    