Russian Forms handling

In your HTML document:

  1. Use the ACCEPT-CHARSET attribute with the <FORM> tag, as the I18N draft says. It must contain a comma-separated list of the character sets the server accepts (in Accept-Charset header field format, but without any q= quality parameters; see How to request KOI8-R documents).

  2. Use the POST method; it is impossible to determine the character set of GET method arguments.

  3. The ACCEPT-CHARSET attribute affects all <INPUT> and <TEXTAREA> elements of the <FORM>. If you want a different character set for each element, you must use an ENCTYPE=multipart/form-data form; see Form-based File Upload in HTML (RFC 1867) for more information.

For example:

<FORM METHOD=POST ACCEPT-CHARSET="koi8-r, us-ascii" ACTION="cgi-bin/guestbook.cgi">
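The ACCEPT-CHARSET value has the same shape as an Accept-Charset header value with the q= parameters removed. As a minimal sketch of that conversion (the function name accept_charset_attribute is hypothetical and not part of any standard or library), the parameters could be stripped like this in Python:

```python
def accept_charset_attribute(header_value):
    """Turn an Accept-Charset header value into an ACCEPT-CHARSET attribute value.

    Hypothetical helper: keeps the character set names and drops ";q=..."
    (and any other parameters) from each comma-separated item.
    """
    names = []
    for item in header_value.split(","):
        # Everything before the first ';' is the character set name.
        name = item.split(";", 1)[0].strip()
        if name:
            names.append(name)
    return ", ".join(names)
```

Calling accept_charset_attribute("koi8-r;q=1.0, us-ascii;q=0.5") gives the string used in the <FORM> example above, "koi8-r, us-ascii".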

In your CGI script:

  1. A correct browser must supply a charset=name attribute in the Content-Type header field. For example:

    Content-Type: application/x-www-form-urlencoded; charset=KOI8-R

    The value of this header field is accessible in a CGI script via the CONTENT_TYPE environment variable. You can check how your browser handles this using the <FORM> input test. If a character set is present there, extract it and pass it as an argument to your external document character set converter.

    Another standard variant is ENCTYPE=multipart/form-data, but in this case the browser must accompany each part of the multipart message with a correct charset=name in that part's Content-Type field. I don't know of any browser that does this, so try to avoid this ENCTYPE.

  2. If the character set is not present, assume it is the same as the one you specified in the ACCEPT-CHARSET <FORM> attribute and pass that to the character set converter.
    WARNING: this works only when the ACCEPT-CHARSET attribute names a single character set, or several character sets where one is a compatible subset of another (for example, us-ascii and koi8-r).
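
The two CGI steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names FORM_CHARSET, charset_from_content_type and decode_form are hypothetical, the form is assumed to declare the single character set koi8-r, and Python's built-in codecs stand in for the external character set converter mentioned in the text.

```python
from urllib.parse import parse_qs

# Assumption: the one character set named in the form's ACCEPT-CHARSET attribute.
FORM_CHARSET = "koi8-r"


def charset_from_content_type(content_type, default=FORM_CHARSET):
    """Extract charset=name from a Content-Type value (step 1).

    Falls back to the form's own character set when none is given (step 2).
    """
    # Skip the media type itself; look only at the ';'-separated parameters.
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "charset" and value:
            return value.lower()
    return default


def decode_form(body, content_type):
    """Decode an application/x-www-form-urlencoded body into text fields."""
    charset = charset_from_content_type(content_type)
    # parse_qs percent-decodes each value and interprets the resulting bytes
    # in the given encoding; here it plays the role of the converter.
    return parse_qs(body, keep_blank_values=True,
                    encoding=charset, errors="replace")
```

In a real CGI script the content_type argument would come from the CONTENT_TYPE environment variable and body from standard input, read up to CONTENT_LENGTH bytes. The fallback is subject to the same warning as step 2: it is only safe when the form declares one character set or compatible ones.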
