MakeWebGames

[FAQ] PHP and HTML documents standards compliance


mdshare

Although this is a PHP/MySQL board, I cannot stress enough the importance of familiarizing yourselves with current HTML standards. Some Wrox PHP books are poorly written when it comes to HTML standards compliance, and I have seen many examples of poorly formed HTML documents in posts on this and other PHP forums. So I put together this FAQ to address how a PHP programmer *should* utilize the emerging HTML standards.

Why:

As professional programmers (or aspiring beginners), we are expected by nearly everyone in our professional lives to know not only today's standards but also those emerging on the horizon. Clients expect scalable, cost-effective and user-friendly interfaces, and also a site that will function in tomorrow's browsers and keep pace with emerging technologies and the standards issued to make those technologies work.

Making use of PHP's indifference to the HTML it outputs is the first step toward this goal of scalability. Commonly used mark-up is wrapped in user-defined functions or defined in constants or variables. The goal is a template that can conform dynamically to any user agent or be updated to fit any standard, past or future. This is especially vital in a large project, where updating thousands of pages to conform to a new standard would be costly in time, money and the programmer's sanity.
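
As a rough illustration of wrapping mark-up in a function and a constant, here is a minimal sketch. The names EMPTY_TAG_CLOSE and empty_tag() are made up for this example and are not part of PHP; the idea is only that the closing style lives in one place.

```php
<?php
// Hypothetical constant: the single place to edit if the target standard changes.
// ' />' suits XHTML; '>' would suit plain HTML 4.01.
define('EMPTY_TAG_CLOSE', ' />');

// Hypothetical helper: builds an empty element such as <input ... />.
function empty_tag(string $name, array $attributes = []): string
{
    $html = '<' . strtolower($name);
    foreach ($attributes as $attr => $value) {
        // Lowercase attribute names, quoted and escaped values.
        $html .= ' ' . strtolower($attr) . '="'
               . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '"';
    }
    return $html . EMPTY_TAG_CLOSE;
}

echo empty_tag('INPUT', ['TYPE' => 'text', 'NAME' => 'user']), "\n";
```

Every page that calls the helper picks up a change to the constant, which is the point of the approach: thousands of pages do not need to be edited by hand.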

It is also vital to pick up HTML standards early. Earlier versions of HTML were loose about mark-up: tags were case-insensitive, quoting attribute values was optional, and some elements had no closing tag. The latest and future versions challenge this looseness more and more, and XML and XHTML do not accept the loose syntax at all. HTML is not really a programming language but a method of describing a page, yet it is closely tied to the world of real-language programming. Delimiting a value with quotes not only makes sense in programming terms but also lets HTML fit in with its cousins: PHP, ASP, JavaScript, etc.

The HTML authority:

Like PHP, HTML has a group of people assembled to discuss, plan and create its standards. The difference (besides the obvious) is that this group does not exert complete control over the language; it merely makes recommendations on how it should be designed, implemented and utilized to the gigantic companies whose proprietary technologies make use of it. The first stop for any HTML/CSS-related question should be the W3C (World Wide Web Consortium) website.

http://www.w3c.org/

http://www.w3c.org/MarkUp/

Good practice, how to write standards compliant HTML:

Document Type Declaration:

First there is the Doctype setting. It is a single statement placed at the beginning of an HTML document, before the opening HTML tag. It tells the browser how much 'looseness' to allow in the HTML code that follows. This is new to HTML and should accompany any well-formed HTML document. Different settings do different things. The strict setting does not allow the use of deprecated tags (<font>, etc.), and it corrects many of the inconsistencies between browsers in how positioning and size are rendered. There are still minor (and very annoying) inconsistencies, but the setting does help.

http://www.w3.org/TR/REC-html40/struct/global.html
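
For illustration, a small sketch of keeping the declaration in one PHP function. The function name doctype() and its flavor keys are invented for this example; the two declarations are the W3C ones for HTML 4.01 Strict and XHTML 1.0 Strict.

```php
<?php
// Hypothetical helper: returns the doctype declaration for a named flavor.
function doctype(string $flavor = 'html401-strict'): string
{
    $doctypes = [
        'html401-strict' => '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
            . '"http://www.w3.org/TR/html4/strict.dtd">',
        'xhtml10-strict' => '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
            . '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    ];
    if (!isset($doctypes[$flavor])) {
        throw new InvalidArgumentException("Unknown doctype flavor: $flavor");
    }
    return $doctypes[$flavor];
}

// The declaration must be the first thing sent, before the opening html tag.
echo doctype(), "\n";
```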

Case-sensitive tags:

The looseness of HTML and the quickly emerging XML clashed when it came to marrying the two. XML is case-sensitive, so XHTML requires element and attribute names to be written in lowercase. From a programming point of view this seems the natural path for mark-up to take. Tags and attributes should therefore be written in all lowercase letters, which avoids problems when and if a transition to XHTML or XML is made. HTML is rapidly becoming intertwined with XML; this is where XHTML comes from, and this will be the HTML of tomorrow.

See also:

http://www.w3.org/TR/xhtml1/#h-4.2

Quoted values:

All HTML attribute values should be delimited with quotation marks.

<FORM METHOD=POST ACTION=some_page.php>

 

Should now be:

<form method="post" action="some_page.php">

or

<form method='post' action='some_page.php'>

 

Single or double quoting doesn't matter. Within PHP, use whichever style avoids having to escape the quotes inside the surrounding PHP string.
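
A short sketch of how the choice of quote plays out inside PHP strings. The variable names and the value of $action are assumptions for the example.

```php
<?php
$action = 'some_page.php'; // assumed value for illustration

// Single-quoted PHP string: double quotes in the mark-up need no escaping.
$a = '<form method="post" action="' . $action . '">';

// Double-quoted PHP string: single-quoted attributes need no escaping
// and the variable is interpolated directly.
$b = "<form method='post' action='$action'>";

// Same quote type inside and out forces backslash escapes, which is harder to read.
$c = "<form method=\"post\" action=\"$action\">";

echo $a, "\n", $b, "\n", $c, "\n";
```

The first and third lines produce identical output; the first is simply easier to type and to read.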

See also:

http://www.w3.org/TR/xhtml1/#h-4.4

Single name attributes:

Scattered throughout HTML are a few oddball single-name attributes, written with no value. This is called attribute minimization, and it is not supported in XHTML.

 

<INPUT TYPE=CHECKBOX NAME=my_box CHECKED>

 

Should now be:

 

<input type="checkbox" name="my_box" checked="checked" />

The input element is an empty element, so it cannot wrap its label text. The text goes after the tag:

<input type="checkbox" name="my_box" checked="checked" /> Some checkbox text

 

checked="checked" is backwards compatible and XHTML compliant.

Ending a tag with '... />' is the XHTML compliant method of closing a tag. XHTML requires every element to have both an opening and a closing tag, so this syntax is provided as a shortcut for elements that have only one tag. As long as the tag is written as '<br />', with a space between the last letter and the forward slash, this method is also backwards compatible.

See also:

http://www.w3.org/TR/xhtml1/#h-4.5

http://www.w3.org/TR/xhtml1/#guidelines
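
One way to keep minimized attributes out of generated pages is to let a function write the full form. This is only a sketch; checkbox() is a made-up helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: renders a checkbox with the full checked="checked"
// form and the ' />' ending (note the space before the slash).
function checkbox(string $name, bool $checked = false): string
{
    $html = '<input type="checkbox" name="'
          . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '"';
    if ($checked) {
        $html .= ' checked="checked"';
    }
    return $html . ' />';
}

echo checkbox('my_box', true), "\n";
echo checkbox('other_box'), "\n";
```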

Other examples of transitioning from the old method to the standards compliant method:

 

<select name=my_select multiple>
   <option value=option1>option1
</select>

 

Should really be written:

 

<select name="my_select" multiple="multiple">
   <option value="option1">option1</option>
</select>
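
If the option list comes from PHP data, a small function can guarantee that every option is quoted and closed. A minimal sketch, assuming a hypothetical helper named select_multiple() and an array of value => label pairs:

```php
<?php
// Hypothetical helper: builds a multiple select with quoted attributes,
// multiple="multiple" and a closing tag on every option.
function select_multiple(string $name, array $options): string
{
    $html = '<select name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8')
          . '" multiple="multiple">' . "\n";
    foreach ($options as $value => $label) {
        $html .= '   <option value="'
               . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '">'
               . htmlspecialchars((string) $label, ENT_QUOTES, 'UTF-8')
               . "</option>\n";
    }
    return $html . '</select>';
}

echo select_multiple('my_select', ['option1' => 'option1']), "\n";
```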

 

 

<font face=Arial size=10>Some formatted text</font>

 

This tag is now deprecated and should be translated to a CSS equivalent:

<span style="font-family: Arial; font-size: 10pt;">Some formatted text</span>
<p style='font-family: Arial; font-size: 10pt;'>
... 
</p>

Style sheets are a much easier method of declaring fonts, as well as a smörgåsbord of other visual properties: borders, margins, padding, positioning, etc.

Mark-up used purely for spacing is also deprecated and should be translated to a CSS equivalent:

<span style="margin-left: 5px;"></span>

 

<br>

is now

<br />

All single name tags should use the slash method to close the tag.

See also:

http://www.w3.org/TR/xhtml1/#h-4.6

An example of a standards-compliant image tag:

<img src="some_picture.jpg" alt="Some picture" />

Alt text is required to accommodate people with visual disabilities and is now part of the HTML 4.01 standard.
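
To make it hard to forget the alt text, the tag can be built by a function that demands it. A sketch only; img_tag() is an invented name and the alt text shown is an example value.

```php
<?php
// Hypothetical helper: both arguments are required, so an image
// cannot be emitted without alt text.
function img_tag(string $src, string $alt): string
{
    return '<img src="' . htmlspecialchars($src, ENT_QUOTES, 'UTF-8')
         . '" alt="' . htmlspecialchars($alt, ENT_QUOTES, 'UTF-8') . '" />';
}

echo img_tag('some_picture.jpg', 'Tom & Jerry'), "\n";
```

htmlspecialchars() also turns the & in the alt text into &amp;, which keeps the attribute value valid.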

Argument separators:

URL argument separators should also be standards compliant. The W3C recommends that a semi-colon be used instead of an ampersand to separate URL embedded arguments. PHP.net recommends that &amp; be used in the HTML instead (which does not require a php.ini change).

The URL argument separator is a directive in php.ini and can be set to follow either recommendation.

[snip from php.ini]

; The separator used in PHP generated URLs to separate arguments.

; Default is "&".

;arg_separator.output = "&"

; List of separator(s) used by PHP to parse input URLs into variables.

; Default is "&".

; NOTE: Every character in this directive is considered as separator!

;arg_separator.input = ";&"

See also: http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2

See also: http://www.php.net/manual/en/function.urlencode.php

And: http://www.w3.org/TR/xhtml1/#C_12
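
Without touching php.ini, the separator can also be chosen per call. The sketch below uses the built-in http_build_query() and htmlspecialchars(); the parameter names and the page name are assumptions for the example.

```php
<?php
$params = ['id' => 5, 'page' => 'news']; // hypothetical URL arguments

// The third argument overrides arg_separator.output for this call only.
$query = http_build_query($params, '', '&amp;');
echo '<a href="some_page.php?' . $query . '">News</a>', "\n";

// Alternative: build the URL with a plain "&" and escape it when it is
// written into the HTML attribute.
$url = 'some_page.php?' . http_build_query($params, '', '&');
echo '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">News</a>', "\n";
```

Both lines print the same link. The second form keeps the raw URL usable for things like header() redirects, where &amp; would be wrong.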

XML Compliant Delimiters:

The W3C also has recommendations for the use of server-side language delimiters.

<?php ?> - Is XML compliant, guaranteed to be portable and is the preferred method.

<? ?> - (short tags) NOT XML compliant, not guaranteed to be portable as this can be deactivated in php.ini

<% %> - ASP style delimiters, not XML compliant, and again not guaranteed to be portable as this is another php.ini setting (off by default). Removed in PHP 7.0.

<script language="php"></script> - (JavaScript style delimiters) NOT XML compliant but guaranteed to be portable in PHP 4 and 5. Removed in PHP 7.0.

See also: http://www.php.net/manual/en/language.basic-syntax.php

Further resources:

http://www.w3.org/TR/html401

http://www.w3.org/TR/xhtml1

http://www.w3c.org/XML
