Your pages could be rewritten to replace your customers' names with "Dummy." Or private information could be intercepted and sent to a secret depository for later use. What can we do about this? There are many ways a hacker may attack or take control of a site. I am focusing this discussion on attacks that come via form input, that is, anywhere input comes in from your internet users, e.g. a registration form, a user login or even a search box on your site.
Scripts could be sent to your server by entering <script> some malicious code </script> in your input fields. The following are steps you can take to reduce the risk of this happening. These measures will not make your site hacker-proof (no site can be if a hacker really has it in for you), but they can make it less of an easy target.
Step 1: Place character limits on your inputs
You do this by adding the "maxlength" attribute to your text input tags, e.g. <input type="text" name="firstname" maxlength="15">. The example above restricts the user to 15 characters for that field. The "<script>" and "</script>" tags alone take up 17 characters, so the lower you set "maxlength", the harder it will be to include rogue code in your inputs. Of course, you must choose a limit that does not exclude genuine input from your valid users.
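The "maxlength" attribute is enforced only by the browser, so a request put together by hand can ignore it. It is therefore worth repeating the same limit on the server. The following is a minimal Python sketch of that check, assuming the form data arrives as a dictionary of field names and values. The names MAX_LENGTHS and check_lengths are hypothetical and chosen only for this illustration.

```python
# Hypothetical per-field limits, mirroring the maxlength values in the form.
MAX_LENGTHS = {"firstname": 15}


def check_lengths(form, limits=MAX_LENGTHS):
    """Return the names of the fields whose values exceed their limit."""
    too_long = []
    for name, limit in limits.items():
        value = form.get(name, "")
        if len(value) > limit:
            too_long.append(name)
    return too_long


if __name__ == "__main__":
    # A normal name passes; an input carrying script tags is too long.
    print(check_lengths({"firstname": "Alice"}))
    print(check_lengths({"firstname": "<script>x</script>"}))
```

A request that reports any field as too long can be rejected before the rest of the processing runs.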
Step 2: Filtering your data
All data your site receives should be filtered. You can filter it either when it comes into your server as user input, or when it goes out as results to your user's browser. Whether you should filter input or output depends on your site and its requirements; there is a good discussion of this at http://www.cert.org/tech_tips/malicious_code_mitigation.html. Filters can be written in any language.
Step 3: Setting the character encoding
Some HTML editors already set this when they create a page, but those of you who have older HTML editors or who, like me, like to code the page from scratch will need to include the following line in your HTML pages: <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">. It should go as high as possible on your webpage; I normally place it just after the <head> tag, before the <title> tag. This META tag tells the browser to use the "ISO-8859-1" character set, which is suitable for most Western European languages, rather than letting the browser choose its own character encoding, which may or may not be ISO-8859-1.
Why is it important to set it explicitly? The character encoding basically tells browsers how to display a particular character. For example, in the ISO-8859-1 character set, "&#65;" represents the letter "A" while "&#169;" represents the copyright symbol "©". (You can try this out by typing <p>&#65;</p> or <p>&#169;</p> in an HTML file and opening it in a browser.) Some character sets have more than one representation for special characters such as "<", so your filter program may not toss out every representation of the character you have asked it to exclude. So when the server sends a new page back to the browser, the browser, because it has not been told what encoding to use, may pick one in which the malicious script still reads intact.
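Steps 2 and 3 can be sketched together in Python. This is only an illustration under stated assumptions: it filters on output, using the standard library's html.escape, and it builds a page that declares ISO-8859-1. The names filter_output and render_page and the page text are hypothetical. UTF-7 is used at the end purely as one example of a character set with a second representation of "<"; it is my choice of example, not one the discussion above names.

```python
import html

CHARSET = "ISO-8859-1"  # the encoding declared by the META tag in Step 3


def filter_output(text):
    """Replace &, <, > and quote characters with HTML entities."""
    return html.escape(text, quote=True)


def render_page(user_text):
    """Build a small page that declares its charset and echoes filtered input."""
    return (
        "<html><head>\n"
        f'<META http-equiv="Content-Type" content="text/html; charset={CHARSET}">\n'
        "<title>Search results</title>\n"
        "</head><body>\n"
        f"<p>You searched for: {filter_output(user_text)}</p>\n"
        "</body></html>\n"
    )


# The same script written in UTF-7, where "<" is "+ADw-" and ">" is "+AD4-".
disguised = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

if __name__ == "__main__":
    # Ordinary script tags are turned into harmless entities.
    print(filter_output("<script>alert(1)</script>"))
    # The disguised form contains nothing the filter looks for ...
    print(filter_output(disguised.decode("ascii")))
    # ... yet a reader that assumes UTF-7 sees the script tags again.
    print(disguised.decode("utf-7"))
```

The filter leaves the disguised input untouched because it contains no "<" or ">" in the encoding the filter assumes. Declaring the character set in the page is what stops the browser from reading those bytes under a different encoding.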