Wednesday, 8 September 2010


Introduction

CAPTCHA stands for "completely automated public Turing test to tell computers and humans apart." In other words, it is a program that tells humans from machines by generating a test that most people can pass easily but a computer program cannot.

You've probably encountered such tests when signing up for an online email or forum account. The form might include an image of distorted text, like that seen in the sample screenshot, which you are required to type into a text field.


The idea is to prevent spammers from using web bots to automatically post form data in order to create email accounts (for sending spam) or to submit feedback comments or guestbook entries containing spam messages. The text in the image is usually distorted to prevent the use of OCR (optical character recognition) software to defeat the process. Hotmail, PayPal, Yahoo and a number of blog sites have employed this technique.

This article demonstrates how to create such an image and employ it within an ASP.NET web form.

Background

You can find more information on CAPTCHA at The CAPTCHA Project and read about its use in foiling purveyors of pills, pr0n and pyramid-schemes in an article from Scientific American entitled Baffling the Bots.

Before using this technique, however, you should consider how it will affect your site's accessibility to blind and other visually impaired visitors. PayPal attempts to address this problem on its sign-up form by including a link to an audio file in which a voice spells out the image text.

The code presented here produces only an image, but if you had code to generate an audio file, you could easily integrate it.

Using the code

The source code zip file contains the source for one class and two web forms. To use it, just create a new web project and add those items.

Files

  • CaptchaImage.cs - defines the CaptchaImage object, which actually creates the image.
  • Default.aspx, Default.aspx.cs - a sample web form.
  • JpegImage.aspx, JpegImage.aspx.cs - a web form designed to output a JPEG image rather than HTML.

Let's look at each component and its purpose.

CaptchaImage.cs

The CaptchaImage object creates an image given parameters for the text to be displayed, the image dimensions and, optionally, the font to use.

The heart of the code is the GenerateImage() method, shown below, which creates a bitmap image of the specified width and height. This method is called from the CaptchaImage constructor, so the image is ready as soon as you create a new instance of the object.

To create the image, we first fill in the background using a hatched brush (the "dirtier" the image appears, the harder it is for an OCR program to read it).

To make the text fit within the image, we start with an initial font size based on the image height and use the Graphics.MeasureString() method to find the resulting dimensions of the drawn text. If the text exceeds the image dimensions, we reduce the font size and test again and again until a suitable font size is found.

// ====================================================================
// Creates the bitmap image.
// ====================================================================
private void GenerateImage()
{
  // Create a new 32-bit bitmap image.
  Bitmap bitmap = new Bitmap(this.width, this.height, PixelFormat.Format32bppArgb);

  // Create a graphics object for drawing.
  Graphics g = Graphics.FromImage(bitmap);
  g.SmoothingMode = SmoothingMode.AntiAlias;
  Rectangle rect = new Rectangle(0, 0, this.width, this.height);

  // Fill in the background.
  HatchBrush hatchBrush = new HatchBrush(HatchStyle.SmallConfetti, Color.LightGray, Color.White);
  g.FillRectangle(hatchBrush, rect);

  // Set up the text font.
  SizeF size;
  float fontSize = rect.Height + 1;
  Font font;
  // Adjust the font size until the text fits within the image.
  do
  {
    fontSize--;
    font = new Font(this.familyName, fontSize, FontStyle.Bold);
    size = g.MeasureString(this.text, font);
  } while (size.Width > rect.Width);

  // Set up the text format.
  StringFormat format = new StringFormat();
  format.Alignment = StringAlignment.Center;
  format.LineAlignment = StringAlignment.Center;

  // Create a path using the text and warp it randomly.
  GraphicsPath path = new GraphicsPath();
  path.AddString(this.text, font.FontFamily, (int) font.Style, font.Size, rect, format);
  float v = 4F;
  PointF[] points =
  {
    new PointF(this.random.Next(rect.Width) / v, this.random.Next(rect.Height) / v),
    new PointF(rect.Width - this.random.Next(rect.Width) / v, this.random.Next(rect.Height) / v),
    new PointF(this.random.Next(rect.Width) / v, rect.Height - this.random.Next(rect.Height) / v),
    new PointF(rect.Width - this.random.Next(rect.Width) / v, rect.Height - this.random.Next(rect.Height) / v)
  };
  Matrix matrix = new Matrix();
  matrix.Translate(0F, 0F);
  path.Warp(points, rect, matrix, WarpMode.Perspective, 0F);

  // Draw the text.
  hatchBrush = new HatchBrush(HatchStyle.LargeConfetti, Color.LightGray, Color.DarkGray);
  g.FillPath(hatchBrush, path);

  // Add some random noise.
  int m = Math.Max(rect.Width, rect.Height);
  for (int i = 0; i < (int) (rect.Width * rect.Height / 30F); i++)
  {
    int x = this.random.Next(rect.Width);
    int y = this.random.Next(rect.Height);
    int w = this.random.Next(m / 50);
    int h = this.random.Next(m / 50);
    g.FillEllipse(hatchBrush, x, y, w, h);
  }

  // Clean up.
  font.Dispose();
  hatchBrush.Dispose();
  g.Dispose();

  // Set the image.
  this.image = bitmap;
}

Once the font is set, we define a GraphicsPath() which essentially converts the text to a set of lines and curves. This can then be distorted using the GraphicsPath.Warp() method with some randomly generated values. The effect is similar to holding a cardboard sign up by opposite corners and giving it a bit of a twist. The resulting path is drawn onto the image, again using a hatch brush to give it a "dirty" appearance.

To complete the distortion, small blots are randomly painted over the image. You could experiment with other effects to thwart OCR software, but keep in mind that the text should still be legible to humans, some of whom may have visual impairments.

Default.aspx

The code-behind for the sample web form stores a random code in the Session object on the first request and checks the visitor's input against it on each postback.

private void Page_Load(object sender, System.EventArgs e)
{
  if (!this.IsPostBack)
    // Create a random code and store it in the Session object.
    this.Session["CaptchaImageText"] = GenerateRandomCode();
  else
  {
    // On a postback, check the user input.
    if (this.CodeNumberTextBox.Text == this.Session["CaptchaImageText"].ToString())
    {
      // Display an informational message.
      this.MessageLabel.CssClass = "info";
      this.MessageLabel.Text = "Correct!";
    }
    else
    {
      // Display an error message.
      this.MessageLabel.CssClass = "error";
      this.MessageLabel.Text = "ERROR: Incorrect, try again.";
      // Clear the input and create a new random code.
      this.CodeNumberTextBox.Text = "";
      this.Session["CaptchaImageText"] = GenerateRandomCode();
    }
  }
} 

The reason for storing the text string in the Session object is so that it can be accessed by JpegImage.aspx.
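
The page code above calls GenerateRandomCode(), which is not listed here. A minimal sketch of such a helper might look like the following. The CaptchaCode class name, the six-digit default length and the use of a cryptographic random number generator are all assumptions made for illustration, not the code from the download.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class CaptchaCode
{
    // Hypothetical stand-in for the GenerateRandomCode() helper called by
    // the page code. The default length of 6 digits is an assumption.
    public static string GenerateRandomCode(int length = 6)
    {
        if (length <= 0)
            throw new ArgumentOutOfRangeException("length");

        StringBuilder code = new StringBuilder(length);
        byte[] buffer = new byte[1];

        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            while (code.Length < length)
            {
                rng.GetBytes(buffer);
                // Skip values 250..255 so every digit 0-9 is equally likely.
                if (buffer[0] < 250)
                    code.Append((char)('0' + buffer[0] % 10));
            }
        }

        return code.ToString();
    }
}
```

The page's own GenerateRandomCode() method could simply return CaptchaCode.GenerateRandomCode(). A cryptographic generator is chosen here because a CAPTCHA code that can be predicted defeats the purpose of the test.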

JpegImage.aspx

For this web form, no HTML is needed (what's there is just the default code generated by Visual Studio when the file was created). Instead of HTML, the code will produce a JPEG image.

In the code-behind, we first create a CaptchaImage object, using the text retrieved from the Session object. This creates a bitmap image for us.

private void Page_Load(object sender, System.EventArgs e)
{
  // Create a CAPTCHA image using the text stored in the Session object.
  CaptchaImage ci = new CaptchaImage(this.Session["CaptchaImageText"].ToString(), 200, 50, "Century Schoolbook");
  // Change the response headers to output a JPEG image.
  this.Response.Clear();
  this.Response.ContentType = "image/jpeg";
  // Write the image to the response stream in JPEG format.
  ci.Image.Save(this.Response.OutputStream, ImageFormat.Jpeg);
  // Dispose of the CAPTCHA image object.
  ci.Dispose();
} 

We then modify the HTTP response headers to set the Content-type to "image/jpeg" so the client browser will know we are sending an image.

The last step is to retrieve the bitmap image from CaptchaImage.Image and write it to the HTTP response output stream in JPEG format. Fortunately, the Save() method of the Bitmap object makes this simple. Any other supported image format could be used as well so long as the Content-type header is set accordingly.
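
As a rough sketch of switching formats, the hypothetical helper below pairs an ImageFormat with a matching content type and encodes the image into memory before it is sent. The ImageEncoding class and its method names are assumptions, not part of the download. Buffering is used because some encoders, PNG being the commonly reported case, fail when asked to save directly to a non-seekable stream such as Response.OutputStream.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class ImageEncoding
{
    // Hypothetical helper: maps a few common formats to their MIME types.
    public static string ContentTypeFor(ImageFormat format)
    {
        if (ImageFormat.Jpeg.Equals(format)) return "image/jpeg";
        if (ImageFormat.Png.Equals(format)) return "image/png";
        if (ImageFormat.Gif.Equals(format)) return "image/gif";
        throw new ArgumentException("No content type known for this format.", "format");
    }

    // Encodes the image into a byte array via a seekable in-memory stream.
    public static byte[] Encode(Image image, ImageFormat format)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            image.Save(buffer, format);
            return buffer.ToArray();
        }
    }
}
```

In JpegImage.aspx.cs, the last two steps might then set this.Response.ContentType to ImageEncoding.ContentTypeFor(ImageFormat.Png) and pass ImageEncoding.Encode(ci.Image, ImageFormat.Png) to this.Response.BinaryWrite().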

Points of Interest

Because the CaptchaImage class contains a Bitmap object, and bitmaps employ unmanaged resources, a custom Dispose() method is implemented. This allows those unmanaged resources to be freed as soon as the caller has finished with a CaptchaImage, instead of waiting for garbage collection.
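
The general shape of such a class can be sketched as follows. BitmapOwner is a hypothetical, simplified stand-in written for illustration, not the actual CaptchaImage source; it assumes the class is sealed and owns nothing but the bitmap, so no finalizer is needed.

```csharp
using System;
using System.Drawing;

// Hypothetical, simplified stand-in for CaptchaImage: it only shows how a
// class that owns a Bitmap can release it deterministically.
public sealed class BitmapOwner : IDisposable
{
    private Bitmap image;

    public BitmapOwner(int width, int height)
    {
        this.image = new Bitmap(width, height);
    }

    // Exposes the bitmap, refusing access once it has been released.
    public Bitmap Image
    {
        get
        {
            if (this.image == null)
                throw new ObjectDisposedException("BitmapOwner");
            return this.image;
        }
    }

    public bool IsDisposed
    {
        get { return this.image == null; }
    }

    // Releases the bitmap. Calling Dispose() more than once is harmless.
    public void Dispose()
    {
        if (this.image != null)
        {
            this.image.Dispose();
            this.image = null;
        }
    }
}
```

A caller would normally wrap the object in a using statement, for example using (BitmapOwner owner = new BitmapOwner(200, 50)) { ... }, so that Dispose() runs even if an exception is thrown while the image is being written.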

Downloads

  1. captcha_image class file C#
  2. captcha_image class file VB
