Sunday, December 02, 2012

How to store millions of Images in best possible way in

I have a requirement to design an architecture that stores millions of images in the best possible way.

Requirement 1:
Each image gets a unique name and path generated from the MD5 hash code of the image. Because identical images produce the same hash code, duplicate images can be avoided.
Example: If 100 users upload the exact same image, you keep only one copy of that image on your file system instead of 100 copies.
Example of an MD5 hash code generated from an image: 8b68925628110278bf194cbe5e071654
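
To make the "100 users, one copy" idea concrete, here is a minimal sketch that names content by its MD5 hash and tracks which names have already been seen. The class name DedupIndex and its in-memory HashSet are my own assumptions for illustration; a real site would keep this index in a database or simply check the file system.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical helper (not part of the original post): an in-memory index
// of hash-based image names, used only to illustrate de-duplication.
public class DedupIndex
{
    private readonly HashSet<string> _names = new HashSet<string>();

    // Number of distinct images seen so far.
    public int Count { get { return _names.Count; } }

    // Returns the 32-character lowercase hex MD5 name for the content.
    // isNew is true only the first time this exact content is added.
    public string Add(byte[] content, out bool isNew)
    {
        string name;
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(content);
            // BitConverter yields "8B-68-..."; strip the dashes and lowercase it.
            name = BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
        isNew = _names.Add(name);
        return name;
    }
}
```

Adding the same bytes 100 times leaves Count at 1, because every upload maps to the same name. Keep in mind that MD5 is no longer considered collision-resistant, so if uploads may be adversarial a SHA-256 hash is a common substitute for the same pattern.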

Requirement 2:
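Store images in a directory pattern that keeps directory sizes manageable.
Example of a directory structure derived from the image hash code above, using the folder split from the code below (the .jpg extension is only illustrative): userimages\orig\8\b6\892\8b68925628110278bf194cbe5e071654.jpg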
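
As a rough sketch of how the directory pattern falls out of the hash, the helper below splits the first six hex characters into folders of 1, 2, and 3 characters, the same split the saving code later in this post uses. The class name ShardedPath and the Build method are hypothetical names I made up for illustration. With hex characters this gives 16 × 256 × 4096 = 16,777,216 possible leaf folders, so even many millions of images leave only a handful of files in each folder.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original post): builds a sharded
// relative path from an image hash without touching the file system.
public static class ShardedPath
{
    // Returns root/size/h[0]/h[1..2]/h[3..5]/hash+extension,
    // joined with the platform's directory separator.
    public static string Build(string root, string size, string hash, string extension)
    {
        if (hash == null || hash.Length < 6)
        {
            throw new ArgumentException("hash must have at least 6 characters", "hash");
        }

        string first = hash.Substring(0, 1);   // 16 possible folders
        string second = hash.Substring(1, 2);  // 256 possible folders
        string third = hash.Substring(3, 3);   // 4096 possible folders

        string directory = Path.Combine(root, size, first, second, third);
        return Path.Combine(directory, hash + extension);
    }
}
```

Calling ShardedPath.Build("userimages", "orig", hash, ".jpg") with the example hash above yields the folders 8, b6, and 892 under userimages and orig.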

Please note: This article only discusses the core concept of how to generate a hash code for an image and how to store the image by creating a proper directory structure on the file system. You can add a lot on top of this, but it is a good starting point.

protected void btnUploadImage_Click(object sender, EventArgs e)
{
    UploadImage();
}

private void UploadImage()
{
    if (!(FileUpload1.HasFile))
    {
        //Image is invalid
        //Reason: one of the following
        //1) Non-supported image file format
        //2) No image file found
        //3) Invalid image (e.g. a text file renamed to .jpg)
        //Display an appropriate message to the user
        //Please note: code to validate the image is NOT included,
        //in order to keep the article focused on the core concept
    }
    else
    {
        //Valid image
        string ImgHashCode = Image2Md5Hash();
        lblImageHashCode.Text = ImgHashCode;

        //Todo: Write code to save HashCode.filetype in the database

        //Save file on the file system
        SaveImageonFileSystem(ImgHashCode);
    }
}

Answer for Requirement 1:
//Image to MD5 hash code - generating the hash code for the image
private string Image2Md5Hash()
{
    Stream theStream = FileUpload1.PostedFile.InputStream;

    //Hash from the first byte; reading part of the stream before hashing
    //would leave those bytes out of the hash
    theStream.Position = 0;
    string hash = CalculateMD5(theStream);

    //Rewind so the stream can be read again later
    theStream.Position = 0;
    return hash;
}

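private static readonly byte[] _emptyBuffer = new byte[0];

public static string CalculateMD5(Stream stream)
{
    return CalculateMD5(stream, 64 * 1024);
}

//Requires: using System.IO; using System.Security.Cryptography; using System.Text;
public static string CalculateMD5(Stream stream, int bufferSize)
{
    using (MD5 md5Hasher = MD5.Create())
    {
        byte[] buffer = new byte[bufferSize];
        int readBytes;

        //Feed the stream to the hasher one buffer at a time
        while ((readBytes = stream.Read(buffer, 0, bufferSize)) > 0)
        {
            md5Hasher.TransformBlock(buffer, 0, readBytes, buffer, 0);
        }

        //Finish the hash with an empty final block
        md5Hasher.TransformFinalBlock(_emptyBuffer, 0, 0);

        //Format each hash byte as two lowercase hex digits
        var sb = new StringBuilder();
        foreach (byte b in md5Hasher.Hash)
        {
            sb.Append(b.ToString("x2"));
        }
        return sb.ToString();
    }
}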
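
If you do not need control over the buffer size, the manual read loop can likely be replaced by a single call, since HashAlgorithm.ComputeHash accepts a Stream and reads it to the end. The sketch below is an alternative to the method above, not a change to it; the class name StreamHash is a hypothetical name used only for this example.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (not part of the original post): a shorter way to
// get the same lowercase hex MD5 string from a stream.
public static class StreamHash
{
    // Hashes the stream from its current position to the end.
    public static string Md5Hex(Stream stream)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(stream);
            // BitConverter yields "5D-41-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

As with the loop version, the stream is hashed from wherever its position currently is, so make sure it starts at the first byte before calling this.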

Answer for Requirement 2:
private void SaveImageonFileSystem(string ImgHashCode)
{
    string RootDirPath = "userimages";
    string ImageSize = "orig";
    string FirstFolder = ImgHashCode.Substring(0, 1);
    string SecondFolder = ImgHashCode.Substring(1, 2);
    string ThirdFolder = ImgHashCode.Substring(3, 3);

    //Path.Combine inserts the directory separators,
    //including the one after the application root
    string DirectoryName = Path.Combine(Server.MapPath("~/"),
                                        RootDirPath,
                                        ImageSize,
                                        FirstFolder,
                                        SecondFolder,
                                        ThirdFolder);

    if (!Directory.Exists(DirectoryName))
    {
        Directory.CreateDirectory(DirectoryName);
    }

    FileUpload1.SaveAs(Path.Combine(DirectoryName, ImgHashCode + Path.GetExtension(FileUpload1.FileName)));
}
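
The save step above writes the file on every upload, even when the same image is already stored. Since the file name is the hash, one way to keep a single copy is to skip the write when the file already exists. Here is a minimal sketch of that idea, assuming the full target path has already been built; the class name ImageWriter and the method SaveIfAbsent are hypothetical names for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original post): writes a file only
// when no file with the same hash-based path is already stored.
public static class ImageWriter
{
    // Returns true if the content was written, false if the file already existed.
    public static bool SaveIfAbsent(string fullPath, Stream content)
    {
        if (File.Exists(fullPath))
        {
            return false; // same hash name already stored, nothing to do
        }

        string directory = Path.GetDirectoryName(fullPath);
        if (!string.IsNullOrEmpty(directory))
        {
            // No-op when the directory already exists.
            Directory.CreateDirectory(directory);
        }

        // CreateNew throws IOException if another request created the file
        // between the check above and this call, so an existing copy is
        // never overwritten.
        using (var output = new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write))
        {
            content.CopyTo(output);
        }
        return true;
    }
}
```

In the upload handler this could be called with FileUpload1.PostedFile.InputStream (positioned at the first byte) in place of FileUpload1.SaveAs.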

