.net, C#, Code, Microsoft, Standards, Technical, XML

Strip Illegal XML Characters based on W3C standard

The W3C defines which characters are restricted or discouraged in XML. The relevant sections of the specifications are here:

XML 1.0: http://www.w3.org/TR/2006/REC-xml-20060816/#charsets | XML 1.1: http://www.w3.org/TR/xml11/#charsets

Here is a function that strips hexadecimal references to these characters from a specified XML file:

using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace XMLUtils
{
    class Standards
    {
        /// <summary>
        /// Strips hexadecimal character references to characters that the
        /// W3C XML specification restricts or discourages.
        /// Refer to http://www.w3.org/TR/xml11/#charsets for XML 1.1
        /// Refer to http://www.w3.org/TR/2006/REC-xml-20060816/#charsets for XML 1.0
        /// </summary>
        /// <param name="filePath">Full path to the file</param>
        /// <param name="xmlVersion">XML specification to use. Can be "1.0" or "1.1"</param>
        public static void StripIllegalXMLChars(string filePath, string xmlVersion)
        {
            // Each pattern matches a complete reference such as &#x7F; so that
            // no stray "&" or ";" is left behind and longer code points that
            // merely start with the same digits are not touched.
            string pattern;
            switch (xmlVersion)
            {
                case "1.0":
                    pattern = @"&#x0*((10?|[2-9A-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]|9[0-9A-F]);";
                    break;
                case "1.1":
                    pattern = @"&#x0*((10?|[2-9A-F])FFF[EF]|FDD[0-9A-F]|[19][0-9A-F]|7F|8[0-46-9A-F]|[1-8BCEF]);";
                    break;
                default:
                    throw new ArgumentException("Invalid XML version: expected \"1.0\" or \"1.1\".", "xmlVersion");
            }

            string contents = File.ReadAllText(filePath, Encoding.UTF8);
            Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
            string stripped = regex.Replace(contents, String.Empty);

            // Rewrite the file only when something was actually removed.
            if (stripped != contents)
            {
                File.WriteAllText(filePath, stripped, Encoding.UTF8);
            }
        }
    }
}
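The function above works on hexadecimal references. If the file instead contains the raw characters themselves, a character-by-character filter is one way to handle it. The following is only a minimal sketch, assuming the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF) is the rule you want to enforce; the class and method names (XmlCharFilter, RemoveInvalidXml10Chars) are hypothetical and not part of the original function.

```csharp
using System;
using System.Text;

public static class XmlCharFilter
{
    // Hypothetical helper, not part of the original post.
    // Keeps only code points allowed by the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static string RemoveInvalidXml10Chars(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                // A well-formed surrogate pair encodes a code point in
                // [#x10000-#x10FFFF], which the production allows.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (c == '\t' || c == '\n' || c == '\r'
                     || (c >= '\u0020' && c <= '\uD7FF')
                     || (c >= '\uE000' && c <= '\uFFFD'))
            {
                sb.Append(c);
            }
            // Anything else (control characters, lone surrogates,
            // U+FFFE, U+FFFF) is dropped.
        }
        return sb.ToString();
    }
}
```

For example, XmlCharFilter.RemoveInvalidXml10Chars("a\u0001b") returns "ab". Note that this sketch does not remove the merely discouraged ranges such as #x7F-#x84; those would need extra conditions.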
.net, C#, Code, Microsoft, Security, Technical

Quickly calculate and compare MD5 or SHA2 values

There may be times when you want to quickly calculate the MD5 or SHA2 hash of a file or a set of files. The right approach depends on your requirement: 1) it is a one-time job and you just want the results in a text or HTML file, or 2) you want the hash inside your .NET program.

Scenario 1: A convenient option is to download a program called HashMyFiles from nirsoft.net.
This nifty little program can generate an HTML report containing the hashes of the specified file, or of all the files in the specified folder.

Scenario 2: Here is a C# version that creates the hashes using the built-in cryptography classes.

using System.IO;
using System.Security.Cryptography;
using System.Text;

public string GetMD5(string filePath)
{
    StringBuilder sb = new StringBuilder();
    // The using blocks close the stream and release the hash object,
    // even if ComputeHash throws.
    using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    using (MD5 md5 = MD5.Create())
    {
        foreach (byte b in md5.ComputeHash(fs))
        {
            //Returns hash in lower case.
            //To return upper case change "x2" to "X2"
            sb.Append(b.ToString("x2"));
        }
    }
    return sb.ToString();
}

public string GetSHA2(string filePath)
{
    StringBuilder sb = new StringBuilder();
    // The using blocks close the stream and release the hash object,
    // even if ComputeHash throws.
    using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    using (SHA256 sha256 = SHA256.Create())
    {
        foreach (byte b in sha256.ComputeHash(fs))
        {
            //Returns hash in lower case.
            //To return upper case change "x2" to "X2"
            sb.Append(b.ToString("x2"));
        }
    }
    return sb.ToString();
}
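
The functions above cover the calculation; the comparison itself can be a case-insensitive string check against a known value. Here is a minimal sketch, assuming you have the expected hash as a hex string (for example, one published next to a download). The class and method names (HashCompare, Sha256Hex, MatchesExpected) are hypothetical and only for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class HashCompare
{
    // Hypothetical helper: returns the SHA-256 hash of a file as lower-case hex.
    public static string Sha256Hex(string filePath)
    {
        using (FileStream fs = File.OpenRead(filePath))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(fs);
            // BitConverter yields "AB-CD-..."; strip the dashes and lower-case it.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Hypothetical helper: true when the file's hash equals the expected
    // hex string, ignoring case and surrounding whitespace.
    public static bool MatchesExpected(string filePath, string expectedHex)
    {
        return string.Equals(Sha256Hex(filePath), expectedHex.Trim(),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

Comparing with StringComparison.OrdinalIgnoreCase means it does not matter whether the published value is in upper or lower case.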