C#, Microsoft, Technical, Tips, XML

Case insensitive XPath query

I wanted to do a case-insensitive XPath lookup in my C# .NET application. I could not find a direct way to do it, since XPath 1.0 has no lower-case function, but fortunately the workaround is not difficult. Let's consider the following example:

<xml>
<books>
<book id="1" name="Book1" type="fiction" />
<book id="2" name="Book2" type="nonfiction" />
<book id="1" name="Book1" type="FICTION" />
</books>
</xml>

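Here is the XPath query that returns all fiction books, however the type attribute is capitalized. The translate() call maps every uppercase letter in @type to its lowercase counterpart before the comparison: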

"books/book[translate(@type, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = 'fiction']"
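
To show how the query might be used from C#, here is a minimal sketch built on XmlDocument.SelectNodes. The class name CaseInsensitiveXPathDemo, the method CountBooksOfType and the root element name library are my own assumptions for illustration; any root element works, because the query is evaluated relative to the document element.

```csharp
using System;
using System.Xml;

public static class CaseInsensitiveXPathDemo
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: counts <book> elements whose type attribute
    // equals the requested type, ignoring ASCII letter case.
    public static int CountBooksOfType(string xml, string type)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Lowercase the search term in C# and the attribute value in XPath.
        // Assumption: 'type' contains no single quotes; it is pasted into the query as-is.
        string query = "books/book[translate(@type, '" + Upper + "', '" + Lower + "') = '"
                       + type.ToLowerInvariant() + "']";

        // The query is relative, so it is evaluated from the root element.
        XmlNodeList matches = doc.DocumentElement.SelectNodes(query);
        return matches.Count;
    }

    public static void Main()
    {
        string xml =
            "<library><books>" +
            "<book id=\"1\" name=\"Book1\" type=\"fiction\" />" +
            "<book id=\"2\" name=\"Book2\" type=\"nonfiction\" />" +
            "<book id=\"1\" name=\"Book1\" type=\"FICTION\" />" +
            "</books></library>";

        Console.WriteLine(CountBooksOfType(xml, "Fiction"));
    }
}
```

Note that translate() with these two alphabets only folds the letters A to Z, so accented or non-Latin characters are still compared case-sensitively.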

Blog, Blog Help, BlogML, Wordpress, XML

BlogML importer for WordPress 2.5

I had a blog server which I felt compelled to migrate to WordPress 2.5. The existing blog server can only emit its data in the BlogML format, so I set out to find a BlogML importer for WordPress 2.5. I stumbled upon Aaron’s post on importing to WordPress. Here is what needs to be done:

  • Download the WordPress BlogML import module from Aaron’s site (blogml_wp_2.3.zip)
  • Upload a copy of the BlogML.php file to your wp-admin/import folder
  • Edit the file and change the PATH_TO_CURRENT_DIRECTORY to point to the folder where import.php exists.
  • Download the PHP XPath library 3.5 from Sourceforge
  • Upload a copy of the XPath.class.php file to your wp-admin/import folder

We are all set now. Go to your WordPress admin page and navigate to Manage -> Import. If all is well, you should see an entry for BlogML. Click on the link and follow the directions to import the BlogML file into your WordPress blog.

PS: If the “Upload file and import” button does not appear after you click the BlogML link on the import page, don’t worry. Open the BlogML.php file and change the following lines:

// Instantiate and register the importer
include_once('import.php');
if(function_exists('register_importer')) {
    $blogml_import = new BlogML_Import();
    register_importer('blogml', 'BlogML', ('Import posts, comments, users, and categories from a BlogML file'), array ($blogml_import, 'dispatch'));
}

to

// Instantiate and register the importer
//include_once('import.php');
//if(function_exists('register_importer')) {
    $blogml_import = new BlogML_Import();
    register_importer('blogml', 'BlogML', ('Import posts, comments, users, and categories from a BlogML file'), array ($blogml_import, 'dispatch'));
//}

 

Many thanks to Aaron for the plugin.

.net, C#, Code, Microsoft, Standards, Technical, XML

Strip Illegal XML Characters based on W3C standard

The W3C has defined which characters are illegal in XML. You can find the details here:

XML 1.0: http://www.w3.org/TR/2006/REC-xml-20060816/#charsets | XML 1.1: http://www.w3.org/TR/xml11/#charsets

Here is a function to remove these characters from a specified XML file:

using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace XMLUtils
{
    class Standards
    {
        /// <summary>
        /// Strips hexadecimal character codes (of the form #xNN) that fall in the
        /// ranges the W3C marks as disallowed or discouraged, rewriting the file in place.
        /// Refer to http://www.w3.org/TR/xml11/#charsets for XML 1.1
        /// Refer to http://www.w3.org/TR/2006/REC-xml-20060816/#charsets for XML 1.0
        /// </summary>
        /// <param name="filePath">Full path to the file</param>
        /// <param name="XMLVersion">XML specification to use. Can be "1.0" or "1.1"</param>
        public static void StripIllegalXMLChars(string filePath, string XMLVersion)
        {
            //Remove illegal character sequences
            string tmpContents = File.ReadAllText(filePath, Encoding.UTF8);

            string pattern = String.Empty;
            switch (XMLVersion)
            {
                case "1.0":
                    // Intended ranges: xFFFE/xFFFF in planes 1-16, xFDD0-xFDDF, x7F, x80-x84, x86-x9F
                    pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]|9[0-9A-F])";
                    break;
                case "1.1":
                    pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|[19][0-9A-F]|7F|8[0-46-9A-F]|0?[1-8BCEF])";
                    break;
                default:
                    throw new ArgumentException("Error: Invalid XML Version!", "XMLVersion");
            }

            Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
            if (regex.IsMatch(tmpContents))
            {
                tmpContents = regex.Replace(tmpContents, String.Empty);
                File.WriteAllText(filePath, tmpContents, Encoding.UTF8);
            }
            tmpContents = string.Empty;
        }
    }
}
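
The function above works on character codes written out as text. As a complementary sketch, and not part of the original function, raw characters that XML 1.0 does not allow can also be filtered out of an in-memory string with XmlConvert.IsXmlChar, which ships with .NET Framework 4.0 and later. The class name XmlCharFilter and the method RemoveInvalidXmlChars are hypothetical names chosen for this example.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlCharFilter
{
    // Hypothetical helper: returns a copy of the input without characters
    // that fall outside the XML 1.0 Char production.
    public static string RemoveInvalidXmlChars(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        var result = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (char.IsSurrogate(c))
            {
                // Keep well-formed surrogate pairs (U+10000 and above); drop lone surrogates.
                if (i + 1 < input.Length && char.IsSurrogatePair(c, input[i + 1]))
                {
                    result.Append(c).Append(input[i + 1]);
                    i++;
                }
            }
            else if (XmlConvert.IsXmlChar(c))
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        // U+0001 is not a legal XML 1.0 character; tab (U+0009) is.
        Console.WriteLine(RemoveInvalidXmlChars("a\u0001b\tc"));
    }
}
```

This sketch only enforces the XML 1.0 rules; it does not touch characters that are legal but discouraged, such as U+007F, so it is not a drop-in replacement for the regular expressions above.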