Project Description
A library to convert simple or advanced HTML to a plain OpenXml document.

Supported Html tags

Refer to the w3schools tag list for the meaning of each tag:
  • <a>
  • <h1> to <h6>
  • <abbr> and <acronym>
  • <b>, <i>, <u>, <s>, <del>, <ins>, <em>, <strike>, <strong>
  • <br> and <hr>
  • <img>
  • <table>, <td>, <tr>, <th>, <tbody>, <thead> and <caption>
  • <cite>
  • <div>, <span>, <font> and <p>
  • <pre>
  • <sub> and <sup>
  • <ul>, <ol> and <li>
  • <dd> and <dt>
Javascript (<script>), CSS (<style>), <meta> and other unsupported tags do not generate an error; they are simply ignored.
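
As an illustration of what "ignored" can mean in practice, here is a minimal Python sketch that filters markup against the tag list above. It is not the library's code: the names SUPPORTED_TAGS and strip_unsupported are hypothetical, and dropping the content of <script> and <style> along with the tags themselves is an assumption made for this example.

```python
import re

# Tags taken from the "Supported Html tags" list above.
SUPPORTED_TAGS = {
    "a", "h1", "h2", "h3", "h4", "h5", "h6", "abbr", "acronym",
    "b", "i", "u", "s", "del", "ins", "em", "strike", "strong",
    "br", "hr", "img",
    "table", "td", "tr", "th", "tbody", "thead", "caption",
    "cite", "div", "span", "font", "p", "pre", "sub", "sup",
    "ul", "ol", "li", "dd", "dt",
}

# Assumption: script and style blocks are dropped together with their content.
_DROP_WITH_CONTENT = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
# Any opening, closing, or self-closing tag; group 1 is the tag name.
_TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_unsupported(html):
    """Remove unsupported tags without raising, keeping the text they wrap."""
    html = _DROP_WITH_CONTENT.sub("", html)

    def keep_or_drop(match):
        # Keep the tag verbatim if it is supported, otherwise drop the tag only.
        return match.group(0) if match.group(1).lower() in SUPPORTED_TAGS else ""

    return _TAG.sub(keep_or_drop, html)
```

For example, strip_unsupported('<p>Hi <blink>there</blink></p>') keeps the <p> element and the text, and silently removes the <blink> tags.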

Tolerance for badly formed HTML

The HTML is parsed using a custom Regex-based enumerator, which tolerates the following:
  • Ignore case: <span>Some text</SPAN>
  • Missing closing tag or invalid tag position: <i>Here<b> is </i> some</b> bad formed html.
  • No need to be XHTML compliant: both <br> and <br/> are valid
  • Color: red, #ff0000 and ff0000 are all the red color
  • Attributes: <table id=table1> or <table id="table1">
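
The tolerance rules above can be sketched with a small regex-based tokenizer. This is a Python illustration under stated assumptions, not the library's actual enumerator: tokenize, styled_runs, normalize_color and the tiny color table are all hypothetical names introduced for this example.

```python
import re

# Opening, closing, or self-closing tag; the tag name may be in any case.
_TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)((?:\s[^<>]*?)?)\s*/?>")
# name="value", name='value', or unquoted name=value.
_ATTR = re.compile(r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""")
_VOID = {"br", "hr", "img"}  # tags that never take a closing tag
_NAMED_COLORS = {"red": "ff0000", "green": "008000", "blue": "0000ff"}  # sample only


def tokenize(html):
    """Yield ("text", str), ("open", name, attrs) or ("close", name) tuples."""
    pos = 0
    for m in _TAG.finditer(html):
        if m.start() > pos:
            yield ("text", html[pos:m.start()])
        name = m.group(2).lower()  # case-insensitive tag names
        if m.group(1):
            yield ("close", name)
        else:
            attrs = {
                a.group(1).lower(): a.group(2) or a.group(3) or a.group(4) or ""
                for a in _ATTR.finditer(m.group(3))
            }
            yield ("open", name, attrs)  # <br> and <br/> both end up here
        pos = m.end()
    if pos < len(html):
        yield ("text", html[pos:])


def styled_runs(html):
    """Yield (text, active_tags) pairs, tolerating mis-nested or unclosed tags."""
    active = []
    for tok in tokenize(html):
        if tok[0] == "text":
            yield (tok[1], frozenset(active))
        elif tok[0] == "open" and tok[1] not in _VOID:
            active.append(tok[1])
        elif tok[0] == "close" and tok[1] in active:
            # Close the most recent matching tag, wherever it sits in the list.
            idx = len(active) - 1 - active[::-1].index(tok[1])
            del active[idx]


def normalize_color(value):
    """Return a six-digit lowercase hex color without '#', or None if unknown."""
    v = value.strip().lower()
    if v in _NAMED_COLORS:
        return _NAMED_COLORS[v]
    if v.startswith("#"):
        v = v[1:]
    return v if re.fullmatch(r"[0-9a-f]{6}", v) else None
```

The sketch tracks active tags in a list rather than a strict stack, so a closing tag in the wrong position only ends its own style, and a tag that is never closed stays active until the end of the input. Stray closing tags with no matching opening tag are skipped.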

Dependencies

Uses the OpenXml SDK 2.0.

Documentation

Don't forget to visit the documentation and send me your feedback!
Last edited Mar 31 2010 at 10:29 AM by onizet, version 9