Tuesday, September 23, 2008

RSS XSL Stripping HTML

by Phil 'iwonder' Guerra

I get a fair number of questions about how to handle transformations, whether for custom XML data islands or RSS newsfeeds. One question near the top of the list is "How can I get rid of all the HTML in the description of an RSS newsfeed?" The answer involves a bit of study of the source feed, followed by some tweaking and testing of a custom XSL.

Searching the web, I located several examples of XSL templates that I could tweak into a flexible solution that integrates well with the DotNetNuke framework, either through the News module or the XML/XSL module. The code that worked best for me came from this URL:

http://code.techinterviews.com/xslt-to-strip-html/26

Stripping HTML Template

<xsl:template name="strip_HTML">
  <xsl:param name="value"/>
  <xsl:choose>
    <xsl:when test="contains($value,'&lt;')">
      <!-- Emit the text that precedes the next tag -->
      <xsl:value-of select="substring-before($value,'&lt;')" disable-output-escaping="yes"/>
      <!-- Recurse on whatever follows that tag's closing bracket.
           If the tag is never closed, output stops here. -->
      <xsl:if test="contains(substring-after($value,'&lt;'),'&gt;')">
        <xsl:call-template name="strip_HTML">
          <xsl:with-param name="value" select="substring-after(substring-after($value,'&lt;'),'&gt;')"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:when>
    <xsl:otherwise>
      <!-- No tags left: emit the remainder unchanged -->
      <xsl:value-of select="$value" disable-output-escaping="yes"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>


Starting from my basic XSL, I added the strip_HTML template and tweaked it until it worked the way I needed. Now it's just a matter of calling the template wherever I want to use it. In this case, I want to strip the HTML from the <description> element, so the call looks like this:


<xsl:call-template name="strip_HTML">
  <xsl:with-param name="value" select="description" />
</xsl:call-template>
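
To show where that call sits in practice, here is a minimal sketch of a complete stylesheet. It assumes a standard RSS 2.0 feed (rss/channel/item); the list markup and the output settings are my own illustrative choices, not something the News or XML/XSL module requires. The copy of strip_HTML included here looks for the closing bracket only after the opening bracket it just found, so a stray '>' in plain text is not mistaken for the end of a tag.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: HTML output; adjust to suit the hosting module -->
  <xsl:output method="html" indent="yes"/>

  <!-- Assumption: RSS 2.0 layout, i.e. rss/channel/item -->
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="rss/channel/item">
        <li>
          <a href="{link}">
            <xsl:value-of select="title"/>
          </a>
          <br/>
          <!-- The description is passed through strip_HTML
               instead of being written out directly -->
          <xsl:call-template name="strip_HTML">
            <xsl:with-param name="value" select="description"/>
          </xsl:call-template>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

  <!-- Recursively removes anything between an opening and
       a closing angle bracket from the given string -->
  <xsl:template name="strip_HTML">
    <xsl:param name="value"/>
    <xsl:choose>
      <xsl:when test="contains($value,'&lt;')">
        <!-- Text before the next tag -->
        <xsl:value-of select="substring-before($value,'&lt;')"
                      disable-output-escaping="yes"/>
        <!-- Continue after the tag, if the tag is closed at all -->
        <xsl:if test="contains(substring-after($value,'&lt;'),'&gt;')">
          <xsl:call-template name="strip_HTML">
            <xsl:with-param name="value"
              select="substring-after(substring-after($value,'&lt;'),'&gt;')"/>
          </xsl:call-template>
        </xsl:if>
      </xsl:when>
      <xsl:otherwise>
        <!-- No tags left -->
        <xsl:value-of select="$value" disable-output-escaping="yes"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, an item whose description holds the escaped markup &lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt; should render as the plain text "Hello world" beneath its linked title.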

