Tuesday, June 05, 2007

Regular Expressions for Working with HTML

Here are some methods for working with HTML that I wrote today. They are pretty simple and don't catch issues like mismatched opening and closing HTML tags, but they work fine for me :)

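using System.Collections.Specialized;
using System.Text.RegularExpressions;

// Removes everything that looks like an HTML tag and returns the remaining text.
private static string GetStringWithoutHtmlTags(string s)
{
    // "<", then any run of characters other than ">", then ">".
    Regex htmlTagsRegex = new Regex(@"<[^>]*>");
    return htmlTagsRegex.Replace(s, "");
}

// Collects the single-word text found between an opening tag and the closing tag after it.
private static StringCollection GetTextInAllHtmlTags(string s)
{
    StringCollection resultStringCollection = new StringCollection();
    // The named group "innerHtml" captures the text between the tags.
    // The closing tag name is not compared with the opening one.
    Regex htmlTagsRegex = new Regex(@"<\w+>(?<innerHtml>\w+)</\w+>");
    MatchCollection mCollection = htmlTagsRegex.Matches(s);
    foreach (Match m in mCollection)
    {
        resultStringCollection.Add(m.Groups["innerHtml"].Value);
    }
    return resultStringCollection;
}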
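
For anyone who wants to try the idea quickly, here is a rough, self-contained sketch. The class HtmlRegexDemo and the helpers StripTags and FirstMatchedInnerText are hypothetical names made up for this illustration, and both patterns are assumptions rather than the only way to do it. The second helper also shows one possible way around the mismatched-tag limitation mentioned above: a named backreference (\k<tag>) makes the closing tag name repeat the opening one.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names here are illustrative only.
public static class HtmlRegexDemo
{
    // Strips anything that looks like a tag: "<", non-">" characters, ">".
    public static string StripTags(string s)
    {
        return Regex.Replace(s, @"<[^>]*>", "");
    }

    // Returns the single-word text of the first element whose closing tag
    // name equals its opening tag name, or an empty string if none is found.
    public static string FirstMatchedInnerText(string s)
    {
        // (?<tag>\w+) captures the opening tag name; \k<tag> requires the
        // same name again in the closing tag.
        Match m = Regex.Match(s, @"<(?<tag>\w+)>(?<innerHtml>\w+)</\k<tag>>");
        return m.Success ? m.Groups["innerHtml"].Value : string.Empty;
    }

    public static void Main()
    {
        // prints "Hello world"
        Console.WriteLine(StripTags("<b>Hello</b> <i>world</i>"));

        // "<b>one</i>" is skipped because the tag names differ; prints "two"
        Console.WriteLine(FirstMatchedInnerText("<b>one</i> <i>two</i>"));
    }
}
```

Like the methods above, this only handles tags without attributes and inner text that is a single word, so it is a sketch for simple input rather than a general HTML parser.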
