Parsing HTML pages

"MisterKen" <Mister***@discussions.microsoft.com> wrote:

If I have the HTML from a web page loaded into a string, how would I use regex to return sections from within that HTML string?

I want to be able to get the "text" back between two different tags. Basically, I want to scrape some web pages and populate a database.

Does anybody have a snippet of code that could help me get the "text"?

"Nick Hounsome" <nh***@nickhounsome.me.uk> wrote:

Is it XHTML? If so, you can just read it as an XmlDocument.

Have a look at regexlib.com; they have several expressions that you can modify.

--
Warm Regards,
Alvin Bruney [MVP ASP.NET]
Blog: http://msmvps.com/blogs/Alvin/
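
For anyone landing on this thread later, here is a rough sketch of both suggestions in C#: a regex that pulls the text between an opening and closing tag, and the XmlDocument route for input that really is well-formed XHTML. The class and method names (HtmlTextExtractor, TextBetweenTags, TextFromXhtml) and the sample markup are made up for illustration; nobody in the thread posted them. The regex version assumes the tag is not nested inside itself, and the XML version assumes the input parses as XML without needing a DTD (named entities such as &nbsp; would likely need extra handling).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper class, not part of any library.
public static class HtmlTextExtractor
{
    // Regex approach: returns whatever sits between <tagName ...> and </tagName>.
    // Inner markup is returned as-is; nested tags of the same name are not handled.
    public static List<string> TextBetweenTags(string html, string tagName)
    {
        string name = Regex.Escape(tagName);
        // \b keeps "p" from matching "pre"; [^>]* skips attributes;
        // (.*?) lazily captures up to the first matching closing tag.
        string pattern = "<" + name + @"\b[^>]*>(.*?)</" + name + @"\s*>";

        var results = new List<string>();
        // Singleline lets '.' span line breaks; IgnoreCase matches <P> and <p>.
        foreach (Match m in Regex.Matches(html, pattern,
                     RegexOptions.IgnoreCase | RegexOptions.Singleline))
        {
            results.Add(m.Groups[1].Value.Trim());
        }
        return results;
    }

    // XmlDocument approach: only works if the input is well-formed XML (XHTML).
    // LoadXml throws XmlException on ordinary tag-soup HTML.
    public static List<string> TextFromXhtml(string xhtml, string tagName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml);

        var results = new List<string>();
        foreach (XmlNode node in doc.GetElementsByTagName(tagName))
        {
            // InnerText concatenates the text of all child nodes, dropping markup.
            results.Add(node.InnerText.Trim());
        }
        return results;
    }

    public static void Main()
    {
        // Sample markup invented for this example.
        string page = "<html><head><title>Demo</title></head>" +
                      "<body><p class=\"a\">First</p><p>Second</p></body></html>";

        Console.WriteLine(string.Join(", ", TextBetweenTags(page, "p")));
        Console.WriteLine(string.Join(", ", TextFromXhtml(page, "title")));
    }
}
```

The regex route is fine for pulling a few known fields out of pages whose layout you control or have inspected, but it gets fragile quickly on arbitrary HTML. If the pages are XHTML, the XmlDocument route is usually the more robust of the two, since it also strips any inner markup for you.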