Darren Mothersele

Software Developer

Warning: You are viewing old, legacy content. Kept for posterity. Information is out of date. Code samples probably don't work. My opinions have probably changed. Browse at your own risk.

SimpleHTMLDOM Parser for Drupal

Aug 17, 2011

web-dev

Import content from other webpages using Feeds and some HTML DOM magic. Importing HTML content is a task that comes up now and again, and I wanted a more generic way of doing it, so I've created a module called SimpleHTMLDOM Parser and published it on Drupal.org. It is useful for many things, including monitoring sites that don't support RSS, importing legacy content into a Drupal site, and screen scraping. Read on for documentation...

Installation

Example Configuration

Example extracting a title from each matched list item:
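The idea behind this kind of mapping can be sketched outside Drupal. The snippet below is only an illustration in Python using the standard library, not the module's code or its configuration: the class `ListItemTitles`, the function `extract_titles`, and the sample markup are all hypothetical, and it assumes each list item carries its title in its first link.

```python
from html.parser import HTMLParser


class ListItemTitles(HTMLParser):
    """Collect the text of the first <a> found inside each <li> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_item = False    # currently inside an <li>
        self._in_link = False    # currently inside the link being captured
        self._item_done = False  # a title was already taken for this <li>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._item_done = False
        elif tag == "a" and self._in_item and not self._item_done:
            self._in_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.titles.append("".join(self._buffer).strip())
            self._in_link = False
            self._item_done = True
        elif tag == "li":
            self._in_item = False


def extract_titles(html):
    """Return one title per matched list item."""
    parser = ListItemTitles()
    parser.feed(html)
    return parser.titles


# Made-up sample markup, standing in for the page being imported.
SAMPLE_HTML = """
<ul>
  <li><a href="/news/1">First story</a> <span>17 Aug</span></li>
  <li><a href="/news/2">Second story</a></li>
</ul>
"""

titles = extract_titles(SAMPLE_HTML)
```

Each matched item yields one value, which is the shape a feed importer needs in order to map the result onto a title field.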

Example extracting an image from each matched item. Notice that in this case a prefix is added to make a full URL out of the returned value:
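As a rough illustration of the prefix step, here is a small Python sketch using only the standard library. It is not the module's implementation: `ItemImages`, `extract_image_urls`, the sample markup, and the `http://example.com` prefix are all assumptions made up for the example. It takes the `src` of the first image in each list item and prepends the prefix by plain string concatenation.

```python
from html.parser import HTMLParser


class ItemImages(HTMLParser):
    """Collect the src of the first <img> inside each <li> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []
        self._in_item = False  # currently inside an <li>
        self._found = False    # an image was already taken for this <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._found = False
        elif tag == "img" and self._in_item and not self._found:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)
                self._found = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False


def extract_image_urls(html, prefix):
    """Return the image path of each item with the prefix prepended."""
    parser = ItemImages()
    parser.feed(html)
    return [prefix + src for src in parser.sources]


# Made-up sample markup with site-relative image paths.
SAMPLE_HTML = """
<ul>
  <li><img src="/files/a.png" alt="A"> Item A</li>
  <li><img src="/files/b.png" alt="B" /> Item B</li>
</ul>
"""

image_urls = extract_image_urls(SAMPLE_HTML, "http://example.com")
```

Plain concatenation only works when every returned value is a site-relative path; pages that mix relative and absolute image URLs would need something like `urllib.parse.urljoin` instead.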

This is only basic documentation for now, but let me know if you have any questions or interesting usage ideas for this module. If you have any problems, please raise them in the issue queue via the project page on drupal.org.