Both of these packages let you extract the data you care about from among the thousands of tags you do not. As the name suggests, scalpel is very good at cutting out very specific portions of interest, while parsec-tagsoup gives you full control over how every single tag is handled, at the cost of more boilerplate.
This short post will not go into detail on how to use these two packages. Instead, we look at two constructed scenarios and how each can be solved with both of the aforementioned packages.
The inputs are very basic, stripped-down plain HTML files.
Scalpel really shines when we want to extract data that shares common attributes and may be scattered all over the HTML source. The really nice part is that it relieves the programmer of having to write all the open/close tag boilerplate.
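To make the "common attributes" case concrete, here is a minimal sketch using scalpel's `scrapeStringLike`, `texts`, `@:` and `hasClass`. The input string and the `price` class are made up for illustration and are not one of the scenarios from this post.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.HTML.Scalpel

-- Hypothetical input: the values of interest sit at different depths of the
-- document, but they all share class="price".
page :: String
page =
  "<div><span class=\"price\">10</span><p>noise</p></div>"
    ++ "<ul><li><span class=\"price\">25</span></li></ul>"

-- Collect the inner text of every <span class="price">, wherever it occurs.
-- No open/close tag bookkeeping is needed; the selector does the work.
prices :: Maybe [String]
prices = scrapeStringLike page (texts ("span" @: [hasClass "price"]))

main :: IO ()
main = print prices
```

The selector only states what the interesting tags have in common; the surrounding `div`, `ul` and `li` tags never have to be mentioned.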
Parsec-tagsoup, backed by the powerful parsec library, is at its strongest when dealing with very detailed, localized data, or when conditional parsing is required.
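The idea of conditional parsing over a tag stream can be sketched as follows. Note the assumption: this sketch does not use parsec-tagsoup's own combinators. It builds equivalent helpers (`satisfyTag`, `openTag`, `closeTag`, `tagText`, all hypothetical names) directly on top of parsec's `tokenPrim` and tagsoup's `parseTags`, so the names in the real package may differ. The table row input is also made up.

```haskell
import Text.HTML.TagSoup (Tag (..), parseTags)
import Text.Parsec

-- A parsec parser whose tokens are tagsoup tags instead of characters.
type TagParser = Parsec [Tag String] ()

-- Accept a single tag if the function returns Just.
-- Source positions are not tracked in this sketch.
satisfyTag :: (Tag String -> Maybe a) -> TagParser a
satisfyTag = tokenPrim show (\pos _ _ -> pos)

-- Match an opening tag by name and return its attributes.
openTag :: String -> TagParser [(String, String)]
openTag name = satisfyTag $ \t -> case t of
  TagOpen n attrs | n == name -> Just attrs
  _ -> Nothing

-- Match a closing tag by name.
closeTag :: String -> TagParser ()
closeTag name = satisfyTag $ \t -> case t of
  TagClose n | n == name -> Just ()
  _ -> Nothing

-- Match a text node and return its content.
tagText :: TagParser String
tagText = satisfyTag $ \t -> case t of
  TagText s -> Just s
  _ -> Nothing

-- Conditional parsing: a cell holds either a link (keep the href, Left)
-- or plain text (keep the text, Right).
cell :: TagParser (Either String String)
cell = openTag "td" *> (link <|> plain) <* closeTag "td"
  where
    link = do
      attrs <- openTag "a"
      _ <- tagText
      closeTag "a"
      maybe (fail "<a> without href") (pure . Left) (lookup "href" attrs)
    plain = Right <$> tagText

-- A row is a sequence of cells wrapped in <tr> ... </tr>.
row :: TagParser [Either String String]
row = openTag "tr" *> many cell <* closeTag "tr"

main :: IO ()
main =
  print (parse row "" (parseTags "<tr><td><a href=\"/x\">x</a></td><td>plain</td></tr>"))
```

Every open and close tag is spelled out by hand, which is the boilerplate mentioned earlier, but in exchange the parser can branch on exactly what it finds inside each cell.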