4/29 I need help getting information off a web site. A page presents
information about an item in locations spread throughout the page.
Each page presents information about one item. What is a quick and
easy way to go through several pages, capture all the information
related to each item, and put them into a spreadsheet with a unique
index? I think this might be possible by scraping the screen, but how
does one go about this from a Windows workstation (with no app
servers)? Would it be easier to record a bunch of copy and paste
actions with automation / macro recording software and replay the
macro?
\_ On a Windows machine with .NET you can simply write
the whole thing in a couple lines of C#. They've even
got a snarf util in the O'Reilly book.
\_ perl. -tom
\_ Typical Tom answer. Tom, when you don't know much about
something, why don't you leave it for others to answer?
\_ what do you mean? perl is a fine solution. -tom
\_ WWW::Mechanize is a valid suggestion. "Use perl" is a
step away from "write a program." Sad you can't see
this.
\_ If you know anything about perl, you know
there's more than one way to do it. I wouldn't
use WWW::Mechanize, though that's certainly a
reasonable approach. -tom
\_ more specifically, WWW::Mechanize is useful -dwc
\_ Python's urllib module also does this quite easily. -scottyg
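\_ For what it's worth, here's a rough sketch of the urllib approach
in Python (urllib.request in Python 3; the URL pattern, item ids,
and field regexes below are made-up placeholders, so substitute
the real ones for the site in question). It fetches each item
page, pulls the fields out with regexes, and writes one CSV row
per item with a running index:

    # Sketch only: ITEM_IDS, URL, and FIELDS are placeholders for
    # the actual pages being scraped.
    import csv
    import re
    import urllib.request

    ITEM_IDS = [101, 102, 103]                    # placeholder item ids
    URL = "http://example.com/item?id=%d"         # placeholder URL pattern
    FIELDS = {                                    # placeholder field regexes
        "name":  re.compile(r"<h1>(.*?)</h1>", re.S),
        "price": re.compile(r"Price:\s*\$([\d.]+)"),
    }

    with open("items.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["index"] + list(FIELDS))
        for idx, item_id in enumerate(ITEM_IDS, start=1):
            page = urllib.request.urlopen(URL % item_id)
            html = page.read().decode("utf-8", "replace")
            row = [idx]
            for pattern in FIELDS.values():
                m = pattern.search(html)
                row.append(m.group(1).strip() if m else "")
            writer.writerow(row)

Running it produces items.csv with a header row plus one indexed
row per item, which opens directly in a spreadsheet.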