Talk:Web scraping
This is the talk page for discussing improvements to the Web scraping article. This is not a forum for general discussion of the article's subject.
Archives: 1 · Auto-archiving period: 2 months
Internet: C‑class, High‑importance
Computing: C‑class, Mid‑importance
legal issues
I don't think this article's discussion of the legalities of scraping is correct, and I'm disputing its neutrality. The DMCA prohibits using technical measures to bypass an effective access control measure. A robot acting like a browser bypasses no effective measures in doing so, and thereby doesn't fall afoul of the DMCA. Also, redistributing copyrighted material is illegal regardless of whether the DMCA is invoked.
Furthermore, not all material gotten through screen-scraping is copyrighted. Consider the case of a site that displayed film showtimes. The showtimes themselves are not copyrighted any more than the numbers in a phone book are, and therefore can be used by whoever scrapes them without fear of copyright infringement. Wholesale copying of content is illegal, yes, but it's not an issue specific to "web scraping."
Also, performing an action that violates a site's terms of use is not illegal. It merely violates the terms of use, not any law. It's not even a breach of contract, since the user doesn't have to read the terms, much less agree to them, in order to use the site.
Also, I demand a citation for the "courts have held" claim. I find it unlikely, though not entirely impossible. — Preceding unsigned comment added by Quotemstr (talk • contribs) 03:26, July 27, 2007 (UTC)
legal issues section reworked
The legal issues section made several bold and unsourced claims that could be interpreted as scare-mongering. Can someone check out the reworked section? —The preceding unsigned comment was added by Quotemstr (talk • contribs) 00:06:14, August 20, 2007 (UTC).
Legal issues again
I'm not starting an edit war, I swear. :-)
First of all, I cleaned up and normalized the references a bit, and made some minor phrasing changes that shouldn't be controversial.
I removed the section about legal action occurring out of the public eye. That information isn't only unsourced: it's unverifiable.
The court cases cited in the article hardly count as defeats. In the Ticketmaster case, the court held that the particular instance of scraping mentioned was not a trespass. In the other cases listed, the claim was for a preliminary injunction only. As I understand it, a preliminary injunction does not set case law, and should not be considered with the same weight as a final decision.
As for the aggregate damage section -- is there a specific source? Maybe I just missed it.
I don't see how the DMCA is relevant here either; the cases mentioned in the previous version seem to be covered by normal copyright law. A scraper doesn't necessarily have to circumvent any access restrictions in place on a site, considering that it can simply act like a browser. Also, doesn't the DMCA specifically allow circumvention for interoperability? —The preceding unsigned comment was added by Quotemstr (talk) (contribs) 02:30, August 21, 2007 (UTC)