| Member |
Message |
joe_vimal
Joined: Mar 22, 2001
# Posts: 104
|
Posted: 2005-Sep-30 08:32
Has anyone heard of a scraper script that extracts job ads from many sites and dumps them into a database?
I can find tons of other scripts in all the usual places, but not one along these lines. Writing a script from scratch seems daunting. Any help would be much appreciated.
|
 |
masidani
Joined: Oct 21, 2005
# Posts: 10
|
Posted: 2005-Oct-24 08:24
Joe,
I think it would have to be custom-written, I'm afraid. The reason is that the script needs to know how the HTML on each of the sites is written in order to know where to find the job data in it.
If you visit each of the job sites yourself and look at the HTML source code, you'll see that each one is different. The "screen scraper" program will need to know where to look in each page to find things like job title, salary, location etc., which will be different in each case. Hence it will need to be custom-written.
That said, a Perl program using the LWP::UserAgent module and a few regular expressions will suffice, so long as there are no login/registration procedures etc. that need to be dealt with.
Simon
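To make the point about per-site customisation concrete, here is a minimal sketch of the approach described above. It is written in Python rather than Perl, and everything in it is an assumption for illustration: the site names (`site-a.example`, `site-b.example`), the sample markup, the field names and the `jobs` table are all hypothetical. Fetching the pages (the job LWP::UserAgent would do in Perl, or `urllib.request.urlopen` in Python) is left out so the example runs offline.

```python
import re
import sqlite3

# Hypothetical per-site patterns. Each site lays out its HTML differently,
# so each one needs its own hand-written regular expression.
SITE_PATTERNS = {
    "site-a.example": re.compile(
        r'<div class="job">\s*<h2>(?P<title>.*?)</h2>\s*'
        r'<span class="loc">(?P<location>.*?)</span>\s*'
        r'<span class="pay">(?P<salary>.*?)</span>',
        re.S,
    ),
    "site-b.example": re.compile(
        r'<tr>\s*<td>(?P<title>.*?)</td>\s*'
        r'<td>(?P<salary>.*?)</td>\s*'
        r'<td>(?P<location>.*?)</td>\s*</tr>',
        re.S,
    ),
}

# Made-up sample pages standing in for downloaded HTML.
SAMPLE_A = """
<div class="job">
  <h2>Perl Developer</h2>
  <span class="loc">London</span>
  <span class="pay">40,000</span>
</div>
"""

SAMPLE_B = """
<table>
<tr><td>Web Designer</td><td>30,000</td><td>Leeds</td></tr>
<tr><td>DBA</td><td>45,000</td><td>Bristol</td></tr>
</table>
"""


def extract_jobs(site, html):
    """Return one dict per job ad found in html, using that site's pattern."""
    pattern = SITE_PATTERNS[site]
    jobs = []
    for match in pattern.finditer(html):
        job = {key: value.strip() for key, value in match.groupdict().items()}
        job["site"] = site
        jobs.append(job)
    return jobs


def store_jobs(conn, jobs):
    """Insert the extracted jobs into a simple jobs table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs "
        "(site TEXT, title TEXT, location TEXT, salary TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (:site, :title, :location, :salary)", jobs
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    pages = [("site-a.example", SAMPLE_A), ("site-b.example", SAMPLE_B)]
    for site, html in pages:
        store_jobs(conn, extract_jobs(site, html))
    for row in conn.execute("SELECT site, title, location, salary FROM jobs"):
        print(row)
```

Note that the two patterns share nothing except the field names, which is why a generic off-the-shelf script is hard to come by. Regular expressions are also fragile: a small change to a site's markup will silently break its pattern, so a real script would probably want an HTML parser and some checking that each page still yields results.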
|
 |
joe_vimal
Joined: Mar 22, 2001
# Posts: 104
|
Posted: 2005-Oct-26 16:02
Thanks Simon. I was afraid I would have to start from the beginning. There are other issues too. Will I be infringing some copyright law if the script scrapes a couple of lines from many sites?
|
 |
bhartzer
Staff
Joined: Jun 08, 2000
# Posts: 7042
|
Posted: 2005-Oct-26 17:44
"Will I be infringing some copyright law"
Yes.
|
 |
joe_vimal
Joined: Mar 22, 2001
# Posts: 104
|
Posted: 2005-Oct-27 08:22
Thanks bhartzer. I knew something like this would happen. OK. I have read somewhere that if you quote a couple of lines from another site on your own site and give appropriate credit, you will not be hauled up for violation of copyright. Is this true?
I am sorry I am asking this in a Perl forum.
|
 |
lizardz
Joined: Nov 12, 2004
# Posts: 1394
|
Posted: 2005-Oct-27 20:00
Use of a few lines is fair use, I believe; that's not copyright infringement.
That's why you can quote from somebody's article, for example, but not duplicate the whole article.
|
 |
excell
Staff
Joined: Mar 19, 2001
# Posts: 14512
|
Posted: 2005-Oct-27 20:04
A scraper script is automation of the process of taking content... yes, I would be careful with what you create.
|
 |
joe_vimal
Joined: Mar 22, 2001
# Posts: 104
|
Posted: 2005-Oct-28 06:43
No way, excell. I perfectly understand and abhor the stealing of content from others. But what I am interested in is this: we want to populate the database of a job site with enough job offers to make the site attractive to job seekers. Our client does not wish to infringe any laws and we won't either.
Scraping a line of content from other sites is perfectly acceptable if you don't overdo it. E.g., for SEO purposes, many people scrape the results pages of search engines:
Results 1 - 100 of about 3,640,000 for 'keyword'
In the same way, we use snippets of information from weather sites too, usually with the express consent of the webmasters.
In our case, even a couple of lines might be frowned upon, as the snippet of information has some commercial value.
I am confused. We don't want to be associated with any route that will even remotely land us in trouble. Losing this client in such a case would be preferable. What is the consensus of the Ladies and Gentlemen here?
|
 |