    Moderator(s): Prowler

    joe_vimal
    Joined: Mar 22, 2001
    # Posts: 104

    Posted: 2008-Sep-24 09:49

    We have been assigned the task of loading some 5000 pages (plain static HTML) into a MySQL database. The idea is to eventually move to a custom-designed CMS that retains the same URL structure. We have to write a Perl script that will extract the meta tags, the body and the URL from each of these pages.

    Do we start from scratch?

    Any pointers would be much appreciated.



    mj1256
    Joined: Jun 05, 2006
    # Posts: 910

    Posted: 2008-Sep-24 14:58

    Why are you building your own when there are so many available that work well?

    Have you looked at
    Drupal
    WordPress
    Joomla

    or some of the high-end commercial ones like Vignette?

    They even have programs that would import those static pages for you.

    The other issue with creating your own is that as your developers come and go, the reasons why things were originally built a certain way will be forgotten, and future work will be difficult.

    Been there, done that.

    Go with an off-the-shelf solution and customize it for your needs.





    Prowler
    Staff
    Joined: Aug 14, 2000
    # Posts: 1788

    Posted: 2008-Sep-25 10:19

    I would have agreed with mj that you can use a ready-made CMS for your task. But on second thought, you will have a problem retaining the same URL structure in the new scheme of things.

    If slurping the contents is your primary problem, it is relatively easy. You can put together a script with Perl modules like HTML::TokeParser. That is just the first one that came to mind; you will find more powerful modules to get the job done.
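
To give a rough idea of what such a script involves, here is a minimal sketch of the token-by-token approach. It is written with Python's standard `html.parser` module rather than Perl's HTML::TokeParser, purely so that it runs without installing anything; the idea carries over. The names `PageExtractor`, `extract_page` and `url_for`, the `site` directory and the `www.example.com` base URL are all made up for illustration, and it assumes each page has one ordinary `<body>` element.

```python
from html.parser import HTMLParser
from pathlib import PurePath


class PageExtractor(HTMLParser):
    """Collects <meta name=... content=...> pairs and the markup inside <body>."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.meta = {}
        self.body_parts = []
        self.in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            # Re-emit the tag exactly as it appeared in the source.
            self.body_parts.append(self.get_starttag_text())
        elif tag == "meta":
            a = dict(attrs)
            if a.get("name") and a.get("content") is not None:
                self.meta[a["name"].lower()] = a["content"]

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags (<br/>, <meta ... />) have no end tag to emit.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.body_parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_body:
            self.body_parts.append(data)

    def handle_entityref(self, name):
        if self.in_body:
            self.body_parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_body:
            self.body_parts.append(f"&#{name};")


def extract_page(html_text):
    """Return the meta tags (as a dict) and the body markup of one page."""
    parser = PageExtractor()
    parser.feed(html_text)
    parser.close()
    return {"meta": parser.meta, "body": "".join(parser.body_parts)}


def url_for(path, root, base_url):
    """Map a file under the document root to its public URL (hypothetical layout)."""
    rel = PurePath(path).relative_to(root).as_posix()
    return base_url.rstrip("/") + "/" + rel
```

You would walk the document root, call `extract_page` on each file's contents and `url_for` on its path, and insert the three values into your table with a parameterised INSERT through whatever MySQL driver you use. Note that this sketch silently drops HTML comments and does nothing about character encodings, so check a sample of real pages before trusting it on all 5000.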



    © 1995  ·  iWeb, Inc  ·  DBA JimWorld Productions