kyodylee
Posts: 36
Joined: Fri Jun 13, 2003 3:15 am

Webscrape for Astronomy Pic of the Day

Post by kyodylee » Fri Oct 29, 2004 6:01 pm

Astronomy Pic of the Day ... http://antwrp.gsfc.nasa.gov/apod/ ... is just about my favorite RSS feed, and now it looks like the feed is no longer being maintained by its third-party source (the site itself does not have a feed) ... http://services.perceive.net/xml/apod_rss10.xml. :(

I tried to use Webscrape to scrape my own feed, but I'm not a programmer and I don't know how to write a script for the page. And since MyRSS no longer works either, well ... no more Pic of the Day. :cry:

I know how busy everyone is, but if there is someone out there who also likes this site and would be willing to write a script for it to use in Webscrape, I would be eternally happy and grateful. :please:

Thanks all. :D

support
Site Admin
Posts: 3032
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Webscrape for Astronomy Pic of the Day

Post by support » Sat Oct 30, 2004 5:06 am

kyodylee wrote: Astronomy Pic of the Day ... http://antwrp.gsfc.nasa.gov/apod/ ... is just about my favorite RSS feed, and now it looks like the feed is no longer being maintained by its third-party source


The WebScrape plugin was written by Allan Wilson and if the thought of these pics as a feed doesn't get him drooling, I don't know what will :-)

It's a shame about myrss.com but I guess they're right, there really is a business opportunity in providing scraped feeds :shock:

abwilson
Posts: 247
Joined: Sun Feb 09, 2003 12:36 am
Location: San Francisco, CA -- USA

Post by abwilson » Sat Oct 30, 2004 6:00 pm

You know me too well! :)

As a matter of fact, once I saw kyodylee's post yesterday, I had a look and fortunately it was easy. However, the page happens to contain a relative URL that spans multiple lines -- which WebScrape as published didn't handle. So, I have updated WebScrape.zip and Taka should be able to make it available to everyone on awasu.com.

I also included my AstronomyPoD.ini in the new WebScrape.zip, but for immediate info, here it is:

Code:

[ChannelParameters]
URL=http://antwrp.gsfc.nasa.gov/apod
BaseURL=http://antwrp.gsfc.nasa.gov/apod/
Title=Astronomy Picture of the Day
Description=Photograph retrieved from Website
MaxItems=1
Shorthand=
SectionPattern=
ItemPattern-1=(?P<D><center>\s+<h1>.*?
ItemPattern-2=<center>\s*<b> (?P<T>.*?) </b> <br>.*?
ItemPattern-3=(?P<L>)<p> <hr>)
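
For readers who want to see what the three ItemPattern lines do, here is a minimal Python sketch. It rests on assumptions that the thread does not confirm: that WebScrape joins the ItemPattern-N values in order into one regular expression, that "." is allowed to match line breaks, and that the named groups D, T and L stand for description, title and link. The sample page is a hypothetical, heavily simplified stand-in for the real APOD markup, and `scrape_item` is an illustrative helper, not part of WebScrape.

```python
import re

# The three ItemPattern-N values from the .ini above, copied verbatim.
ITEM_PATTERN_PARTS = [
    r"(?P<D><center>\s+<h1>.*?",
    r"<center>\s*<b> (?P<T>.*?) </b> <br>.*?",
    r"(?P<L>)<p> <hr>)",
]

# Assumption: the parts are concatenated in order, and "." may match
# newlines (re.DOTALL), since the item spans many lines of HTML.
ITEM_PATTERN = re.compile("".join(ITEM_PATTERN_PARTS), re.DOTALL)

# Hypothetical, simplified stand-in for the APOD page (not the real markup).
SAMPLE_PAGE = """<html><body>
<center>
<h1> Astronomy Picture of the Day </h1>
<a href="image/sample.jpg"><img src="image/sample_small.jpg"></a>
</center>
<center>
<b> A Sample Title </b> <br>
<b> Credit: </b> Someone
</center>
<p> Explanation text.
<p> <hr>
</body></html>"""


def scrape_item(page):
    """Return the named groups of the first match, or None if nothing matches."""
    match = ITEM_PATTERN.search(page)
    if match is None:
        return None
    return {
        "title": match.group("T"),        # text between "<b> " and " </b>"
        "description": match.group("D"),  # the whole matched block of HTML
        "link": match.group("L"),         # empty: the group captures nothing
    }


if __name__ == "__main__":
    item = scrape_item(SAMPLE_PAGE)
    print(item["title"])
```

Taken one line at a time the pattern looks unbalanced, but once joined the parentheses pair up: the D group opens on the first line and closes on the third, so it wraps the entire item, while L is an empty group that always captures an empty string.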


Let me know how you like it!

Guest

Post by Guest » Sun Oct 31, 2004 12:27 am

Allan - YIPPEE!!!! :bowdown: Thank You! Thank You! Thank You! :D

The feed is working! It actually provides more information than the original feed. However, the picture itself isn't downloading, just a placeholder for the picture. I can click on the picture name, though, and it takes me to the website for the picture. Will the new webscrape.zip correct this? Or is my 'puter not doing what it's supposed to do?

Thanks again for such a speedy reply. I think MyRSS has it right. I would definitely pay for a program that did this automatically since I don't know how to write these scripts!

p.s. Allan, check your tip jar! :)

arrg - I wasn't signed in, but it's me kyodylee. :oops:

abwilson
Posts: 247
Joined: Sun Feb 09, 2003 12:36 am
Location: San Francisco, CA -- USA

Post by abwilson » Sun Oct 31, 2004 12:55 am

Great -- glad you like it!

Yes, the updated WebScrape.exe (in the new WebScrape.zip) will indeed fix the image display problem you're having. By the way, thanks for letting me know about a site that happened to reveal a limitation in the way I was handling relative URLs; the new version fixes the problem (and allows your new "feed" to display properly).
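
As an illustration of the kind of problem described here, the following Python sketch resolves a relative URL whose value is broken across lines. It is not WebScrape's actual code; it assumes the line break sits inside the URL text itself, the image path is hypothetical, and `resolve_relative_url` is a made-up helper name. Only the base address comes from the BaseURL setting in AstronomyPoD.ini.

```python
import re
from urllib.parse import urljoin

# BaseURL from AstronomyPoD.ini; note the trailing slash.
BASE_URL = "http://antwrp.gsfc.nasa.gov/apod/"


def resolve_relative_url(base_url, raw_url):
    """Strip whitespace (including line breaks) from raw_url, then
    resolve it against base_url."""
    # Assumption: whitespace inside the scraped value is layout noise,
    # because real spaces in a URL should be percent-encoded.
    cleaned = re.sub(r"\s+", "", raw_url)
    return urljoin(base_url, cleaned)


if __name__ == "__main__":
    # Hypothetical relative URL split over two lines in the page source.
    raw = "image/0410/\n   sample_picture.jpg"
    print(resolve_relative_url(BASE_URL, raw))
```

The trailing slash on the base address matters with standard URL resolution: resolving against "http://antwrp.gsfc.nasa.gov/apod" (no slash) would replace "apod" rather than append to it.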

Also, as Taka expected, I think it's a great page to WebScrape. :D

Allan

support
Site Admin
Posts: 3032
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Post by support » Sun Oct 31, 2004 3:20 am

Um, I was just going to post to let you guys know the new WebScrape is up in the downloads section, but it looks like you're already sorted :-)

Guest

Post by Guest » Sun Oct 31, 2004 5:34 am

Thanks again both Allan and Taka. The updated version works like a charm! Picture perfect! I really can't thank you enough. :D

And of course I had no way of knowing that a little selfish request on my part would actually help reveal a limitation of the program, so it's also very nice to know that some greater good was also accomplished!

Cheers mates! ;)

kyodylee
Posts: 36
Joined: Fri Jun 13, 2003 3:15 am

Post by kyodylee » Sun Oct 31, 2004 5:47 am

I know it's Halloween (and slightly OT) but ... apparently somehow www.awasu.com got put into IE's restricted site list ... I have no idea how, I didn't do it! ... and that is what was causing me not to be able to post under my login name. Then when I fixed the restricted site list, I couldn't edit my own post above! Ghosts and goblins at play with my computer! :twisted: :wink:

abwilson
Posts: 247
Joined: Sun Feb 09, 2003 12:36 am
Location: San Francisco, CA -- USA

Post by abwilson » Mon Nov 01, 2004 2:06 am

I'm also pleased everything seems to be working well with the Astronomy Picture of the Day scraping. You have your "feed" back and we nailed a bug in the process.

Thanks

Allan

kyodylee
Posts: 36
Joined: Fri Jun 13, 2003 3:15 am

Post by kyodylee » Tue Nov 02, 2004 4:55 am

Well, everything was going just fine until today when the Astronomy PoD feed needed to update for the first time and failed to do so. :(

When I tried to reinstall the feed, thinking that might fix the problem, webscrapesettings.exe is giving me the following error message: download failed:403 forbidden. I get the error message on any of the .ini files included with the plug-in that I try to install.

:help:

abwilson
Posts: 247
Joined: Sun Feb 09, 2003 12:36 am
Location: San Francisco, CA -- USA

Post by abwilson » Tue Nov 02, 2004 5:04 am

While it won't make you feel any better, it's still working fine for me. Didn't you mention some strange goings-on with your system and restricted sites?

403 is the HTTP "Forbidden" status: basically, you are not being allowed access.

Taka, any ideas or things to try?

Allan

kyodylee
Posts: 36
Joined: Fri Jun 13, 2003 3:15 am

Post by kyodylee » Tue Nov 02, 2004 8:25 am

Ok, I got the feed working again. :D

The problem was with Zone Alarm not letting webscrape.exe have access to the internet. But I had set ZA to allow webscrape.exe access when I initially installed the feed, and the feed was working, so I'm not sure why or how the setting got changed. :?

Well, now I know to check ZA first if I have another problem. :)

abwilson
Posts: 247
Joined: Sun Feb 09, 2003 12:36 am
Location: San Francisco, CA -- USA

Post by abwilson » Tue Nov 02, 2004 4:26 pm

Well, congratulations on getting things working again.
