jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Full Article Content (Again)

Post by jerrymartin » Mon Mar 09, 2009 6:49 pm

Hi,

I've been reading these forums off and on for weeks now, trying to find out how to do something that seems so simple. I've seen some discussion of it, but the answers seem way more complicated than they need to be. Or the feature just isn't as easily available as it should be.

I want to see the full content of a feed without clicking on the summary and going to the publisher's website. I want to be able to send it via email, FTP, SQL database or whatever, which I can use the plugins to accomplish. But right now the only way I can get to the full content is by clicking on the summary and going to the site. I know that you can download for offline reading and all of that, but what good does that do me if I'm trying to send the full content to an email or database?

I've tried the channel reports to do it, and even though I know the article has been downloaded for offline reading and I have also selected full content, I still don't get it in the report. Am I just being dense here? This shouldn't be that hard.

If I were to do this manually, it would go something like this:

1. Get the Feed
2. Click a link
3. Go to the web page and save it as an MHT or TXT file, or just send it via email or to a database.

In other words, what I want to end up with is pretty much what you get when you print a web page (without printing the background).

Am I totally missing something here?

Help would be appreciated.

jm

support
Site Admin
Posts: 3073
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Full Article Content (Again)

Post by support » Mon Mar 09, 2009 8:36 pm

jerrymartin wrote:Am I totally missing something here?

Not really, there are currently no built-in features that do the things you describe :-)

It's not quite as easy as you might think. Currently, when Awasu generates a channel page or report, *fragments* of HTML (each item) are assembled into a single page. When it downloads linked-to pages, it grabs the *entire* page (as an MHT) and IE doesn't allow these to be embedded into a page. That's why you don't see the full page for an item, even if it has already been downloaded for offline reading. When you click on the item, Awasu gives the browser the MHT it has downloaded, and the browser shows that page (and nothing else).

The easiest way might be to write a channel hook that gets called every time new items arrive, downloads them to an MHT and then you can do whatever you want with it. Alternatively, if you let us know what you're trying to do, I can put something in for the next release.

kevotheclone
Posts: 245
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Re: Full Article Content (Again)

Post by kevotheclone » Tue Mar 10, 2009 7:33 am

Since you added (Again) to your subject I guess you've already seen the previous posts on a similar topic where I wrote a simple Python script that downloads the complete page referenced by the feed item and saves it as a text file, removing all HTML elements.

http://www.awasu.com/forums/viewtopic.php?t=7287

After I wrote the script and tested it, I forgot about it and later wondered why I had all of these text files in the root directory of my C:\ drive. Then I remembered this script. So it does work; in fact, it'll keep working for months on end.

The original poster on that thread wanted raw text to feed into a text-to-speech program, but if you want the raw HTML just remove the regular-expression-based code.
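For readers who have not opened the linked thread, the regular-expression step can be pictured with a minimal Python 3 sketch like the one below. It is not the script from that thread: the function name `html_to_text` and both patterns are assumptions made for illustration, and a regex approach like this is only good enough for rough text extraction.

```python
import html
import re

# Illustrative patterns only; the original hook's expressions are not shown in this thread.
TAG_RE = re.compile(r"<[^>]+>")       # anything that looks like an HTML tag
WHITESPACE_RE = re.compile(r"\s+")    # runs of spaces, tabs and newlines


def html_to_text(markup):
    """Reduce an HTML string to plain text using regular expressions."""
    text = TAG_RE.sub(" ", markup)    # replace each tag with a space
    text = html.unescape(text)        # turn &amp; and friends back into characters
    return WHITESPACE_RE.sub(" ", text).strip()
```

Skipping the call to `html_to_text` and saving the downloaded markup unchanged corresponds to "just remove the regular-expression-based code" when the raw HTML is wanted.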

I don't know the API for saving as an MHT file, but it sounds like Taka may know it. You could also write a plugin that hosts the MSHTML WebBrowser control that could Navigate() to the full web page content and SaveAs() an MHT. I've also seen code examples to save web pages to Compiled HTML Help (CHM) files.

If you don't want the action to be completely automatic ("2. Click a link"), then you could call very similar code as a "Send to" user tool.

support
Site Admin
Posts: 3073
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Full Article Content (Again)

Post by support » Tue Mar 10, 2009 8:13 am

kevotheclone wrote:I don't know the API for saving as an MHT file, but it sounds like Taka may know it.

You don't want to know. It's a Microsoft thing and it's seriously flaky :-( so much so that I had to extract it out into another process so it doesn't affect Awasu when it goes bananas.

Run gmht.exe (in the Awasu installation directory) like this:

GMHT.EXE {url} 0

and it will print out the path to a temp file that contains the MHT. The "0" represents the values described here: msdn.microsoft.com/en-us/library/ms526977(EXCHG.10).aspx
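A script could drive GMHT.EXE along the lines of the Python 3 sketch below. Several things here are assumptions rather than documented behaviour: the install path in `GMHT_PATH`, that the temp-file path arrives on stdout, and that it is the last non-blank line printed. The helper names are made up for illustration.

```python
import subprocess

# Assumption: a typical install location; point this at your own Awasu directory.
GMHT_PATH = r"C:\Program Files\Awasu\gmht.exe"


def build_gmht_command(url, flags=0, gmht_path=GMHT_PATH):
    """Build the argument list for: GMHT.EXE {url} {flags}."""
    return [gmht_path, url, str(flags)]


def parse_gmht_output(stdout_text):
    """Return the path printed by GMHT.EXE (assumed: the last non-blank line)."""
    lines = [line.strip() for line in stdout_text.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("GMHT.EXE printed no output")
    return lines[-1]


def fetch_mht(url, flags=0, gmht_path=GMHT_PATH, timeout=120):
    """Run GMHT.EXE for one URL and return the path of the MHT it created."""
    result = subprocess.run(
        build_gmht_command(url, flags, gmht_path),
        capture_output=True,   # collect stdout so the printed path can be read
        text=True,
        timeout=timeout,       # the tool is described as flaky, so don't wait forever
        check=True,            # raise CalledProcessError on a non-zero exit code
    )
    return parse_gmht_output(result.stdout)
```

Running it as a separate process with a timeout mirrors the reason given above for splitting the MHT code out of Awasu: if the tool misbehaves, only the child process is affected.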

kevotheclone wrote:If you don't want the action to be completely automatic "2. Click a link" then you could call very similar code as "Send to" user tool.

Hmm, that would be cool :cool: You're really getting into these Send to tools :-)

kevotheclone
Posts: 245
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Re: Full Article Content (Again)

Post by kevotheclone » Wed Mar 11, 2009 12:10 am

You're really getting into these Send to tools

Yes I am. Send to tools are yet another seriously cool 8) feature. Anytime you want to do something manually with a single feed item, think Send to tools.
They're a great complement to Awasu's automated features.
I've tested both HTTP-based and file-system-based (EXEs and/or scripts) Send to tools, and both work well, but parameter retrieval within an HTTP-based Send to tool is much easier because most web frameworks are designed to parse a query string.
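To illustrate why the HTTP route is easy, here is a minimal Python 3 sketch that pulls the parameters out of a request URL using only the standard library. The parameter names `url` and `title` and the localhost address are hypothetical; the real names are whatever you configure for your own Send to tool.

```python
from urllib.parse import parse_qs, urlsplit


def send_to_params(request_url):
    """Return the query-string parameters of a request as a {name: value} dict.

    Only the first value is kept when a parameter is repeated.
    """
    query = urlsplit(request_url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical request, as a Send to tool might receive it:
params = send_to_params(
    "http://localhost:8080/sendto?url=http%3A%2F%2Fexample.com%2Fstory%3Fid%3D1&title=Hello+World"
)
```

The decoding of `%3A`, `+` and so on is handled by `parse_qs`, which is the part a command-line tool would otherwise have to do by hand.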

As we've discussed in the past, in a few more months I'll have more time to focus on finalizing some of the Application Plugins, Channel Summary and Channel Report Templates, Send to tools, and some other Awasu goodies I've been working on. I want to have enough documentation available for each item to get a user up and running quickly. I finally got around to creating an AwasuWiki account last night, so when my Awasu extensions are ready I'll be ready to upload them.

Run gmht.exe (in the Awasu installation directory) like this...

Just tried it and it worked well. I've seen the gmht.exe sitting there and wondered what it was for, but I didn't make the connection between the exe name and the MHT file extension, and I haven't used Awasu in offline mode yet, so I didn't know that it created MHT files.

http://msdn.microsoft.com/en-us/library/ms526977(EXCHG.10).aspx

Wow, saving as an MHT file is in the CDO library, I never would have guessed it. I've used CDO in the past to send/retrieve email and access our Exchange server address books, I wouldn't have thought that saving an MHT file would be part of CDO.

Bottom line:
It looks like a Send to tool to retrieve the Full Article Content and save it as an MHT file wouldn't be that difficult to create. :coolthumb:

jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Re: Full Article Content (Again)

Post by jerrymartin » Wed Mar 11, 2009 5:03 pm

support wrote:
jerrymartin wrote:Am I totally missing something here?

Not really, there are currently no built-in features that do the things you describe :-)

The easiest way might be to write a channel hook that gets called every time new items arrive, downloads them to an MHT and then you can do whatever you want with it. Alternatively, if you let us know what you're trying to do, I can put something in for the next release.


Thanks for your reply.

The more I try to figure out how to do this the more I realize why it would be so difficult to do from a simple gui standpoint. You did give me the idea to print a channel report which is an MHT and then click all the links from there. At least that way it is more structured and I might be able to automate the tasks.

What I am trying to do is very similar to what was described in the other post regarding this subject. But I guess I want to have it both ways. I would like the plain text version so that I could have all my news read to me but I would also like to grab any photos or videos that might be displayed on the full content page as well.

I was a gui database programmer for many years but I don't know any of the scripting languages so doing my own hooks is an uphill battle.

Another thing I remembered and tried is printing the content page to a pdf file. This usually gets rid of all the background stuff. Of course you have to get to the full page first.

Thanks all for your thoughts.

jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Re: Full Article Content (Again)

Post by jerrymartin » Wed Mar 11, 2009 5:14 pm

kevotheclone wrote:Since you added (Again) to your subject I guess you've already seen the previous posts on a similar topic where I wrote a simple Python script that downloads the complete page referenced by the feed item and saves it as a text file, removing all HTML elements.

http://www.awasu.com/forums/viewtopic.php?t=7287

After I wrote the script and tested it, I forgot about it and later wondered why I had all of these text files in the root directory of my C:\ drive. Then I remembered this script. So it does work; in fact, it'll keep working for months on end.

The original poster on that thread wanted raw text to feed into a text-to-speech program, but if you want the raw HTML just remove the regular-expression-based code.

I don't know the API for saving as an MHT file, but it sounds like Taka may know it. You could also write a plugin that hosts the MSHTML WebBrowser control that could Navigate() to the full web page content and SaveAs() an MHT. I've also seen code examples to save web pages to Compiled HTML Help (CHM) files.

If you don't want the action to be completely automatic ("2. Click a link"), then you could call very similar code as a "Send to" user tool.


Yes I did read through that post but I think I got the feeling that nothing solid ever came of it, and seeing that I am not a script programmer I didn't know exactly how to implement all the code snippets. I'm going to reread it all and see if I can figure out what to do with it, but hopefully, in an effort to make Awasu more useful for the non-coders, the developers will start updating and adding hooks and plugins.

Awasu seems to be a great product but when you look at the very dated hooks/plugins you have to wonder how much this program is being used and supported. Thanks to the quick replies from support and the comments about adding to the product I am a little more confident about continuing to use this product.

Thanks again.

kevotheclone
Posts: 245
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Post subject: Re: Full Article Content (Again)

Post by kevotheclone » Fri Mar 13, 2009 7:26 am

Yes I did read through that post but I think I got the feeling that nothing solid ever came of it

True, I wrote the script to demonstrate that the original poster's request could be fulfilled by a Channel Hook. I did not have the need myself for a "save to text file" hook at that time. I don't work for Awasu, I'm just another user who has a full time job and a family (and I'd like to spend more time with my guitars :afro:), so I may not fully develop a robust solution for another user, but I may give them something that gets them 80% of the way there. The original poster didn't even reply, so what was I supposed to do to make it more "solid"? They asked for it, and then they abandoned it. I know there are a couple of changes I'd make to that Channel Hook to eliminate CSS and JavaScript that appear in the body of the document. I'd also add an option for whether or not to overwrite a file with the same name. After that I don't know what else to do with this other than to zip it with some instructions and upload it to Awasu's wiki.
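The two changes mentioned (dropping embedded CSS/JavaScript and not clobbering an existing file) could look something like this minimal Python 3 sketch. It is an illustration under assumptions, not the actual hook: the function names and the "name (1).txt" numbering scheme are invented here.

```python
import os
import re

# Matches <script>...</script> and <style>...</style> blocks, contents included.
SCRIPT_STYLE_RE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)


def drop_script_and_style(markup):
    """Remove script and style blocks so their contents never reach the text output."""
    return SCRIPT_STYLE_RE.sub("", markup)


def choose_output_path(path, overwrite=False):
    """Return path itself, or a numbered variant if it exists and overwrite is off."""
    if overwrite or not os.path.exists(path):
        return path
    stem, ext = os.path.splitext(path)
    counter = 1
    while os.path.exists(f"{stem} ({counter}){ext}"):
        counter += 1
    return f"{stem} ({counter}){ext}"
```

`drop_script_and_style` would need to run before any tag stripping; otherwise the tags disappear but the CSS and JavaScript between them are left behind as text.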

I've got some things going on in my personal life that will eat up a lot of my free time in the next couple of months; and a few more things after that, but I'm going to try to release, as open source, anything that I develop for Awasu that I think other users would like.

am not a script programmer

Are you skilled in any Windows-based languages, or are you a Unix, Mac or mainframe guy? If you've done some Windows programming, would you care to share the name of a language or two that you are familiar with? Have you done any web development?

how much this program is being used and supported

I don't know how many Awasu users there are, but I do think Awasu is supported very well. Taka is very quick to reply to users in the forums and every time I've emailed him. Look what's happened in the last week... Thanks to your request Taka shared how to automate a save to an MHT file (without using VB's SendKeys statement, something I did in the past).

make Awasu more useful for the non-coders

There are a lot of feed readers on the market, but very few that are as extensible as Awasu. You may find some feed readers that have a feature or two that are a few mouse clicks away instead of a plugin away, but you may quickly reach the limit of what you can do with them. It's rare to find a product that has every feature that you could possibly want; with Awasu and a little coding skill you can usually fill in the gaps.

I actually started writing (offline) a reply to your previous posts, I'll take another look at it and try to post it soon.

Stay tuned...

kevotheclone
Posts: 245
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Post subject: Re: Full Article Content (Again)

Post by kevotheclone » Fri Mar 13, 2009 7:42 am

On Programming Languages
You don't have to know any specific scripting languages to program with Awasu, I'm not a skilled Python programmer but for extending Awasu I'm making a point of learning Python because 1) I always wanted to but didn't have a driving reason to use it, 2) it is a fairly simple and at the same time powerful language with some very advanced libraries available and 3) Taka seems to know it, so I might be able to get some help from him and/or anything I create/share he might be able to tweak/support if "something unsavory" were ever to happen to me.

You can see that Python's source code looks almost BASIC-like, so it is usually simple to read. People like it because you can usually get right to the heart of problem solving without having to build a bunch of "scaffolding" just to start writing a simple program.

You can use any language you like as long as it can read Windows INI files, read from stdin, write to stdout and stderr, and read command-line arguments. Beyond that, your language needs to be able to do whatever you need it to do: read/write to the file system, read/write to a database, communicate over HTTP/FTP/POP3/SMTP, perform regular expressions, parse HTML/XML, etc.


"But I guess I want to have it both ways"

You probably can have it both ways. It'll take some work, but it's probably possible in the Awasu realm.

On Channel Hooks
One of the beautiful things about Channel Hooks is that you can attach multiple Channel Hooks to the same channel. Each one can perform a specific function (save to text for TTS conversion, save to MHT, extract images, etc.). Each Channel Hook can react to the same event or to different events; it's all up to how you specify it in the .hook file.


"doing my own hooks is an uphill battle"

It's always an uphill battle when you start something new. I was in your shoes 6 months ago. Remember, when you first started programming everything was an uphill battle; with Awasu it's going to be like that for a little while, as there's a lot to learn. Roll up your sleeves, get your hands dirty, stub your toe a few times and you'll be fine. The first time I read how a Channel Hook was invoked, I thought it was a little Rube Goldberg-ish (http://en.wikipedia.org/wiki/Rube_Goldberg). "I create a .hook file so that Awasu knows to call my Channel Hook program for certain event(s) and then Awasu is going to pass into my Channel Hook program an INI file with information about the event." My head was spinning, but you read it a couple of times and it starts to make sense, and it ends up being really easy to create a Channel Hook. As Taka recommends: "Attach the supplied LogChannelActivity channel hook to your channel to examine how Awasu invokes hooks for each type of event." LogChannelActivity will help you understand how Channel Hooks are invoked and what data is provided in the INI file.
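The receiving end of that hand-off might be sketched in Python 3 as below. Nothing about the INI layout is confirmed in this thread, so treat these as loud assumptions: that the INI file's path arrives as the first command-line argument, that the file is readable as UTF-8, and that it parses as a standard INI file. The sketch simply dumps whatever sections and keys it finds, much as LogChannelActivity lets you inspect them.

```python
import configparser
import sys


def load_event(ini_text):
    """Parse INI text into a plain {section: {key: value}} dictionary."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str          # keep key names exactly as written
    parser.read_string(ini_text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


def main(argv):
    # Assumption: the INI file path is argv[1]; check how Awasu really passes it.
    if len(argv) < 2:
        print("usage: hook.py <event.ini>", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8", errors="replace") as handle:
        event = load_event(handle.read())
    for section, values in event.items():
        print(section, values)        # a real hook would act on the event here
    return 0
```

A script built on this would finish with `sys.exit(main(sys.argv))`. Interpolation is switched off because URLs often contain `%` characters, which configparser would otherwise try to expand.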

Your Original Request
I'm not 100% sure what you want to do. But I'll grab some of your words and specifically try to find a solution.


Problem:
"I want to see the full content of a feed without clicking on the summary and going to the publishers website"
Awasu already does display the "full content of a feed", I think you mean the full content of the web page referenced by the feed item.

Possible solution:
Add an <iframe> element to the defaultItemBody.include file; size it appropriately.
Example:
<iframe src="{%ITEM-METADATA% url encode=sgml chars=<&\"}" width="100%" height="500px"></iframe>

It may not be pretty (headers, footers, ads), but it works. If this is roughly what you want, create a totally new Channel Summary Template that has very little other information, as it can start to look cluttered. Also, Internet Explorer will probably complain with its yellow "Information Bar", so you'll need to OK it each time or turn it off by unchecking "Enable enhanced IE security" in the "Display" section of the "Options" dialog.

I'm no expert but this technique should also work in Channel Report, Search Channel and Search Result Templates.


Problem:
"I want to be able to send it via email, ftp, sql database or whatever which I can use the plug ins to accomplish"
By "it" I guess you mean the full page content. By "send it" I'm not sure if you mean automatically "send it" for every item in the channel or on a one-by-one basis after you read the article.

Possible solution:
If "send it" means automatically, use one or more Channel Hooks; if it means one-by-one, use Send to tools. If you are using your own Send to tools then make sure that you have the "Send to" HTML & template parameters defined in your Channel Summary Template (everything between <!-- BEGIN SEND-TO --> and <!-- END SEND-TO -->). Also, if it's the one-by-one Send to approach you want, the {%ITEM-METADATA% url encode=sgml chars=<&\"} template parameter is exactly what you would pass to your Send to tool to get the URL of the web page referenced by the feed item. If it's the Channel Hook route you want to go then getting the URL would be a little different.


"The more I try to figure out how to do this the more I realize why it would be so difficult to do from a simple gui standpoint."


You've seen the full text channel hook I previously posted (I just counted, it weighs in at a whopping 73 lines including comments and blank lines. ;-)):
http://www.awasu.com/forums/viewtopic.p ... highlight=

That's the first and only Channel hook I've written, I'll write more someday soon. A similar Save As MHT channel hook would be even easier:
Get the URL from Awasu, pass it to GMHT.EXE, get the file name returned by GMHT.EXE, move the file to where you want it or send it via email.
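Those steps can be sketched in Python 3 roughly as follows. The step that actually produces the MHT is passed in as a function (`make_mht`), since exactly how GMHT.EXE is called and how its output is read is up to you; the function names and the file-naming scheme are invented for illustration.

```python
import os
import re
import shutil
from urllib.parse import urlsplit


def filename_for(url):
    """Derive a safe .mht file name from a URL (illustrative naming scheme)."""
    parts = urlsplit(url)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", parts.netloc + parts.path).strip("-")
    return (slug or "page")[:80] + ".mht"


def save_page_as_mht(url, dest_dir, make_mht):
    """Create an MHT for url via make_mht(url) and move it into dest_dir.

    make_mht is any callable that returns the path of a freshly created MHT,
    for example a wrapper that runs GMHT.EXE and reads the path it prints.
    """
    temp_path = make_mht(url)
    if not os.path.isfile(temp_path):
        raise FileNotFoundError(f"no MHT was produced for {url}")
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, filename_for(url))
    shutil.move(temp_path, target)    # emailing the file instead would go here
    return target
```

Keeping the MHT generator as a parameter means the file handling can be tried out with a dummy function before GMHT.EXE is involved at all.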

Now come down off that ledge, before you hurt someone, you've got some Awasu extensions to work on.
Hell, come down off that ledge, give me a week and an extra large Awasu T-shirt and I'll code the "Save as MHT" Channel Hook and Send to tool, for you.
As for your "full web page" templates, that's kind of personal, like your boxer shorts, I think you and I both would prefer you to handle those yourself. :lol:

support
Site Admin
Posts: 3073
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Post subject: Re: Full Article Content (Again)

Post by support » Fri Mar 13, 2009 12:59 pm

kevotheclone wrote:<iframe src="{%ITEM-METADATA% url encode=sgml chars=<&\"}" width="100%" height="500px"></iframe>

Holy cow, that's cool! I had thought about iframes when reading his original post but got stuck trying to figure out how to unpack the MHT content into the page :doh:

jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Re: Post subject: Re: Full Article Content (Again)

Post by jerrymartin » Fri Mar 13, 2009 5:50 pm

True, I wrote the script to demonstrate that the original poster's request could be fulfilled by a Channel Hook. I did not have the need myself for a "save to text file" hook at that time. I don't work for Awasu, I'm just another user who has a full time job and a family (and I'd like to spend more time with my guitars :afro:),

I hope I didn't imply that you were slackin, and there is nothing more important than spending time with your family and hobbies. In fact, that is the whole point in trying to automate this task. I have a two year old and a wife I would like to spend time with instead of sitting here and copying and pasting my life away.

I've got some things going on in my personal life that will eat up a lot of my free time in the next couple of months; and a few more things after that, but I'm going to try to release, as open source, anything that I develop for Awasu that I think other users would like.


I would like to contribute too once I get my own tasks done.

am not a script programmer

Are you skilled in any Windows-based languages, or are you a Unix, Mac or mainframe guy? If you've done some Windows programming, would you care to share the name of a language or two that you are familiar with? Have you done any web development?


I was a 4GL, RAD, SQL, client/server type of programmer for about 15 years. I used to program using a RAD product called Omnis Studio (omnis.net if you want to look at it); it's similar to 4th Dimension or FileMaker Pro (on a bottle of steroids). You don't hear much about Omnis because the only people that can afford it are the Fortune 500 like Disney, Hughes, Raytheon, NIH, the Gov, etc. They all have their own languages. Bottom line is I know a little HTML and SQL, and I am working on learning PHP, JavaScript, and maybe some Ajax if I have time. So learning Python wasn't very high on my list, but I will learn what I have to to get the job done.

how much this program is being used and supported


I don't know how many Awasu users there are, but I do think Awasu is supported very well. Taka is very quick to reply to users in the forums and every time I've emailed him.


I can see that, so as long as I can see there are some bodies out there that plan to keep the product moving forward and making it better, I'm happy.

make Awasu more useful for the non-coders

There are a lot of feed readers on the market, but very few that are as extensible as Awasu. You may find some feed readers that have a feature or two that are a few mouse clicks away instead of a plugin away, but you may quickly reach the limit of what you can do with them. It's rare to find a product that has every feature that you could possibly want; with Awasu and a little coding skill you can usually fill in the gaps.

I have seen most of the other readers and I do agree that Awasu is way more advanced than the others. That's why I am here.

I actually started writing (offline) a reply to your previous posts, I'll take another look at it and try to post it soon.

Stay tuned...


I'm tuned in and thanks for your help.

Jerry

jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Re: Post subject: Re: Full Article Content (Again)

Post by jerrymartin » Fri Mar 13, 2009 6:34 pm

kevotheclone wrote:On Programming Languages
You don't have to know any specific scripting languages to program with Awasu, I'm not a skilled Python programmer but for extending Awasu I'm making a point of learning Python because 1) I always wanted to but didn't have a driving reason to use it, 2) it is a fairly simple and at the same time powerful language with some very advanced libraries available and 3) Taka seems to know it, so I might be able to get some help from him and/or anything I create/share he might be able to tweak/support if "something unsavory" were ever to happen to me.

You can see that Python's source code looks almost BASIC-like, so it is usually simple to read. People like it because you can usually get right to the heart of problem solving without having to build a bunch of "scaffolding" just to start writing a simple program.

You can use any language you like as long as it can read Windows INI files, read from stdin, write to stdout and stderr, and read command-line arguments. Beyond that, your language needs to be able to do whatever you need it to do: read/write to the file system, read/write to a database, communicate over HTTP/FTP/POP3/SMTP, perform regular expressions, parse HTML/XML, etc.

I'll try and absorb all of this and get back to you. :D

"But I guess I want to have it both ways"

You probably can have it both ways. It'll take some work, but it's probably possible in the Awasu realm.

I believe so too, and I'll be working on it.

On Channel Hooks
One of the beautiful things about Channel Hooks is that you can attach multiple Channel Hooks to the same channel. Each one can perform a specific function (save to text for TTS conversion, save to MHT, extract images, etc.). Each Channel Hook can react to the same event or to different events; it's all up to how you specify it in the .hook file.

I'm looking in to it all.

Your Original Request
I'm not 100% sure what you want to do. But I'll grab some of your words and specifically try to find a solution.


Awasu already does display the "full content of a feed", I think you mean the full content of the web page referenced by the feed item.


Yes, the feed usually only contains the link and an excerpt. Clicking on the link takes you to the web page with the whole article. I'd be somewhat happy with just that but what I am after is just the text for the article and not everything else on the page. I would imagine just getting the text for the article would be pretty hard to do but I can hope can't I?

Possible solution:
Add an <iframe> element to the defaultItemBody.include file; size it appropriately.
Example:
<iframe src="{%ITEM-METADATA% url encode=sgml chars=<&\"}" width="100%" height="500px"></iframe>

It may not be pretty (headers, footers, ads), but it works. If this is roughly what you want, create a totally new Channel Summary Template that has very little other information, as it can start to look cluttered. Also, Internet Explorer will probably complain with its yellow "Information Bar", so you'll need to OK it each time or turn it off by unchecking "Enable enhanced IE security" in the "Display" section of the "Options" dialog.

I'm no expert but this technique should also work in Channel Report, Search Channel and Search Result Templates.


Problem:
"I want to be able to send it via email, ftp, sql database or whatever which I can use the plug ins to accomplish"
By "it" I guess you mean the full page content. By "send it" I'm not sure if you mean automatically "send it" for every item in the channel or on a one-by-one basis after you read the article.

Possible solution:
If "send it" means automatically, use one or more Channel Hooks; if it means one-by-one, use Send to tools. If you are using your own Send to tools then make sure that you have the "Send to" HTML & template parameters defined in your Channel Summary Template (everything between <!-- BEGIN SEND-TO --> and <!-- END SEND-TO -->). Also, if it's the one-by-one Send to approach you want, the {%ITEM-METADATA% url encode=sgml chars=<&\"} template parameter is exactly what you would pass to your Send to tool to get the URL of the web page referenced by the feed item. If it's the Channel Hook route you want to go then getting the URL would be a little different.


I guess I'd have to say both ways again. I would like to review the articles before sending to MySQL or email, but I would also like to send it automatically. I think I should be able to do this with a channel report, but the channel report needs to contain the full content of the article.

"The more I try to figure out how to do this the more I realize why it would be so difficult to do from a simple gui standpoint."


You've seen the full text channel hook I previously posted (I just counted, it weighs in at a whopping 73 lines including comments and blank lines. ;-)):
http://www.awasu.com/forums/viewtopic.p ... highlight=

I'm still trying to make time to give it the time it deserves.

That's the first and only Channel hook I've written, I'll write more someday soon. A similar Save As MHT channel hook would be even easier:
Get the URL from Awasu, pass it to GMHT.EXE, get the file name returned by GMHT.EXE, move the file to where you want it or send it via email.

I'll give this a try.

Here's a thought I had too. If it were easier to access all of the linked files, this would be a pretty good start. Going to the channel summary to read the offline files isn't efficient. If you had an option to list all offline files, and maybe the option to review and edit those files and then send them to a database or an email, this might be a step in the right direction.

BTW, does Awasu store the XML files somewhere, and where does it store the linked-to pages with the full content?

Now come down off that ledge, before you hurt someone, you've got some Awasu extensions to work on.
Hell, come down off that ledge, give me a week and an extra large Awasu T-shirt and I'll code the "Save as MHT" Channel Hook and Send to tool, for you.
As for your "full web page" templates, that's kind of personal, like your boxer shorts, I think you and I both would prefer you to handle those yourself. :lol:


Hang on, I have to go ask my wife if she would be okay with me sending my boxers to someone on the internet. Hmmm. Maybe I'll just send hers, they'll fit in a standard sized envelope. :wink:

Thanks for all your input.

JerryMartin

support
Site Admin
Posts: 3073
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Post subject: Re: Full Article Content (Again)

Post by support » Fri Mar 13, 2009 10:43 pm

jerrymartin wrote:what I am after is just the text for the article and not everything else on the page. I would imagine just getting the text for the article would be pretty hard to do but I can hope can't I?

Have you seen the WebScrape plugin?

It only works from a single web page, not linked-to items in a feed, but you could still cobble something together that did, e.g. write a channel hook that got called every time a new item arrived, generated an INI file that WebScrape.exe can read, then invoked it.

jerrymartin
Posts: 27
Joined: Mon Mar 09, 2009 6:29 pm

Re: Post subject: Re: Full Article Content (Again)

Post by jerrymartin » Sat Mar 14, 2009 12:22 am

support wrote:
jerrymartin wrote:what I am after is just the text for the article and not everything else on the page. I would imagine just getting the text for the article would be pretty hard to do but I can hope can't I?

Have you seen the WebScrape plugin?

It only works from a single web page, not linked-to items in a feed, but you could still cobble something together that did, e.g. write a channel hook that got called every time a new item arrived, generated an INI file that WebScrape.exe can read, then invoked it.


I'll take a look at it over the next few days; the first couple of tries didn't work too well.

Thanks for your input.

JerryMartin

kevotheclone
Posts: 245
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Re: Full Article Content (Again)

Post by kevotheclone » Tue Mar 24, 2009 8:27 am

Hi Jerry, if you're still out there I just posted a "Save as MHT" Send to tool to the "Plugins" forum. You can find it here: http://www.awasu.com/forums/viewtopic.php?p=13973#13973

All of the instructions to set it up are in the post. I wrote it in VBScript because I know the VB family of languages better than I do Python and I knew I didn't have a lot of time right now to write it. Someday after April 15th I'll rewrite it in Python, compile it to a standalone EXE, zip it up with instructions and upload it to Awasu's Wiki, but for now it's functional if you take the time to copy and paste the relevant text out of the post and into a text file with a VBS file extension.

"I hope I didn't imply that you were slackin"

I didn't take offense at your comment, but since you're new here I wanted you to understand that unless the poster's name is "support" it's just another user like you.

"I have a two year old and a wife"

Congratulations! You don't just know busy, you ARE busy. If you live in the US, you might be interested in the feeds listed here: http://www.cpsc.gov/cpscpub/prerel/prerel.html and http://www.fda.gov/oc/rss/

By the way, if you ever need to explain to your wife why you spent your hard-earned money on the Advanced or Professional version of Awasu, it was so you could more closely monitor the recalled food and toys listed on these feeds using Awasu's advanced search capabilities and send text messages to your phone using Awasu's Email Channel Hook. You were only thinking of your family's well being; purchasing Awasu was an investment in your family's future.

"...some bodies out there that plan to keep the product moving forward and making it better..."

I've got some things in the works that I will release as open source by this summer, if not sooner. Whether anyone finds them useful or not is another story. Ideally when I release something I want to have enough documentation that it doesn't become a burden on Taka to support, and I'd like to be able to monitor the forums and support it myself, that's why right now is not a good time.

I'll try to modify this Send to tool into a Channel Hook some time soon; all I need to do is change the way that it receives the parameters.
