awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Any Python programmer want collaboration on Awasu?

Post by awasu.user » Sun Dec 29, 2019 4:56 pm

As in the title. I love both Python and Awasu. I'm looking for someone interested in joining forces on an unpaid hobby project to build two things in Python:

webapi interface
feed generator ecosystem (plugins)

for Awasu. The goal is to make it available free of charge to anyone interested. I have a base to start from, but for personal (family) reasons I do not have time to polish and improve it.

What do I have?

Web GUI:
It is based on Flask. A file generated by Awasu is loaded into it, and API calls fetch the extra information needed for the folder structure. Pandas is the backend for data operations, and I'm now going to add multithreading to the loading step. When you browse articles, which are grouped by folder as in Awasu, you can add them to a list that you can save to disk.

Feed generation ecosystem:
I have an asyncio (aiohttp) skeleton for fetching data, and a synchronous skeleton of a library, based on requests, for using it. The first synchronous version of the library auto-detects articles to extract data such as the title, publishing date, etc.

Is anyone interested in collaborating? If you have questions, feel free to ask.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Mon Dec 30, 2019 2:20 am

This sounds great :clap:

Have you put the source code up online somewhere e.g. Github? That will make collaboration much easier. I know kevotheclone would be very interested in taking a look at what you've done...

Also, what do you want to do with this e.g. what new features do you have in mind?

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Mon Dec 30, 2019 10:00 am

I'm going to put the code on GitHub. First I have to move the hard-coded paths into a separate config file and test that everything still works after updating to Python 3.8.1 and switching the OS to Windows 10. I will post in this thread when it's up.
support wrote:
Mon Dec 30, 2019 2:20 am
Also, what do you want to do with this e.g. what new features do you have in mind?
That's a really good question. It all depends on what other people need.

As a basic plan, I want to create a simple analytics board. You get the data from Awasu and browse channels and articles in your web browser (this is implemented). You can use Awasu's Search Agents feature to get articles on topics that interest you (this is done too, provided you follow the naming convention for Search Agents). The harder part is adding real-time search. It is currently implemented in JS, so you can change one line of code and get filtering in the JS library I use. It would be better to use something like Elasticsearch and add support for advanced operators such as AND and OR, plus filtering based on categories (folders) from Awasu.

What is missing, and what I want to add, is an auto-generated list of hot topics with a list of new articles. It would be amazing: you add your feeds and see what is most popular in them, not on the whole web, independent of giants like Google. You choose what you like and you see what is going on in your channels. Another thing is to add trends: what has changed since your last visit, or maybe over a longer period (this is open to discussion). A further step is to add article suggestions: you select the interesting ones, and the application suggests what else could interest you, based on what you marked as attractive.

The other thing is adding data to Awasu using a channel generation framework. I have found a lot of sources that interest me, but they need scraping and coding to get the data. I want to make this simple: you define what you want and get a feed as the result, and maybe an offline version. My first implementation created ebooks for Kindle and other ebook readers. I'm thinking about using it as a system to archive data and make it ready for offline reading.

In summary:

I have done a lot of preparation for this project. I learned Python from scratch, started with the basics of machine learning, bought books, and learned a lot. I think that at some point in the future I could finish it all alone, but time matters. I think another point of view and cooperation can be good for everyone.

I'm now recovering from a crash on my PC. I lost part of my work and I'm repairing what I can. It motivated me to show what I have to the public.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Mon Dec 30, 2019 12:27 pm

This is all very cool. One of the key features of Awasu's design is its extensibility, to make it easy to build things like this on top of it, so it's great that you've done this.

I'm going to be very busy over the next few months on another project, but I'm happy to review what you've done and make suggestions, and I'm sure kevotheclone will be interested in participating as well. Get it up on Github, and we can take it from there!

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Mon Dec 30, 2019 2:09 pm

So let's begin. The project is on GitHub: https://github.com/js-compilatrum/awasu_news_browser. I made some changes, because I had hard-coded paths. The readme explains how to get started. I have a CPython issue in my IDE, and for now I'm not using a virtual env because I can't install pandas inside it. I have to use the globally available packages, so I filled in requirements.txt manually. It should not make a difference. I lost part of the code, so I have to re-add type hints from scratch and revise the code. What I have done so far is working.

Today I tested the app on a limited portion of my RSS feeds. That is about 4500 channels, and it looks OK.

Available options:
ALL - browse all channels
FAV - load channels from favorite list "ch_list_process.txt"
SA - Search Agents data
Q10 / Q25 - limit articles per channel to 10 / 25

Clicking on a row adds the article to the selected list. You can save the selected articles to a file, browse them in a new tab, or remove them from the list. The templates are hard-coded for the 15" Full HD screen on my laptop. So the base skeleton exists; some modifications are needed.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Wed Jan 01, 2020 8:55 am

I've started to take a look at this. Here's some feedback on what I've done so far:

(*) The command to install requirements is:

Code: Select all

pip install -r requirements.txt
(*) json is a built-in module, and doesn't need to be in requirements.txt.

(*) I am in the camp that believes that third-party modules should be pinned to a specific version. If you just specify module names in your requirements.txt, pip will install the latest versions of everything, which your code may or may not work with. If you explicitly specify versions for each dependency, people will get the versions that you have, and have tested against. Run pip freeze to find out what versions of everything you have.

(*) pip wasn't able to install numpy (as a dependency of pandas). This appears to be a bug in pip, I worked around this by including numpy in requirements.txt:

Code: Select all

numpy==1.18.0
(*) In the read-me, the section that talks about config.py has formatting problems (missing newlines). It's not a bad idea to include an example file called (say) config.py.example, and tell people to rename it and edit it.

(*) At manipulators.py:65, the path to Awasu's config file is hard-coded. The correct way to locate this file is to call $/userInfo, and you will get the location of all the user's data files.

(*) Awasu always uses UTF-8, so when you open the config file, include an encoding="utf-8" parameter.

(*) At manipulators.py:292, the Awasu API token is hard-coded. You should probably also make the http://localhost:2604 bit configurable (like LOCALHOST_URL), since the user can change the port number.

(*) You are running a report called ALL_CSV. This report name should be configurable. The code gets stuck in an infinite loop, probably because I don't have this report. Since there don't seem to be any instructions on how this report should be set up, I am now stuck.
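One way to avoid hanging when the report is missing is to put an upper bound on the wait. The sketch below is only an illustration: `wait_for_report`, its parameters and the polling approach are assumptions, not code from the repository.

```python
import os
import time


def wait_for_report(path, timeout=30.0, poll_interval=0.5):
    """Wait until `path` exists, giving up after `timeout` seconds.

    Returns True if the file appeared, False if the wait timed out.
    (Hypothetical helper; the name and defaults are assumptions.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.isfile(path):
            return True
        if time.monotonic() >= deadline:
            # Give up instead of looping forever; the caller can then
            # tell the user which report file is missing.
            return False
        time.sleep(poll_interval)
```

The caller can then print a message naming the missing file and exit cleanly, rather than spinning.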

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Wed Jan 01, 2020 9:27 am

Here's a bit of code that will make things easier. It's a function that simplifies calling the Awasu API:

Code: Select all

import requests
import json

AWASU_API = "http://localhost:2604" # nb: get this from config.py
TOKEN = "abc123"

def call_awasu( api_name, **kwargs ):
    """Call the Awasu API."""
    
    # prepare the arguments
    kwargs["format"] = "json"
    if TOKEN:
        kwargs["token"] = TOKEN
    post_data = kwargs.pop( "post_data", None )
        
    # send the request
    url = "{}/{}".format( AWASU_API, api_name )
    if post_data:
        print( "POST:", url, kwargs )
        resp = requests.post( url, params=kwargs, data=post_data )
    else:
        print( "GET:", url, kwargs )
        resp = requests.get( url, params=kwargs )
    return json.loads( resp.text )

# simple call        
print( call_awasu( "buildInfo" ) )
print()

# call with arguments
print( call_awasu( "search/query", query="foo bar", max=3 ) )
print()

# call with POST data
print( call_awasu( "channels/create", post_data="""
    <channel type="standard">
        <feedUrl> https://awasu.com/news.xml </feedUrl>
    </channel>
""" ) )
This makes it much easier to locate the user's config file. During startup, get the user's info:

Code: Select all

user_info = call_awasu( "userInfo" )[ "userInfo" ] 
Then it's easy to find out where their config file is:

Code: Select all

print( user_info[ "userConfigFile" ] )
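Combined with the earlier point about encodings, reading the config file might look something like the sketch below. `read_user_config` is a hypothetical helper, and it only returns the raw text, since nothing here assumes a particular file format.

```python
def read_user_config(user_info):
    """Return the text of the user's Awasu config file.

    `user_info` is the dict obtained from the userInfo API call;
    only its "userConfigFile" key is used.
    (Hypothetical helper, not part of the Awasu API.)
    """
    config_path = user_info["userConfigFile"]
    # Awasu always writes UTF-8, so be explicit rather than relying
    # on the platform's default encoding.
    with open(config_path, encoding="utf-8") as fp:
        return fp.read()
```

This replaces the hard-coded path: the location comes from Awasu itself, so it keeps working if the user's data files live somewhere else.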

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Wed Jan 01, 2020 10:33 pm

Thank you for your suggestions. Some parts of this code are two years old and have to be polished. I'm very busy at the moment, so I will simply post a quick explanation of how to start it here before I add it to GitHub. Over the next few days I will update the repo with more information and incorporate your suggestions.
support wrote:
Wed Jan 01, 2020 8:55 am
(*) You are running a report called ALL_CSV. This report name should be configurable. The code gets stuck in an infinite loop, probably because I don't have this report. Since there don't seem to be any instructions on how this report should be set up, I am now stuck.
ALL_CSV:
Template: https://github.com/js-compilatrum/awasu ... N.template

Channel filter: all
Check "Group items by channels"
Include items: Unread items only

Output file: e.g. D:\unread.json

Config.py

Code: Select all

EXIST_DATA = True
PUB_DIR = r'D:\[repo_copy_location]\static\pub'  # Where Flask saves selected articles
NEWS_PER_PAGE = 500  # Articles per page
LOCALHOST_URL = "http://127.0.0.1:5000"  # Use default from Flask serving or unicorn

BASE_SETTINGS_DIR = r'D:\settings'  # Settings dir
'''
In this dir, create a txt file, e.g. ch_list_process.txt, and put in it the names of the channels to use, e.g.

Box Office Mojo - Current Box Office Results
Box Office Mojo - Top Stories
CNS Movie Reviews
ComingSoon.net
Latest Movie Metascores on Metacritic
Movies – NewNowNext
New Movies In Theaters - IMDb

'''
CONCEPT_DEBUG = False  # Debug: set to True to use CSV_TEST_DATA
CSV_BASE_SOURCE = r'D:\unread.json'  # Articles saved from Awasu are stored here - copy the value from the report config in the GUI
CSV_TEST_DATA = r'D:\concept_debug.json'  # Test data is stored here
OUTPUT_DIR = r'D:\selected_articles'  # Where selected articles are saved as txt and csv
# Security
TOKEN = "abc123"
ch_list_process.txt is a file containing the names of channels in Awasu (Properties > Details > Name) to open as favorite channels.

Problem with template
I use

Code: Select all

"published": "{%ITEM-METADATA% timestamp noCaption}",
to get the date without relative formats such as hour only, "yesterday", etc., but it is not working. Having a timestamp or a date in one consistent format would simplify selecting by dates.

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Wed Jan 01, 2020 10:37 pm

support wrote:
Wed Jan 01, 2020 9:27 am
Here's a bit of code that will make things easier. It's a function that simplifies calling the Awasu API:
It's a very good idea. When I started, a hard-coded URL was simpler, but it has to change, as I made some workarounds. For example, building the directory tree for channels (so that when I select a folder I get all its children) has to be updated because it has mismatches, but that is one of many things to work on here.

Information
This may be a clue for working with my application. I'm checking the Awasu logs and using the API to generate reports. I could look into how to check the update status from Awasu (pending updates, finished updates, idle time), but for now the easiest way is to wait while Awasu does its work and then click the GUI button. I use the update when I have been working for a long time, to refresh the data.

The naming is not accurate, because it suggests CSV, but my changes from this week are JSON-based only.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Fri Jan 03, 2020 12:41 pm

With the ALL_CSV report created, the webapp now starts up, and it looks pretty good :-)

Some more comments:

(*) If the CSV_TEST_DATA file doesn't exist, the program gets into an infinite loop.

(*) Clicking on the [SA] link causes the server to crash. In show_page(), articles_data is [], which means indexes becomes [], and we crash on:

Code: Select all

end_loc = indexes[len(indexes) - 1]
(*) Don't auto-launch a new browser. It's very annoying to have a new browser window open every time I make a change and restart the server.
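For the [SA] crash, a guard on the empty case would be enough to keep the server up. This is a minimal sketch, assuming `indexes` is a list or another sized sequence; `last_index` is a hypothetical helper, and the real show_page() may want to handle the empty case differently (e.g. by rendering an empty page).

```python
def last_index(indexes):
    """Return the last element of `indexes`, or None if it is empty."""
    # indexes[len(indexes) - 1] evaluates to indexes[-1], which raises
    # IndexError on an empty list, so check the length first.
    if len(indexes) == 0:
        return None
    return indexes[-1]
```

With this, `end_loc = last_index(indexes)` yields None when there are no articles, and the caller can branch on that.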

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Fri Jan 03, 2020 11:44 pm

support wrote:
Fri Jan 03, 2020 12:41 pm
(*) Clicking on the [SA] link causes the server to crash. In show_page(), articles_data is [], which means indexes becomes [], and we crash on:
This is a problem with how I work: I always have Search Agents. You have to use the naming convention "M > [name of search agent]" to make it work. It doesn't implement additional functionality; it's a basic tool for getting the articles selected by Search Agents.

It should now work without agents. I switched off the default assumption that data is available, to check it. (In my workflow I always have some data, so it is not a problem, and that is why I set up EXIST_DATA in the config.)
I'm starting to implement TDD for my code.
support wrote:
Fri Jan 03, 2020 12:41 pm
(*) If the CSV_TEST_DATA file doesn't exist, the program gets into an infinite loop.
The CONCEPT_DEBUG setting (True / False) in config.py controls this behaviour.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Sat Jan 04, 2020 1:48 am

awasu.user wrote:
Fri Jan 03, 2020 11:44 pm
You have to use the naming convention "M > [name of search agent]" to make it work.
I had tried that, but it still crashed.

Is there some reason for this convention i.e. why wouldn't someone want all their search agents available?

Maybe you could have something like this:

Code: Select all

SEARCH_AGENT_PREFIX = "M >"
to mean that only search agents that start with "M >" will be included. By default, this should be the empty string, so all search agents will be included. But if someone wants to restrict what is included, then they can set it.
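A sketch of how that filter could work is below. `filter_search_agents` is a hypothetical name, and it assumes the search agent names are already available as a list of strings.

```python
SEARCH_AGENT_PREFIX = ""  # e.g. "M >" to restrict; empty means "include all"


def filter_search_agents(names, prefix=SEARCH_AGENT_PREFIX):
    """Return the search agent names that start with `prefix`."""
    # str.startswith("") is always True, so an empty prefix keeps everything.
    return [name for name in names if name.startswith(prefix)]
```

The empty-string default needs no special-casing, which is why it makes a convenient "no restriction" value.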
awasu.user wrote:
Fri Jan 03, 2020 11:44 pm
In my workflow I always have some data, so it is not a problem, and that is why I set up EXIST_DATA in the config.
Things should work "out of the box" i.e. I do a git clone, run it, and everything should work. If possible, you shouldn't need to set a flag to indicate that there is no data, just check if the file is there before reading it, and if not, start up with an empty data set. Maybe print a warning saying that the file is missing, please run the report.
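Something along these lines would do it. This is only a sketch: `load_articles` is a hypothetical name, and it assumes the report output is a JSON file and that an empty list is an acceptable "no data" value for the rest of the app.

```python
import json
import os
import sys


def load_articles(path):
    """Load the report file, or return an empty data set if it is missing."""
    if not os.path.isfile(path):
        # Warn and carry on, rather than crashing or looping.
        print(
            "Warning: {} is missing, please run the report.".format(path),
            file=sys.stderr,
        )
        return []
    with open(path, encoding="utf-8") as fp:
        return json.load(fp)
```

With a check like this, no flag is needed to say whether data exists; the program finds out for itself at startup.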

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Sat Jan 04, 2020 9:43 pm

support wrote:
Sat Jan 04, 2020 1:48 am
I had tried that, but it still crashed.
Try:
1. Ctrl+N
2. Generated from search results
3. Search for: "Africa or Europe"
4. Name: "M > Continents"
5. Click Next to finish adding.
support wrote:
Sat Jan 04, 2020 1:48 am
Is there some reason for this convention i.e. why wouldn't someone want all their search agents available?
You always get what you want, including Channels. Remember that I'm opening up a closed project that only I used, and I took some shortcuts. The primary goal here is the shortest possible time to get results in the simplest way. To reduce the API calls that slow down showing results, the convention is based on naming, but you do not have to follow it. You have 3 options here:

1. Use names that start with the prefix "M >" (as I described above)
2. Put all search agents in a folder, e.g. My Search; they should all show up when you click the "My Search" menu option
3. Set up in config.py:

Code: Select all

 BASE_SETTINGS_DIR = r'D:\py-module\settings'
then put the names of the search agent channels in a text file named "D:\py-module\settings\searchagents.txt" and, in the file base.html, add this inside {% macro generate_nav(menu_folders) %}:

Code: Select all

<a href="{{ url_for('show_page', mode='favorite', option='searchagents', page=1) }}" class="btn_root">[Search Agents]</a>
support wrote:
Sat Jan 04, 2020 1:48 am
Things should work "out of the box" i.e. I do a git clone, run it, and everything should work.
And that is the point. I'm opening up my code so that people get a basic configuration. Some shortcuts were made for simplicity or as a window for future development. The EXIST_DATA variable was added as a door for implementing any type of check, e.g. you may have correct data that is too old. By setting it to off (False) you do not show that data, which speeds up working with only fresh news. The whole design is oriented around one principle: the shortest time to get data in front of you to browse. It's not a project for leisure, but for work. I use it and it gets the job done. With all its quirks, it can pay for itself if you understand how to use it. For me it reduces some work from about 6-8 hours to 10-25 minutes.

When I see questions and advice from outside, I see where someone has a different vision and way of using it. For example, it is simpler to get the data and work with it outside Awasu than to add custom Search Agents based on the current data every time; in the longer term that is counterintuitive and unproductive, as the results lag too far behind in "freshness". Reading Awasu's files is faster than making hundreds of API calls, e.g. ~5 000 async API calls with:

Code: Select all

$/channels/get
to get more detailed information about each channel. I moved away from this early on, as I thought it would crash Awasu right away.

Some questions add things to the design. I try to use all your suggestions to guide my next design steps.

This is old code. I started working on it after a few hours of a Python video course. I needed all of it to work out design questions, and some branches are beginner-level and unfinished.

support
Site Admin
Posts: 3064
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Re: Any Python programmer want collaboration on Awasu?

Post by support » Sun Jan 05, 2020 7:56 am

No worries. Not criticizing, just making suggestions. This is pretty impressive for a first program, but if you want other people to work on it, you can't have too many of these special things that need to be set up for it to work. I had to do a bunch of debugging to figure out what needed to be done, but most people won't bother. If it doesn't work out of the box, they'll just walk away and do something else.
awasu.user wrote:
Sat Jan 04, 2020 9:43 pm
1. Ctrl+N
2. Generated from search results
Ah, this is a search channel, not a search agent :-) But I still get a crash.
awasu.user wrote:
Sat Jan 04, 2020 9:43 pm
Reading Awasu's files is faster than making hundreds of API calls, e.g. ~5 000 async API calls with:

Code: Select all

$/channels/get
to get more detailed information about each channel.
Why are you calling $/channels/get? This returns the HTML page for a channel. Did you mean $/channels/list? This provides information about each channel, but you can provide a list of channel IDs, or "*" to mean all channels (which will only be a single API call).

awasu.user
Posts: 105
Joined: Fri Jan 06, 2017 12:50 pm

Re: Any Python programmer want collaboration on Awasu?

Post by awasu.user » Sun Jan 05, 2020 10:26 pm

support wrote:
Sun Jan 05, 2020 7:56 am
No worries. Not criticizing, just making suggestions
I appreciate it very much. I coded it for myself, and at this stage some things need to be done to add more flexibility. I'm only wondering whether at this stage it is better to recode the whole project from scratch or to update the existing structure. What do you think?
support wrote:
Sun Jan 05, 2020 7:56 am
Ah, this is a search channel, not a search agent But I still get a crash.
At first they were too similar for me to tell the difference... Could you explain the crash in more detail?
support wrote:
Sun Jan 05, 2020 7:56 am
Why are you calling $/channels/get? This returns the HTML page for a channel. Did you mean $/channels/list? This provides information about each channel, but you can provide a list of channel IDs, or "*" to mean all channels (which will only be a single API call).
That is what I started with at the beginning, but I changed my mind. Now I'm thinking about creating an API class to make calling the API a better experience.
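As a rough idea of what such a class might look like, here is a sketch built from the pieces already posted in this thread (base URL, token, format=json, the userInfo call). The class name `AwasuApi`, its methods and the injectable `fetch` argument are all assumptions for illustration; it handles GET calls only and leaves out POST and error handling.

```python
import json
import urllib.parse
import urllib.request


class AwasuApi:
    """Thin wrapper around the Awasu HTTP API (illustrative sketch)."""

    def __init__(self, base_url="http://localhost:2604", token=None, fetch=None):
        self.base_url = base_url.rstrip("/")
        self.token = token
        # `fetch` takes a URL and returns the response body as text; it is
        # injectable so the class can be tested without Awasu running.
        self._fetch = fetch or self._http_get

    @staticmethod
    def _http_get(url):
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8")

    def build_url(self, api_name, **params):
        """Build the full URL for an API call, adding format and token."""
        params["format"] = "json"
        if self.token:
            params["token"] = self.token
        query = urllib.parse.urlencode(params)
        return "{}/{}?{}".format(self.base_url, api_name, query)

    def call(self, api_name, **params):
        """Call an API and return the decoded JSON response."""
        return json.loads(self._fetch(self.build_url(api_name, **params)))

    def user_info(self):
        """Return the user's info, e.g. the location of their config file."""
        return self.call("userInfo")["userInfo"]
```

Usage would be something like `api = AwasuApi(token="abc123")` followed by `api.user_info()["userConfigFile"]`, with the base URL and token coming from config.py.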
