
The real way to check the forum

Posted: Mon Aug 05, 2019 7:35 pm
by Nite Coder
So I decided that it wasn't the Linux way to open a web browser and go to the MX Linux forum to see the newest posts. I thought it would be much better in the terminal, so I wrote a Python script to get the results for me! :happy:

Code: Select all

#!/usr/bin/python3

from bs4 import BeautifulSoup
import requests
import subprocess
import sys
import time

def get_data():
    # Fetch the forum index and collect the text of every topic title link
    html = requests.get('https://forum.mxlinux.org')
    parser = BeautifulSoup(html.text, 'html.parser')
    top = []
    for a in parser.find_all('ul', { 'class' : 'topiclist' }):
        for b in a.find_all('li', { 'class' : 'row' }):
            for c in b.find_all('div', { 'class' : 'list-inner' }):
                for d in c.find_all('li'):
                    for anchors in d.find_all('a', { 'class' : 'topictitle' }):
                        top.append(anchors.text.strip())
    return top

def main(argv):
    if len(argv) > 1 and argv[1] == '-b':
        # Background mode: show a desktop notification every 255 seconds
        while True:
            dis = '\n'.join(get_data())
            # Pass the text as a single argument so quotes in titles cannot break the command
            subprocess.run(['notify-send', dis])
            time.sleep(255)
    else:
        for data in get_data():
            print(data)

try:
    main(sys.argv)
except KeyboardInterrupt:
    # Ctrl-C is the normal way to stop background mode
    pass
except Exception as err:
    print('Error thrown... Exiting...', err)
Does require Beautiful Soup.

Code: Select all

sudo apt install python3-bs4
Coming soon... Using lynx to post on the MX Forum.

Re: The real way to check the forum

Posted: Mon Aug 05, 2019 8:00 pm
by JayM
Don't forget Unanswered Topics.

Re: The real way to check the forum

Posted: Mon Aug 05, 2019 8:09 pm
by Adrian
This is nice! I would like to be able to get results with my account, especially the "unread posts" search: search.php?search_id=unreadposts
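
The search pages mentioned in this thread differ only in the search_id query parameter, so their URLs can be built instead of typed out. A minimal sketch using only the standard library; search_url is a hypothetical helper name, and only the unanswered and unreadposts ids actually appear in this thread:

```python
from urllib.parse import urlencode, urljoin

FORUM = 'https://forum.mxlinux.org/'

def search_url(search_id, base=FORUM):
    """Return the search.php URL for a given search_id (hypothetical helper)."""
    return urljoin(base, 'search.php') + '?' + urlencode({'search_id': search_id})

print(search_url('unanswered'))
print(search_url('unreadposts'))
```

Note that the unread posts page only makes sense for a logged-in user, so fetching it anonymously would not return your own unread list.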

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 10:43 am
by Nite Coder
So I made some changes, and now you can choose which page you want to scrape and which section you want to get. Doing it as your own user is having some issues at the moment: once I log in, I get the wrong page. But you can scrape unanswered topics, and I'm not giving up on the user login thing.

If you don't want to copy it from below, it is at https://github.com/TheNiteCoder/mx-forum-scraper and the file you want is request.py

Code: Select all

#!/usr/bin/python3

import requests
from bs4 import BeautifulSoup
import urllib.parse as urlparse
from urllib.parse import urlencode

forum = 'https://forum.mxlinux.org/'

def get_html(url='', password=None, username=None):
    if password == None or username == None:
        return requests.get(url).text
    session = requests.Session()
    headers = {'User-Agent' : 'Mozilla/5.0'}
    payload = {'username': username, 'password': password, 'redirect':'index.php' , 'login':'Login'}
    r = session.post(forum + "ucp.php?mode=login", headers=headers, data=payload)
    sidStart = r.text.find("sid")+4
    sid = r.text[sidStart:sidStart+32]
    parameters = {'mode': 'login', 'sid': sid}
    r = session.post(url, headers=headers, params=parameters)
    return r.text

def get_url_arg(url, arg):
    parsed = urlparse.urlparse(url)
    return urlparse.parse_qs(parsed.query)[arg][0]

def set_url_arg(url, arg, val):
    url_parts = list(urlparse.urlparse(url))
    query = dict(urlparse.parse_qsl(url_parts[4]))
    query.update({arg : str(val)})
    url_parts[4] = urlencode(query)
    return urlparse.urlunparse(url_parts)

def get_only_pagination_number_buttons(tag):
    if tag.a == None:
        return False
    if not tag.a.has_attr('class'):
        return False
    if not 'button' in tag.a['class']:
        return False
    if not tag.a.has_attr('role'):
        return False
    if not 'button' in tag.a['role']:
        return False
    if not tag.a.has_attr('href'):
        return False
    if tag.a['href'] == '#':
        return False
    if tag.has_attr('class'):
        return False
    return True

# TODO improve merger
class Merge:
    def __init__(self, htmls=[]):
        if len(htmls) < 1:
            self.merged = ''
            return None
        text = ''.join(html for html in htmls)
        soup = BeautifulSoup(text, 'html.parser')
        text2 = ''.join(str(tag) for tag in list(soup.children))
        main_html = list(soup.children)[0]
        main_html.string = text2
        for tag in soup.children:
            if tag is not main_html:
                tag.extract()
        self.merged = str(soup)

class Request:
    def __init__(self, url='', pages=1, username=None, password=None):
        self.html = ''
        self.url = url
        html = get_html(url=self.url, password=password, username=username)
        parser = BeautifulSoup(html, 'html.parser')
        isMultiPage = False
        for pagination in parser.find_all('div', {'class' : 'pagination'}):
            isMultiPage = True
            break
        if not isMultiPage:
            self.text = html
            return None
        max_start = 0
        for pagination in parser.find_all('div', {'class' : 'pagination'}):
            for ul in pagination.find_all('ul'):
                for li in ul.find_all(get_only_pagination_number_buttons):
                    max_start = max(max_start, int(li.find_all('a')[0].string))
                break
        typ = ''
        if url.find('viewtopic') != -1:
            typ = 'topic'
        else:
            typ = 'forum'
        page_count = 0
        start = 0
        htmls = []
        while page_count < pages:
            page_url = self.url
            page_url = set_url_arg(page_url, 'start', start)
            html = get_html(url=page_url, password=password, username=username)
            htmls.append(html)
            page_count+=1
            start+=int(20 if typ == 'forum' else 10)
        merge = Merge(htmls=htmls)
        self.text = merge.merged

def get_all_forumbg(tag):
    if not tag.has_attr('class'):
        return False
    if 'forumbg' in tag['class']:
        return True
    if 'forabg' in tag['class']:
        return True

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('url', help='Url for forum page')
parser.add_argument('--password', help='Password for your forum account')
parser.add_argument('--username', help='Username for your forum account')
parser.add_argument('--section', help='Name of section')
parser.add_argument('--amount', help='Amount of topics you want, default is 20')
args = parser.parse_args()

if args.amount == None:
    args.amount = 20

request = Request(url=args.url, pages=max(1, int(args.amount) // 20), username=args.username, password=args.password)
soup = BeautifulSoup(request.text, 'html.parser')

section_map = {}

current_section = ''

for forumbg in soup.find_all(get_all_forumbg):
    for topiclist in soup.find_all('ul', {'class' : 'topiclist'}):
        if 'forums' in topiclist['class']:
            for topictitle in topiclist.find_all('a', {'class' : 'topictitle'}):
                section_map[current_section].append(topictitle.contents[2].strip())
        elif 'topics' in topiclist['class']:
            for topictitle in topiclist.find_all('a', {'class' : 'topictitle'}):
                section_map[current_section].append(topictitle.string.strip())
        else:
            current_section = topiclist.find_all('div', {'class':'list-inner'})[0].string
            if current_section == None:
                continue
            if not current_section in section_map.keys():
                section_map[current_section] = []

count = 0

if args.section != None:
    if args.section in section_map.keys():
        for item in section_map[args.section]:
            if args.amount != None:
                if count < int(args.amount):
                    print(item)
                    count+=1
    else:
        print('Invalid section')
else:
    for section in section_map.keys():
        for item in section_map[section]:
            if args.amount != None:
                if count < int(args.amount):
                    print(item)
                    count+=1
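
The login step in get_html finds the session id by searching for the first "sid" in the response and taking the next 32 characters. A slightly more defensive sketch, assuming the id is 32 hexadecimal characters following "sid=" (an assumption about the forum's HTML, not something this thread confirms); extract_sid is a hypothetical helper and the sample markup is made up:

```python
import re

# Assumption: the session id is 32 lowercase hex characters after "sid="
SID_PATTERN = re.compile(r'sid=([0-9a-f]{32})')

def extract_sid(html):
    """Return the first 32-character hex session id in `html`, or None."""
    match = SID_PATTERN.search(html)
    return match.group(1) if match else None

sample = '<a href="./index.php?sid=0123456789abcdef0123456789abcdef">Board index</a>'
print(extract_sid(sample))
print(extract_sid('<p>no session here</p>'))
```

Returning None when nothing matches lets the caller notice a failed login instead of sending 32 arbitrary characters as the sid.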


Re: The real way to check the forum

Posted: Thu Aug 08, 2019 11:38 am
by Nite Coder
A complete version! You can now get results with your account! It supports returning a certain number of topics and choosing which section to take them from.

Code: Select all

#!/usr/bin/python3

import requests
from bs4 import BeautifulSoup
import urllib.parse as urlparse
from urllib.parse import urlencode

forum = 'https://forum.mxlinux.org/'

def get_html(url='', password=None, username=None):
    if password == None or username == None:
        return requests.get(url).text
    session = requests.Session()
    headers = {'User-Agent' : 'Mozilla/5.0'}
    payload = {'username': username, 'password': password, 'redirect':'index.php' , 'login':'Login'}
    r = session.post(forum + "ucp.php?mode=login", headers=headers, data=payload)
    sidStart = r.text.find("sid")+4
    sid = r.text[sidStart:sidStart+32]
    parameters = {'mode': 'login', 'sid': sid}
    res = session.get(url, headers=headers, params=parameters)
    return res.text

def get_url_arg(url, arg):
    parsed = urlparse.urlparse(url)
    return urlparse.parse_qs(parsed.query)[arg][0]

def set_url_arg(url, arg, val):
    url_parts = list(urlparse.urlparse(url))
    query = dict(urlparse.parse_qsl(url_parts[4]))
    query.update({arg : str(val)})
    url_parts[4] = urlencode(query)
    return urlparse.urlunparse(url_parts)

def get_only_pagination_number_buttons(tag):
    if tag.a == None:
        return False
    if not tag.a.has_attr('class'):
        return False
    if not 'button' in tag.a['class']:
        return False
    if not tag.a.has_attr('role'):
        return False
    if not 'button' in tag.a['role']:
        return False
    if not tag.a.has_attr('href'):
        return False
    if tag.a['href'] == '#':
        return False
    if tag.has_attr('class'):
        return False
    return True

# TODO improve merger
class Merge:
    def __init__(self, htmls=[]):
        if len(htmls) < 1:
            self.merged = ''
            return None
        text = ''.join(html for html in htmls)
        soup = BeautifulSoup(text, 'html.parser')
        text2 = ''.join(str(tag) for tag in list(soup.children))
        main_html = list(soup.children)[0]
        main_html.string = text2
        for tag in soup.children:
            if tag is not main_html:
                tag.extract()
        self.merged = str(soup)

class Request:
    def __init__(self, url='', pages=1, username=None, password=None):
        self.html = ''
        self.url = url
        html = get_html(url=self.url, password=password, username=username)
        parser = BeautifulSoup(html, 'html.parser')
        isMultiPage = False
        for pagination in parser.find_all('div', {'class' : 'pagination'}):
            isMultiPage = True
            break
        if not isMultiPage:
            self.text = html
            return None
        max_start = 0
        for pagination in parser.find_all('div', {'class' : 'pagination'}):
            for ul in pagination.find_all('ul'):
                for li in ul.find_all(get_only_pagination_number_buttons):
                    max_start = max(max_start, int(li.find_all('a')[0].string))
                break
        typ = ''
        if url.find('viewtopic') != -1:
            typ = 'topic'
        else:
            typ = 'forum'
        page_count = 0
        start = 0
        htmls = []
        while page_count < pages:
            page_url = self.url
            page_url = set_url_arg(page_url, 'start', start)
            html = get_html(url=page_url, password=password, username=username)
            htmls.append(html)
            page_count+=1
            start+=int(20 if typ == 'forum' else 10)
        merge = Merge(htmls=htmls)
        self.text = merge.merged

def get_all_forumbg(tag):
    if not tag.has_attr('class'):
        return False
    if 'forumbg' in tag['class']:
        return True
    if 'forabg' in tag['class']:
        return True

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('url', help='Url for forum page')
parser.add_argument('--password', help='Password for your forum account')
parser.add_argument('--username', help='Username for your forum account')
parser.add_argument('--section', help='Name of section')
parser.add_argument('--amount', help='Amount of topics you want, default is 20')
args = parser.parse_args()

if args.amount == None:
    args.amount = 20

pages = int(int(args.amount)/20)
if pages == 0:
    pages = 1

request = Request(url=args.url, pages=pages, username=args.username, password=args.password)
soup = BeautifulSoup(request.text, 'html.parser')

section_map = {}

current_section = ''

for forumbg in soup.find_all(get_all_forumbg):
    for topiclist in soup.find_all('ul', {'class' : 'topiclist'}):
        if 'forums' in topiclist['class']:
            for topictitle in topiclist.find_all('a', {'class' : 'topictitle'}):
                section_map[current_section].append(topictitle.contents[2].strip())
        elif 'topics' in topiclist['class']:
            for topictitle in topiclist.find_all('a', {'class' : 'topictitle'}):
                section_map[current_section].append(topictitle.string.strip())
        else:
            current_section = topiclist.find_all('div', {'class':'list-inner'})[0].string
            if current_section == None:
                continue
            if not current_section in section_map.keys():
                section_map[current_section] = []

count = 0

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

# Get rid of duplicates
for key in section_map.keys():
    section_map[key] = f7(section_map[key])


if args.section != None:
    if args.section in section_map.keys():
        for item in section_map[args.section]:
            if args.amount != None:
                if count < int(args.amount):
                    print(item)
                    count+=1
    else:
        print('Invalid section')
else:
    for section in section_map.keys():
        for item in section_map[section]:
            if args.amount != None:
                if count < int(args.amount):
                    print(item)
                    count+=1
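
For anyone following the script, three of its steps can be looked at in isolation: how many pages are requested for a given --amount, how the start offset advances per page (20 topics per forum page, 10 posts per topic page, as in the Request class), and how duplicates are dropped while keeping order (what f7 does). This is a standalone sketch, not part of the script; pages_for, page_offsets and dedupe are hypothetical names, and dict.fromkeys is swapped in for the set-based list comprehension:

```python
def pages_for(amount, per_page=20):
    """Number of pages to fetch for `amount` topics, never fewer than one.

    Mirrors the script's rounding down, so 45 topics still means 2 pages.
    """
    return max(1, int(amount) // per_page)

def page_offsets(pages, per_page=20):
    """Return the 'start' query values for the first `pages` pages."""
    return [page * per_page for page in range(pages)]

def dedupe(items):
    """Drop repeated items, keeping the first occurrence of each."""
    return list(dict.fromkeys(items))

print(pages_for(5))                        # 1
print(page_offsets(pages_for(45)))         # [0, 20]
print(page_offsets(2, per_page=10))        # [0, 10]
print(dedupe(['a', 'b', 'a', 'c', 'b']))   # ['a', 'b', 'c']
```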

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 11:45 am
by asqwerth
I'm always so impressed by people who can fiddle around and come up with things like this just for fun.

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 11:46 am
by Nite Coder
asqwerth wrote: Thu Aug 08, 2019 11:45 am I'm always so impressed by people who can fiddle around and come up with things like this just for fun.
Thank you!

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 12:03 pm
by richb
An interesting option for those who like using the CLI. I would never have thought of that approach.

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 12:06 pm
by asqwerth
I'm sure one could use the script in a conky....

But I haven't even tested the script, so I don't know how it looks. I'll check it out over the weekend.

Re: The real way to check the forum

Posted: Thu Aug 08, 2019 12:12 pm
by richb
One could also set up a desktop or panel launcher

Re: The real way to check the forum

Posted: Mon Sep 23, 2019 2:44 pm
by Nite Coder
richb wrote: Thu Aug 08, 2019 12:12 pm One could also set up a desktop or panel launcher
I thought that was such a good idea that I did it!
Here are some images of how it looks:

Desktop Launchers:
MX Forum Top:

Code: Select all

[Desktop Entry]
Categories=Internet;
Exec=xfce4-terminal -e "mxforum-query https://forum.mxlinux.org/index.php" --hold --title "Top MX Forum Topics"
Name=MX Forum Top
GenericName[en_US]=MX Forum Top
GenericName=MX Forum Top
Icon=/boot/grub/themes/mx_elegant/icons/mx.png
MimeType=
Comment=Show top topics
NoDisplay=false
StartupNotify=true
Terminal=false
TerminalOptions=
Type=Application
MX Forum Unanswered:

Code: Select all

[Desktop Entry]
Categories=Internet;
Exec=xfce4-terminal -e "mxforum-query https://forum.mxlinux.org/search.php?search_id=unanswered" --hold --title "Unanswered MX Forum Topics"
Name=MX Forum Unanswered
GenericName[en_US]=MX Forum Unanswered
GenericName=MX Forum Unanswered
Icon=/boot/grub/themes/mx_elegant/icons/mx.png
MimeType=
Comment=Show unanswered topics
NoDisplay=false
StartupNotify=true
Terminal=false
TerminalOptions=
Type=Application
MX Forum Unread: Notice it needs your username and password

Code: Select all

[Desktop Entry]
Categories=Internet;
Exec=xfce4-terminal -e "mxforum-query https://forum.mxlinux.org/search.php?search_id=unreadposts --username 'Your Username' --password 'Your Password'" --hold --title "Unread MX Forum Topics"
Name=MX Forum Unread
GenericName[en_US]=MX Forum Unread
GenericName=MX Forum Unread
Icon=/boot/grub/themes/mx_elegant/icons/mx.png
MimeType=
Comment=Show unread topics
NoDisplay=false
StartupNotify=true
Terminal=false
TerminalOptions=
Type=Application
https://i.imgur.com/fn99uNS.png
https://i.imgur.com/JttDrsi.png
asqwerth wrote: Thu Aug 08, 2019 12:06 pm I'm sure one could use the script in a conky....

But I haven't even tested the script, so I don't know how it looks. I'll check it out over the weekend.
I tried it. It doesn't look good, but it is functional.

https://i.imgur.com/e5z5Avc.png

Re: The real way to check the forum

Posted: Mon Sep 23, 2019 4:52 pm
by richb
Nice!

Re: The real way to check the forum

Posted: Mon Sep 23, 2019 5:00 pm
by SwampRabbit
That is really awesome.

I know this was a fun "pet project", but this would make for a really cool xfce panel plugin.

Re: The real way to check the forum

Posted: Mon Sep 23, 2019 5:19 pm
by richb
Another GUI method that will pop up the results in a browser tab, or open the browser and display the results:
1. Click on Quick links and click the post type you want: Unread, New, etc.
2. Drag the address from the browser address bar to the desktop. You will be asked to create a link.
3. It will appear on the desktop.
4. Click on the link. You will get a message that it is an untrusted link launcher. You trust it because you made it. Click Mark Executable.

You can now click on it to access the function you chose, for example Unread Posts.
You can further add it to the panel or an auxiliary panel by right-clicking on it and choosing Open With "Create a launcher on the panel".

Is it easier to access this way than with the Index page of the browser? Maybe not, but it shows the abilities of XFCE.

This was done in MX 19 Beta 2. It will probably work in MX 18 as well.

After all that, a simpler method: just create a launcher directly, either on the desktop (right click > Create a URL Link) or on the panel, by providing the appropriate URL in the launcher field.

Re: The real way to check the forum

Posted: Mon Sep 23, 2019 5:55 pm
by Adrian
Also, another idea (I saw you used that in your first versions of the script) if you use Xfce menu items maybe it would be best to pop up a notify-send message instead of terminal.

Re: The real way to check the forum

Posted: Fri Nov 01, 2019 6:15 pm
by MichaelPV
I've entered my username and password where it says "None" at the beginning of the script and ran the script by entering "python MXforumscript". I get a SyntaxError: invalid syntax. My password has a colon in it, and there was an arrow pointing to the colon in the error message. Do you know what I'm doing wrong?

Re: The real way to check the forum

Posted: Mon Nov 04, 2019 10:29 pm
by Nite Coder
Try putting single or double quotes around your password

Code: Select all

./mxforum-query --password 'password:with:many:colons' --username 'someusername' https://forum.mxlinux.org
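
A likely reading of the error: the password was typed into the script itself without quotes, and Python cannot parse a bare colon in that position. The sketch below only illustrates this; parses is a hypothetical helper and secret:word is a made-up password. The second half shows that the same kind of value passes through argparse unchanged when given as a command-line argument, which is how the script expects to receive it:

```python
import argparse

def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        return False

# Unquoted: the colon is not valid here, so Python reports a SyntaxError
print(parses("def get_html(url='', password=secret:word, username=None): pass"))

# Quoted: the password is an ordinary string and the line parses
print(parses("def get_html(url='', password='secret:word', username=None): pass"))

# The same arguments the script defines, parsed from a list instead of sys.argv
parser = argparse.ArgumentParser()
parser.add_argument('url')
parser.add_argument('--password')
parser.add_argument('--username')
args = parser.parse_args(['--password', 'password:with:many:colons',
                          '--username', 'someusername',
                          'https://forum.mxlinux.org'])
print(args.password)
```

Passing the values as arguments also means the script file never has to be edited.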

Re: The real way to check the forum

Posted: Mon Nov 04, 2019 11:25 pm
by BitJam
It would be handy to have an easy way to post to the forums from the command line. This would help people when X stops working or does not work. Ideally, quick system info could be easily added to the post.

Re: The real way to check the forum

Posted: Tue Nov 05, 2019 6:53 pm
by Nite Coder
Hmm. Cool idea! I'll look into it. Tricky thing is the testing. Got to actually try posting

Re: The real way to check the forum

Posted: Tue Nov 05, 2019 9:38 pm
by Nite Coder
Adrian wrote: Mon Sep 23, 2019 5:55 pm Also, another idea (I saw you used that in your first versions of the script) if you use Xfce menu items maybe it would be best to pop up a notify-send message instead of terminal.
I think you are right about that one. I don't know what inspired me to use the terminal, but here is an example desktop file:

Code: Select all

[Desktop Entry]
Categories=Internet;
Exec=bash -c 'notify-send "$(/path/to/mxforum-query https://forum.mxlinux.org)"'
GenericName[en_US]=Forum
GenericName=Forum
Icon=mxfcelogo-rounded
MimeType=
Name=MX Forum
Comment=MX Forum
NoDisplay=false
Path=
StartupNotify=true
Terminal=false
TerminalOptions=
Type=Application
X-DBUS-ServiceName=
X-DBUS-StartupType=
X-KDE-SubstituteUID=false
X-KDE-Username=

Re: The real way to check the forum

Posted: Wed Mar 25, 2020 9:02 pm
by Nite Coder
New forum script, requiring python3-bs4, python3-soupsieve and python3-urwid.
This script has an ncurses interface, and you can view and browse both subforums and topics.
Script Download: https://raw.githubusercontent.com/TheNi ... rum-tui.py

Re: The real way to check the forum

Posted: Wed Mar 25, 2020 10:44 pm
by JayM
I installed python3-bs4, python3-soupsieve and python3-urwid from the MX stable repo yet when I run the script I still get

Code: Select all

$ python ./mx-forum-tui.py
Traceback (most recent call last):
  File "./mx-forum-tui.py", line 11, in <module>
    import urwid
ImportError: No module named urwid
I did a Catfish search for urwid in my filesystem with no results found but MXPI shows that it's installed, so apparently the module is actually called something different than the package's name.

Re: The real way to check the forum

Posted: Wed Mar 25, 2020 10:45 pm
by JayM

Code: Select all

$ apt show python3-urwid
Package: python3-urwid
Version: 2.0.1-2+b1
Priority: optional
Section: python
Source: urwid (2.0.1-2)
Maintainer: Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>
Installed-Size: 938 kB
Provides: python3.6-urwid, python3.7-urwid
Depends: python3 (<< 3.8), python3 (>= 3.6~), python3:any (>= 3.3.2-2~), libc6 (>= 2.4)
Suggests: python-urwid-doc (>= 2.0.1-1)
Homepage: http://urwid.org/
Download-Size: 174 kB
APT-Manual-Installed: yes
APT-Sources: http://mirror.pregi.net/debian buster/main amd64 Packages
Description: curses-based UI/widget library for Python 3
 Urwid is a console user interface library that includes many features
 useful for text console application developers including:
 .
  * Fluid interface resizing (xterm window resizing/fbset on Linux console)
  * Web application display mode using Apache and CGI
  * Support for UTF-8, simple 8-bit and CJK encodings
  * Multiple text alignment and wrapping modes built-in
  * Ability to create user-defined text layout classes
  * Simple markup for setting text attributes
  * Powerful list box that handles scrolling between different widget types
  * List box contents may be managed with a user-defined class
  * Flexible edit box for editing many different types of text
  * Buttons, check boxes and radio boxes
  * Customizable layout for all widgets
  * Easy interface for creating HTML screen shots
 .
 This is the Python 3 version of the package.

Re: The real way to check the forum

Posted: Sun Mar 29, 2020 3:25 pm
by Nite Coder
That is probably because python refers to python2 on your system, while the package you installed, python3-urwid, only provides the module for Python 3, so python2 can't find it. Just run the script with python3.
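
One way to check this is to ask the interpreter directly whether it can see the modules. A minimal sketch for Python 3 using importlib.util.find_spec from the standard library; has_module is a hypothetical helper name. Run it with python3 and it prints the interpreter version and whether each module is importable from it:

```python
import importlib.util
import sys

def has_module(name):
    """Return True if the running interpreter can import top-level module `name`."""
    return importlib.util.find_spec(name) is not None

# Show which interpreter is answering, then check the script's dependencies
print(sys.version.split()[0])
for name in ('bs4', 'soupsieve', 'urwid'):
    print(name, 'found' if has_module(name) else 'missing')
```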

Re: The real way to check the forum

Posted: Sun Mar 29, 2020 10:29 pm
by JayM

Code: Select all

python3 ./mx-forum-tui.py
That worked. Thanks. :)

Viewing the results makes me wonder if I can find FTP sites for Linux with Archie then use the cli ftp client to get 'em. Hmm, I'll have to install lynx so I can search the web and find out, unless there's some info in gopher.

Re: The real way to check the forum

Posted: Wed Jun 03, 2020 12:48 pm
by confetti
I love dipping into the Forum for ideas and feedback, but I must admit that I am often overwhelmed by it, and am not making the most efficient use of it.

I would like to suggest that one of the "MXperts" make a video that breaks it down into easy parts for people like me to understand. It could show how to search, post, or even make a terminal reader as is being discussed here.

I think one of the secrets to the success of MX Linux has been the brilliance of all the videos that so ably walk us through the basics and point us to greater creativity!