
Reader


Re: Reader

Post by KreAch3R »

Kaelri wrote: Yes, this should be doable once I'm able to parse the dates. All of the items from each feed are stored in one big table, so it's very easy to manipulate them however you want.
That's what I thought. I'm looking forward to that part of the code. :) I imagine it would need some clever table manipulations.
Kaelri wrote: Yeah, that's a challenge, to say the least. One of the flaws with this approach is that DecodeCharacterReference is applied before the feed is passed to Lua, so I can't tell whether an HTML tag - like <item> or <entry> - is part of the content or the actual markup. That makes it much more difficult to reliably detect the feed format.
I see. It's kind of a chicken-and-egg loop: you need to parse the content first, to find whether someone has written "<item>" in a feed title and substitute it out, before you can correctly identify the feed type, but to do that parsing you need to have identified the type first.

I guess you could create your own DecodeCharacter function, but it would either be huge or not universal, and a universal one is what you're aiming for.
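For instance, a home-made decoder might start out like the minimal sketch below (the entity table here is deliberately tiny; a truly universal one would need the full HTML named-entity list, which is exactly why it gets huge):

Code: Select all

-- Hypothetical home-made decoder; only a handful of entities are covered.
local entities = { lt = '<', gt = '>', amp = '&', quot = '"', apos = "'" }

local function DecodeCharacter(s)
    -- Named entities: the capture indexes the table above, and gsub
    -- keeps the original match when the lookup returns nil.
    s = s:gsub('&(%a+);', entities)
    -- Numeric entities such as &#233; (only codepoints below 256 map to ANSI).
    s = s:gsub('&#(%d+);', function(n)
        n = tonumber(n)
        return n < 256 and string.char(n) or '?'
    end)
    return s
end

print(DecodeCharacter('Fish &amp; Chips, caf&#233;'))  -- -> Fish & Chips, café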

Re: Reader

Post by thatsIch »

afaik the real body of a feed is always the literal <item> tag and never the escaped &lt;item&gt;. The only spot where the escaped form can appear is the content area, where people are able to put in their own content to ensure all the information is displayed.

Re: Reader

Post by KreAch3R »

thatsIch wrote: afaik the real body of a feed is always the literal <item> tag and never the escaped &lt;item&gt;. The only spot where the escaped form can appear is the content area, where people are able to put in their own content to ensure all the information is displayed.
That is true, and that is exactly what Kaelri was talking about. Using DecodeCharacterReference on the WebParser measure makes these two indistinguishable: by the time the WebParser content reaches Lua, all HTML entities have already been replaced.
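A quick sketch of the problem, with hypothetical strings and the decoding reduced to a couple of plain gsub calls standing in for WebParser:

Code: Select all

-- Before decoding, the title merely *talks about* an <item> tag.
local raw = '<title>How to write an &lt;item&gt; element</title><item>...</item>'

-- Roughly what DecodeCharacterReference has done by the time Lua sees the text.
local decoded = raw:gsub('&lt;', '<'):gsub('&gt;', '>'):gsub('&amp;', '&')

print(decoded)
-- -> <title>How to write an <item> element</title><item>...</item>
-- The escaped <item> in the title and the real <item> tag now look identical.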

Re: Reader

Post by thatsIch »

KreAch3R wrote: That is true, and that is exactly what Kaelri was talking about. Using DecodeCharacterReference on the WebParser measure makes these two indistinguishable: by the time the WebParser content reaches Lua, all HTML entities have already been replaced.
So just don't use it?
If you want to make a universal feed reader, there is no other way than to implement your own function, if only for one fact: UTF-8 to ANSI conversion. Lua handles such work very effectively.
Every loop is split into simultaneous threads to be calculated.
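For what it's worth, a minimal sketch of such a converter in plain Lua (it assumes well-formed UTF-8 and turns anything outside the ANSI range into '?'):

Code: Select all

local function utf8_to_ansi(s)
    local out, i = {}, 1
    while i <= #s do
        local b, cp = s:byte(i)
        if b < 0x80 then                       -- plain ASCII byte
            cp, i = b, i + 1
        elseif b < 0xE0 then                   -- 2-byte sequence
            cp = (b % 0x20) * 0x40 + (s:byte(i + 1) % 0x40)
            i = i + 2
        elseif b < 0xF0 then                   -- 3-byte sequence
            cp = (b % 0x10) * 0x1000
               + (s:byte(i + 1) % 0x40) * 0x40
               + (s:byte(i + 2) % 0x40)
            i = i + 3
        else                                   -- 4-byte sequence: beyond ANSI anyway
            cp, i = 0x10000, i + 4
        end
        out[#out + 1] = cp < 256 and string.char(cp) or '?'
    end
    return table.concat(out)
end

print(utf8_to_ansi('caf\195\169'))             -- -> café (0xE9 in ANSI/Latin-1)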

Hm, on second thought, now I know what you mean (where the problem lies). You mean something like:

Code: Select all

<item>
<item> <!-- the inner pair is user content that merely looks like markup -->
...
</item>
</item>
?

I'm not so sure programmers need to safeguard against everything stupid a user does.
That said, you can distinguish the two by the feed standard, which says to use a CDATA tag to tell the feed reader that raw HTML follows, like:

Code: Select all

<item>
...
<content>
<![CDATA[
<item>
...
</item>
]]>
</content>
</item>
imho you can catch this case, but otherwise drop it like it's hot.
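Catching it could be as simple as blanking out every CDATA section before scanning for tags (a sketch, assuming the feed text sits in a string s):

Code: Select all

-- Strip CDATA sections so an <item> inside user content can't be
-- mistaken for real markup. %[ and %] escape the square brackets.
s = string.gsub(s, '<!%[CDATA%[.-%]%]>', '')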

Re: Reader

Post by Kaelri »

KreAch3R wrote:That's what I thought. I'm looking forward to that part of the code. :) I imagine it would need some clever table manipulations.
You'd be surprised. Like I mentioned before, it's just tables within tables. The main "Feeds" table contains a table for each feed, and each feed contains a table for each item. It's extremely easy to navigate. For a random example, if I wanted to get the title of the 5th item in the 2nd feed, I just use Feeds[2][5]['Title']. So to dump all items from all feeds into one table is as simple as:

Code: Select all

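-- Flatten the nested structure: walk every feed, then every item in
-- that feed, and append each item to one combined list.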
AllFeedItems = {}

for _, a in ipairs(Feeds) do
    for __, b in ipairs(a) do
        table.insert(AllFeedItems, b)
    end
end
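(For reference, the nested layout might look something like the hypothetical table below; only the 'Title' and 'Date' field names are taken from the examples above.)

Code: Select all

Feeds = {
    { Title = 'Feed One',                        -- feed metadata in the hash part
      { Title = 'Item 1-1', Date = 1364545320 }, -- items in the array part
      { Title = 'Item 1-2', Date = 1364458920 },
    },
    { Title = 'Feed Two',
      { Title = 'Item 2-1', Date = 1364372520 },
    },
}

print(Feeds[2][1]['Title'])                      -- -> Item 2-1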
Then, once I have the dates parsed into UNIX timestamps, sorting the table will be as easy as:

Code: Select all

table.sort(AllFeedItems, function(a,b) return a['Date'] > b['Date'] end)
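(The parsing step assumed there could look like the sketch below for RFC 822 dates, the format RSS feeds use. ParseRFC822 is a hypothetical name, and the time-zone suffix is ignored, which is close enough for relative sorting.)

Code: Select all

local months = { Jan = 1, Feb = 2, Mar = 3, Apr = 4,  May = 5,  Jun = 6,
                 Jul = 7, Aug = 8, Sep = 9, Oct = 10, Nov = 11, Dec = 12 }

-- 'Fri, 29 Mar 2024 10:22:00 GMT' -> Unix timestamp (time zone ignored).
local function ParseRFC822(s)
    local d, mon, y, h, mi, sec = s:match('(%d+) (%a+) (%d+) (%d+):(%d+):(%d+)')
    if not d then return nil end
    return os.time{ year = tonumber(y), month = months[mon], day = tonumber(d),
                    hour = tonumber(h), min = tonumber(mi), sec = tonumber(sec) }
end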
thatsIch wrote:I'm not so sure if programmers need to safeguard everything stupid a user does.
Unfortunately, if the script only supported correctly-formatted feeds, it wouldn't be very useful in a practical context. :)

The method I'm using now is to collapse all of the <item> and <entry> tags. In other words, I find the first opening <item> tag, then the last closing </item> tag, and remove everything in between. Then I do the same for the first <entry> and the last </entry>. This lets me see only what is "outside" of the content containers.

Code: Select all

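-- The pattern's greedy .+ spans from the first opening tag to the last
-- closing tag, so everything between them collapses into an empty
-- <item></item> or <entry></entry> pair.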
for _, v in ipairs{ 'item', 'entry' } do
    s = string.gsub(s, '<'..v..'.->.+</'..v..'>', '<'..v..'></'..v..'>')
end

Re: Reader

Post by Mordasius »

Kaelri wrote:Then, once I have the dates parsed into UNIX timestamps, sorting the table will be as easy as:

Code: Select all

table.sort(AllFeedItems, function(a,b) return a['Date'] > b['Date'] end)
... which will be all fine and dandy until January 19, 2038, whereupon anyone still using a 32-bit system will need to find a small bucket to catch the integer overflow. :lol:

Re: Reader

Post by Kaelri »

Hey, if people are still using Rainmeter in 2038, I'll pay that price. :)

Re: Reader

Post by KreAch3R »

Kaelri wrote: You'd be surprised. Like I mentioned before, it's just tables within tables. The main "Feeds" table contains a table for each feed, and each feed contains a table for each item. It's extremely easy to navigate. For a random example, if I wanted to get the title of the 5th item in the 2nd feed, I just use Feeds[2][5]['Title']. So to dump all items from all feeds into one table is as simple as:
..
Then, once I have the dates parsed into UNIX timestamps, sorting the table will be as easy as:

If I look past the fact that the "table inception" ( :p ) idea is a very simple but brilliant one, I wasn't aware of the table.sort function at all. It does all the hard work; I had thought you would have to find the max value of the Dates table, remove it, re-find the max value, and so on. Yeah, I always have a tendency toward complicated, unneeded stuff. Thanks for another Lua lesson. :)
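(For comparison, the long way I had in mind is essentially a selection sort, something like:)

Code: Select all

-- Repeatedly extract the newest remaining item; this is exactly the
-- busywork that the table.sort one-liner avoids.
local sorted = {}
while #AllFeedItems > 0 do
    local newest = 1
    for i = 2, #AllFeedItems do
        if AllFeedItems[i]['Date'] > AllFeedItems[newest]['Date'] then
            newest = i
        end
    end
    table.insert(sorted, table.remove(AllFeedItems, newest))
end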

Re: Reader

Post by MerlinTheRed »

There are a lot of sorting algorithms, and the fastest ones aren't the easiest to understand or implement, so it's nice to have a utility function like that.

Re: Reader

Post by Kaelri »

Getting close now. :)
[Attached screenshot: Screenshot30.png]
The script is now getting accurate UTC (Unix epoch) timestamps for all formats. I hit it with a battery of 25 feeds, including some with missing dates, Google Calendars with all-day events, and Remember the Milk tasks with all manner of due dates, and it crunched all of them without issue. This means that, as of now, you can use the script to format all dates to your liking, and even distinguish "dates" from "times".
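For example, with a Unix timestamp in hand, os.date does the formatting; the second line's output depends on the local time zone, so the values shown are only illustrative.

Code: Select all

local stamp = 1364545320                       -- a parsed pubDate
print(os.date('%B %d, %Y', stamp))             -- -> March 29, 2013
print(os.date('%I:%M %p', stamp))              -- -> e.g. 04:22 AM (local time)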

Sorting and merging should be fairly painless to add now, along with a few other bells and whistles I have in mind.

For anyone interested, the current version of the script is here.