martymarty wrote: Does that mean I can ignore them?
Most of the time, you can strip these off and ignore them, because most of the time, it makes no difference. The problem is, the few times when it does make a difference, it's impossible to figure out the right thing to do without this information.
Certain characters are important and have a special meaning in HTML, e.g. &. It's used to write other special characters, e.g. &lt; if you want a < to appear. And if you want to write an &, you have to write &amp;.
The problem is, when somebody writes something like "Ben & Jerry's", did they actually want an &, or was it part of a special HTML sequence? In this particular example, it's easy to tell that the author wanted an &, but say someone is writing an article about HTML (like this one), and they write "&amp;" - did they want an & to appear, or did they want those 5 characters to appear verbatim (e.g. as in "if you want an ampersand to appear, you must write &amp;")? There's no way to tell.
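To make the escaping and the ambiguity concrete, here's a small sketch using Python's standard html module. The sample strings are just illustrations, and this isn't how Awasu itself does it:

```python
import html

# Escaping: turn literal text into HTML source.
# quote=False leaves quote characters alone, so only &, < and > are changed.
escaped_name = html.escape("Ben & Jerry's", quote=False)
escaped_math = html.escape("a < b", quote=False)
print(escaped_name)  # Ben &amp; Jerry's
print(escaped_math)  # a &lt; b

# The ambiguity: the same characters can be read two ways.
raw = "you must write &amp;"
read_as_html = html.unescape(raw)  # the 5 characters collapse into one &
read_as_text = raw                 # the 5 characters are kept verbatim
print(read_as_html)  # you must write &
print(read_as_text)  # you must write &amp;
```

Both readings are legitimate, and nothing in the string itself says which one the author meant.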
Atom (the successor to RSS) handles this by requiring that all content be declared as HTML or plain-text. If something is HTML, an & is assumed to be the start of a special HTML sequence (as in &amp;); if it's plain-text, ampersands are interpreted as being ampersands. RSS doesn't have this concept. It just has "text", so every time Awasu sees an ampersand, it has to guess what the author's intent is (and sometimes it guesses wrong - you will occasionally see ampersands go missing in the content in Awasu).
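Here's a rough sketch of why the declared type removes the guesswork. The displayed_text helper is hypothetical (not Awasu's code or any library's API), it assumes the feed's XML has already been parsed, and it only decodes entities, it doesn't deal with HTML tags:

```python
import html


def displayed_text(content: str, content_type: str) -> str:
    """Return the characters a reader should see (hypothetical helper).

    content_type is the type the feed declared for this content:
    "html" or "text".
    """
    if content_type == "html":
        # Declared as HTML: sequences like &amp; are HTML escapes.
        return html.unescape(content)
    if content_type == "text":
        # Declared as plain text: every character is literal.
        return content
    raise ValueError(f"unsupported content type: {content_type!r}")


print(displayed_text("Ben &amp; Jerry's", "html"))     # Ben & Jerry's
print(displayed_text("you must write &amp;", "text"))  # you must write &amp;
```

With RSS there's no content_type to pass in, which is exactly where the guessing comes from.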
The TL;DR is you should really consider whether content is HTML or plain-text, but most of the time, you can get away with not worrying about it.
martymarty wrote: What did you mean by battleground?
RSS as we know it was largely shaped by a guy called Dave Winer, who was, shall we say, a little lenient when it came to specifying exactly how RSS should work, thus giving rise to problems like the one I described above. He is also a little combative, and tended to rub people the wrong way, which didn't help. So a bunch of guys got together to devise a successor to RSS, called Atom. It has been accused of being complex and over-engineered, but by and large, it works well and fixes the problems RSS had.
If you're interested, you can find out more here. This is all ancient history, BTW...