So in short my case is this:
- Read data from RSS feed
- Print content to the terminal
And of course the content isn't plain ASCII, it's UTF-8, so I get characters like "öäå". But when I print the text, it's all mangled with escapes like '\xe4'. It's something to do with the encoding, but I just can't get my head around it. This should be trivial to do, yet my Google-fu is letting me down.
One example is when I'm going through the content word by word and trying to find the character "ö": I do:
if u"ö" in word:
Which just gives: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6...
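For reference, that error usually means `word` is a byte string, so the comparison against u"ö" triggers an implicit ASCII decode that fails on the first UTF-8 byte (0xc3). A minimal sketch of decoding explicitly first, assuming the bytes really are UTF-8; the sample value of `word` is made up for illustration:

```python
# -*- coding: utf-8 -*-
# Assumption: `word` arrives as UTF-8 encoded bytes, as the 0xc3 in the
# error message suggests. The sample value below is hypothetical.
word = u"Göteborg".encode("utf-8")

# Decode to a unicode string explicitly instead of relying on the
# implicit ASCII decode that raises UnicodeDecodeError.
text = word.decode("utf-8")

# Both sides are now unicode, so the membership test is safe.
has_o_umlaut = u"ö" in text
```

The same idea applies anywhere bytes enter the program: decode once at the boundary and work with unicode strings from then on.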
Edit:
So I think I found my problem. I was getting the feed items, then just doing str(entry.content) and passing that onwards. But entry.content was a list holding a dictionary with unicode strings as values, so what I did (I guess) was get an ASCII representation of the whole list and its dictionary rather than the text itself...
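A rough sketch of the difference, using a hand-built stand-in for entry.content. The "value" and "type" keys are assumptions about how the feed library lays out the dictionary, so check what your entries actually contain:

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for entry.content: a list holding one dict
# whose values are unicode strings. Key names are assumed.
content = [{u"type": u"text/plain", u"value": u"öäå"}]

# str() on the list gives the repr of the whole structure, brackets and
# quotes included; on Python 2 the non-ASCII characters show up as escapes.
as_repr = str(content)

# Pulling the string out of the structure keeps the real text.
value = content[0][u"value"]
```

Passing `value` onwards (and printing that) keeps the actual characters, whereas `as_repr` is only a debugging representation of the container.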