Python U In Front Of String

I have a script that reads in a webpage and uses Beautiful Soup to parse it. From the soup I extract all the links, as my final goal is to print out the link.contents.

All of the text that I am parsing is ASCII. I know that Python treats strings as unicode, and I am sure this is very handy, just of no use in my wee script.

Every time I go to print out a variable that holds 'String' I get [u'String'] printed to the screen. Is there a simple way of getting this back into just ASCII, or should I write a regex to strip it?


python unicode ascii
edited Apr 14 '16 at 11:21 by Freek de Bruijn
asked Mar 1 '09 at 10:48 by gnuchu

10 Answers

[u'String'] would be a one-element list of unicode strings. Beautiful Soup always produces Unicode. So you need to convert the list to a single unicode string, and then convert that to ASCII.

I don't know exactly how you got the one-element lists; the contents member would be a list of strings and tags, which is apparently not what you have. Assuming that you really always get a list with a single element, and that your text is really only ASCII, you would use this:

    soup[0].encode("ascii")

However, please double-check that your data is really ASCII. This is pretty rare. Much more likely it's latin-1 or utf-8.

    soup[0].encode("latin-1")
    soup[0].encode("utf-8")

Or you ask Beautiful Soup what the original encoding was and get it back in this encoding:

    soup[0].encode(soup.originalEncoding)
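
The advice to double-check for ASCII can be sketched as follows. This is a minimal illustration written for Python 3 (where `str` is already Unicode and `encode` returns `bytes`), not the code from the question: `samples` and the helper `is_ascii` are hypothetical names introduced here, and no Beautiful Soup call is involved.

```python
# Hypothetical stand-ins for text extracted from a page.
samples = ["String", "caf\u00e9"]  # the second one contains U+00E9


def is_ascii(text):
    """Return True if every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


for text in samples:
    if is_ascii(text):
        # Safe: ASCII, latin-1 and UTF-8 all give the same bytes here.
        print(text.encode("ascii"))
    else:
        # The two encodings disagree on how to store the accented letter.
        print(text.encode("latin-1"), text.encode("utf-8"))
```

For pure ASCII text the choice of codec does not matter; it only starts to matter once a character outside the ASCII range shows up.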
edited Mar 1 '09 at 11:40
answered Mar 1 '09 at 11:22 by oefe

You probably have a list containing one unicode string. The repr of this is [u'String'].

You can convert this to a list of byte strings using any variation of the following (Python 2 syntax):

    # Functional style.
    print map(lambda x: x.encode("ascii"), my_list)

    # List comprehension.
    print [x.encode("ascii") for x in my_list]

    # Interesting if my_list may be a tuple or a string.
    print type(my_list)(x.encode("ascii") for x in my_list)

    # What do I care about the brackets anyway?
    print ", ".join(repr(x.encode("ascii")) for x in my_list)

    # That's actually not a good way of doing it.
    print " ".join(repr(x).lstrip("u")[1:-1] for x in my_list)
answered Mar 1 '09 at 11:40 by ddaa

    import json, ast
    r = {u"name": u"A", u"primary_key": 1}
    ast.literal_eval(json.dumps(r))

will print

    {'name': 'A', 'primary_key': 1}
edited Oct 17 '16 at 16:26 by Anko
answered Oct 17 '16 at 15:30 by osmjit

If accessing/printing single element lists (e.g., sequentially or filtered):

    my_list = [u'String']  # sample element
    my_list = [str(my_list[0])]
answered Feb 9 '13 at 6:21 by gevang

Pass the output to the str() function and it will remove the unicode u'' prefix. Printing the output will also remove the u'' tags from it.

edited May 14 at 20:20
answered Apr 28 '13 at 11:14 by waweru

[u'String'] is a text representation of a list that contains a Unicode string on Python 2.

If you run print(some_list) then it is equivalent to print "[%s]" % ", ".join(map(repr, some_list)), i.e., to create a text representation of a Python object with the type list, the repr() function is called for each item.

Don't confuse a Python object and its text representation: repr("a") != "a", and even the text representation of the text representation differs: repr(repr("a")) != repr("a").

repr(obj) returns a string that contains a printable representation of an object. Its purpose is to be an unambiguous representation of an object that can be useful for debugging, in a REPL. Often eval(repr(obj)) == obj.

To avoid calling repr(), you could print list items directly (if they are all Unicode strings), e.g.: print ",".join(some_list). It prints a comma-separated list of the strings: String
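
The points above can be checked with a short sketch. It is written for Python 3, where the `u` prefix no longer appears in the repr of a string; `some_list` holds sample values chosen purely for illustration.

```python
some_list = ["String", "Other"]  # sample data, chosen for illustration

# str() of a list calls repr() on every item, which is why quotes show up.
manual = "[%s]" % ", ".join(map(repr, some_list))
print(manual == str(some_list))

# An object and its text representation are different things.
print(repr("a") != "a")
print(repr(repr("a")) != repr("a"))

# For simple values, evaluating the repr gives back an equal object.
print(eval(repr(some_list)) == some_list)

# Joining the items directly avoids repr() and its quotes.
joined = ",".join(some_list)
print(joined)
```

Every comparison here prints True, and the last line prints the bare strings separated by a comma, with no brackets or quotes.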

Do not encode a Unicode string to bytes using a hardcoded character encoding; print Unicode directly instead. Otherwise, the code may fail because the encoding can't represent all the characters, e.g., if you try to use "ascii" encoding with non-ascii characters. Or the code silently produces mojibake (corrupted data is passed further in a pipeline) if the environment uses an encoding that is incompatible with the hardcoded encoding.
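
A hedged illustration of both failure modes, written for Python 3. The sample text and the mismatched codec pair (UTF-8 bytes read back as latin-1) are assumptions picked to make the problem visible; they do not come from the question.

```python
text = "caf\u00e9"  # sample text with one non-ASCII character (U+00E9)

# Failure mode 1: the hardcoded encoding cannot represent the text.
try:
    text.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Failure mode 2: bytes written as UTF-8 but interpreted as latin-1
# decode without any error and yield corrupted text (mojibake).
mojibake = text.encode("utf-8").decode("latin-1")

print(failed)
print(mojibake != text, len(text), len(mojibake))
```

The second case is the more dangerous one: nothing raises, yet the four-character string has become five characters of corrupted text.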