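In Python 3, source files are read as UTF-8 by default. On Windows, a file saved in the legacy Windows code page can fail to decode with an error such as: 'utf-8' codec can't decode byte 0x92. Byte 0x92 is the curly apostrophe (’) in Windows-1252, which is not valid UTF-8 on its own. To fix it, declare the encoding in a header on the first or second line of the file, using either cp1252 or windows-1252: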
# coding=CP1252
import re
text = """
LADY CAPULET
Alack the day. She’s dead, she’s dead, she’s dead!
CAPULET
Ha! Let me see her. Out, alas! She’s cold.
Her blood is settled, and her joints are stiff.
Life and these lips have long been separated.
Death lies on her like an untimely frost
Upon the sweetest flower of all the field.
Nurse
O lamentable day!
"""