html - Python 3.4 - reading data from a webpage -

- January 15, 2010

I'm trying to learn how to read a webpage, and have tried the following:

>>> import urllib.request
>>> page = urllib.request.urlopen("http://docs.python-requests.org/en/latest/", data=None)
>>> contents = page.read()
>>> lines = contents.split('\n')

This gives the following error:

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    lines = contents.split('\n')
TypeError: Type str doesn't support the buffer API

Now, I assumed that reading a URL would be pretty similar to reading a text file, and that the contents of contents would be of type str. Is this not the case?

When I try >>> contents I can see that the contents of contents is just an HTML document, so why doesn't `.split('\n')` work? How can I make it work?

Please note that I'm splitting at the newline characters so that I can print the webpage line by line.

Following the same train of thought, I then tried contents.readlines(), which gave this error:

Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    contents.readlines()
AttributeError: 'bytes' object has no attribute 'readlines'

Is the webpage stored in some object called 'bytes'?

Can someone explain to me what is happening here? And how can I read the webpage properly?

You need to wrap the response in an io.TextIOWrapper() object and decode it (utf-8 is a common choice, but you can change it to the page's proper encoding too):

    import urllib.request
    import io

    u = urllib.request.urlopen("http://docs.python-requests.org/en/latest/", data=None)
    f = io.TextIOWrapper(u, encoding='utf-8')
    text = f.read()
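To spell out what is going on: in Python 3, page.read() returns a bytes object rather than a str, which is why splitting it on the str '\n' fails and why it has no readlines() method. A minimal sketch of the alternatives follows. The helper names decode_lines and read_page_lines are made up for illustration, and utf-8 is only an assumed fallback for when the server does not declare a charset.

```python
import io
import urllib.request


def decode_lines(raw, encoding="utf-8"):
    """Decode the bytes returned by page.read() and split them into lines."""
    text = raw.decode(encoding)   # bytes -> str
    return text.splitlines()      # handles both \n and \r\n line endings


def read_page_lines(url, fallback_encoding="utf-8"):
    """Fetch url and return its text as a list of lines (hypothetical helper)."""
    with urllib.request.urlopen(url) as page:
        # Prefer the charset from the Content-Type header when one is sent.
        encoding = page.headers.get_content_charset() or fallback_encoding
        return decode_lines(page.read(), encoding)


# Offline demonstration: a bytes literal stands in for page.read().
sample = b"<html>\n<body>hello</body>\n</html>\n"

# Splitting bytes works as long as the separator is also bytes.
print(sample.split(b"\n")[0])

# Decoding first gives ordinary str lines.
for line in decode_lines(sample):
    print(line)

# The TextIOWrapper approach also supports lazy line-by-line iteration;
# io.BytesIO stands in for the HTTP response object here.
for line in io.TextIOWrapper(io.BytesIO(sample), encoding="utf-8"):
    print(line.rstrip("\n"))
```

Calling read_page_lines("http://docs.python-requests.org/en/latest/") would apply the same steps to the live page. Decoding the whole body at once is simplest for small pages, while wrapping the response in io.TextIOWrapper avoids holding both the bytes and the decoded text in memory.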
