Python - Extract info from a string
The following code works, but I am not able to extract the information I need. Can I use BeautifulSoup for this, or do I need a regular expression?

    from bs4 import BeautifulSoup
    import urllib2
    import re

    mynumber = '1234567890'
    url = "http://www.nccptrai.gov.in/nccpregistry/savesearchsub.misc?phoneno=" + mynumber
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())

    # The second table on the page holds the search result
    table = soup.findAll("table")[1]
    myl = [item.text.strip() for item in table.find_all('td')]

    re.findall(r'is:\s*[^,]*', myl[1])

The expected output is the 4 parameters mentioned in the first string of the first slice:

    ['2014-08-07 15:50:00', 'andhra pradesh', 'unitech', '0']

(Note that the date is changed to Y-m-d.)

The string returned looks like this:

    [u'is:\n 31-10-2009 01:11\n\n\nservice area : \n mumbai\n\n\nservice provider :\n idea\n\n\n\n\n\nyour preference :0']

I'd rely on the "number registered in NCPR" header (it is in a td tag with class gridheader), and then get the next rows via find_next_siblings(): ...
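If a regular expression is preferred, the string shown in the question can also be split on its field labels. This is only a minimal sketch: the function name `parse_ncpr_text` is made up for illustration, and it assumes the labels ("service area", "service provider", "your preference") and the day-month-year date format always appear as they do in the sample string above.

```python
import re
from datetime import datetime

# Assumption: the labels always appear in this order, as in the sample string.
NCPR_PATTERN = re.compile(
    r'is:\s*(?P<date>.*?)\s*'
    r'service area\s*:\s*(?P<area>.*?)\s*'
    r'service provider\s*:\s*(?P<provider>.*?)\s*'
    r'your preference\s*:\s*(?P<pref>.*?)\s*$',
    re.IGNORECASE | re.DOTALL,
)


def parse_ncpr_text(text):
    """Return [date, service area, provider, preference], or None if no match."""
    match = NCPR_PATTERN.search(text)
    if match is None:
        return None
    # Assumption: the input date is day-month-year with hours and minutes only.
    parsed = datetime.strptime(match.group('date'), '%d-%m-%Y %H:%M')
    return [
        parsed.strftime('%Y-%m-%d %H:%M:%S'),  # reformatted as Y-m-d
        match.group('area'),
        match.group('provider'),
        match.group('pref'),
    ]


sample = (u'is:\n 31-10-2009 01:11\n\n\nservice area : \n mumbai\n\n\n'
          u'service provider :\n idea\n\n\n\n\n\nyour preference :0')
print(parse_ncpr_text(sample))
# ['2009-10-31 01:11:00', 'mumbai', 'idea', '0']
```

Matching on the labels rather than on commas avoids depending on the exact amount of whitespace between the fields, but it will return None if the site changes the label text.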