Scraping and saving HTML files without images in Python

I am scraping a website, and after fetching each page I store it in an HTML file. When I store the content in the HTML file, it stores the images too, and that is eating up a lot of storage. Is there any way I can store the files without images?

Here is my code:

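import time
from random import uniform

# all_urls is a list of URL strings, so iterate over it directly (xrange expects an integer)
for url in all_urls:
    driver.get(url)
    page = driver.page_source
    # Open in binary mode, since the page is encoded to UTF-8 bytes before writing
    with open(url.replace('/', '_') + '.html', 'wb') as f:
        f.write(page.encode('utf-8'))
    # Pause for a random 2 to 5 seconds between requests
    time.sleep(uniform(2, 5))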

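You can use curl and save the file in HTML format. curl downloads only the HTML document itself and does not fetch the images that the page references.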
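As a rough Python counterpart to the curl suggestion, here is a minimal sketch that downloads only the HTML document and, optionally, removes `<img>` tags before saving. It assumes Python 3 and the standard library only. The helper names `fetch_html`, `strip_images` and `save_page` are made up for illustration, and the regular expression is a simple heuristic rather than a full HTML parser.

```python
import re
import urllib.request
from pathlib import Path

# Matches <img ...> tags, including self-closing ones.
# Heuristic only: a '>' inside a quoted attribute value would end the match early.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def fetch_html(url, timeout=10):
    """Download only the HTML document at `url`; referenced images are not requested."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def strip_images(html):
    """Remove <img> tags, which also drops any base64 data embedded in their src."""
    return IMG_TAG.sub("", html)


def save_page(url, html, out_dir="."):
    """Write the page to a file named after the URL, as in the question's code."""
    path = Path(out_dir) / (url.replace("/", "_") + ".html")
    path.write_text(html, encoding="utf-8")
    return path
```

If the pages need a real browser to render, the same `strip_images` helper could presumably be applied to `driver.page_source` before writing it out, for example `save_page(url, strip_images(driver.page_source))`.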
