I'm scraping a website, and after fetching each page I store it in an HTML file. When I store the content in the HTML file, it also stores the images, and that is eating up storage. Is there any way I can store the files without the images?
Here is my code:
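    import time
    from random import uniform

    # driver (a Selenium WebDriver) and all_urls (a list of URLs) are set up elsewhere
    for url in all_urls:  # iterate over the URLs directly; xrange() expects an integer
        driver.get(url)
        page = driver.page_source
        # open in binary mode because we write UTF-8 encoded bytes
        with open(url.replace('/', '_') + '.html', 'wb') as f:
            f.write(page.encode('utf-8'))
        time.sleep(uniform(2, 5))  # random pause between requests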
You can use curl to fetch each page and save it as an HTML file. curl downloads only the HTML document itself, not the images it references.
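If you need to keep Selenium (for example because the pages are rendered with JavaScript), another option is to remove the image tags from the page source before writing it to disk. The following is only a minimal sketch: `strip_images` and `IMG_TAG` are hypothetical names, not part of Selenium, and it assumes the extra size comes from `<img>` tags, such as images inlined as base64 `data:` URIs. It uses a simple regular expression rather than a real HTML parser, so it will not handle CSS background images, `<picture>` or `<svg>` elements, or attribute values that contain `>`.

```python
import re

# Matches <img ...> tags (self-closing or not), case-insensitively.
# Assumption: no attribute value contains a literal '>' character.
IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)


def strip_images(html):
    """Return html with every <img> tag removed (hypothetical helper)."""
    return IMG_TAG.sub('', html)


sample = '<p>Hello</p><img src="data:image/png;base64,AAAA"><p>World</p>'
print(strip_images(sample))  # prints <p>Hello</p><p>World</p>
```

In the loop from the question, you would then write `strip_images(page).encode('utf-8')` instead of `page.encode('utf-8')`.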