Sunday, 19 June 2016

How to get the complete URL of an image with Python?


I am trying to write a crawler that visits a web page and downloads all the images on that page. My code looks like this:

import random
import urllib.request
import requests
from bs4 import BeautifulSoup

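def get_images(url):
    # Fetch the page and parse its HTML
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    # Pass the src of every <img> tag to the downloader
    for img in soup.find_all('img'):
        src = img.get('src')
        if src:  # skip <img> tags that have no src attribute
            download_image(src)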


def download_image(url):
    name = random.randrange(1, 100)
    image_name = str(name) + ".jpg"
    urllib.request.urlretrieve(url, image_name)

get_images("http://www.any_url.com/")

Many images do not carry their full URL in the src attribute; the value is often a relative path. How can I get the full URL of each image so that I can download it?
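
One possible approach, sketched here rather than offered as a definitive answer, is to resolve each src value against the URL of the page it came from using urljoin from the standard library. The helper name to_absolute and the example.com URLs below are made up for illustration; they are not part of the code above.

```python
from urllib.parse import urljoin


def to_absolute(page_url, src):
    """Resolve an <img> src value against the URL of the page it was found on.

    to_absolute is a hypothetical helper name used only for this sketch.
    """
    return urljoin(page_url, src)


# Illustrative page URL and src values (assumptions, not taken from a real site)
page = "http://www.example.com/gallery/index.html"
samples = [
    "logo.png",                        # relative to the page's directory
    "/static/a.jpg",                   # relative to the site root
    "../b.jpg",                        # one directory up
    "//cdn.example.com/c.jpg",         # scheme-relative
    "http://other.example.org/d.jpg",  # already absolute, left unchanged
]
for src in samples:
    print(to_absolute(page, src))
```

Inside get_images, the same idea would amount to calling download_image(urljoin(url, src)) instead of download_image(src). This does not account for pages that declare a <base href> element, which would change the base used for resolution.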

