[python] Naver image crawling with BeautifulSoup

groti 2020. 7. 8. 17:17

I've been studying Python through the 조코딩 YouTube channel. As part of that, I wrote some code that crawls images from Naver using BeautifulSoup.

Code

from urllib.request import urlopen
from urllib.parse import quote_plus
from pathlib import Path

from bs4 import BeautifulSoup as bs

baseUrl = 'https://search.naver.com/search.naver?where=image&sm=tab_jum&query='

# Each animal category is paired with the keywords saved under it.
animal_list = ['dog', 'cat', 'bear']
keyword_list = [['박보검', '임시완'], ['강동원', '이종석'], ['조세호', '안재홍']]

for animal, keywords in zip(animal_list, keyword_list):
    for keyword in keywords:
        # Create ./img/<animal>/<keyword>/ (parents=True also creates ./img/<animal>).
        save_dir = Path('./img') / animal / keyword
        save_dir.mkdir(parents=True, exist_ok=True)
        print('Searching: ' + keyword)
        # quote_plus percent-encodes the Korean keyword for the query string.
        url = baseUrl + quote_plus(keyword)
        with urlopen(url) as html:
            soup = bs(html, 'html.parser')
        # Take up to 10 elements with class '_img'. This selector reflects the
        # page markup when this post was written and may have changed since.
        images = soup.find_all(class_='_img', limit=10)
        for n, tag in enumerate(images, start=1):
            imgUrl = tag.get('data-source')
            if imgUrl is None:
                continue  # skip tags that carry no image URL
            with urlopen(imgUrl) as f:
                with open(save_dir / (keyword + str(n) + '.jpg'), 'wb') as h:
                    h.write(f.read())
print('Download complete!')
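
The two steps in the script that do not need the network, building the search URL and deciding where each file is saved, can be tried on their own. The following is a minimal sketch using only the standard library. The helper names `build_search_url` and `build_image_path` are hypothetical and do not appear in the script above; the URL and the folder layout are copied from it.

```python
from pathlib import Path
from urllib.parse import quote_plus

# Same search URL prefix as in the script above.
BASE_URL = 'https://search.naver.com/search.naver?where=image&sm=tab_jum&query='


def build_search_url(keyword):
    """Percent-encode the keyword as UTF-8 and append it to the search URL.

    Hypothetical helper, introduced only for illustration.
    """
    return BASE_URL + quote_plus(keyword)


def build_image_path(root, category, keyword, n):
    """Return root/category/keyword/keyword<n>.jpg, the layout the script uses.

    Hypothetical helper, introduced only for illustration.
    """
    return Path(root) / category / keyword / f'{keyword}{n}.jpg'


# The printed URL contains only ASCII, because the keyword is percent-encoded.
print(build_search_url('박보검'))
```

Calling `build_image_path('img', 'dog', '박보검', 1)` gives the path of the first image saved for that keyword. Keeping these steps in small functions makes it possible to check the encoding and the folder layout before any request is sent.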