Photo download program for each BTS and Black Pink member => Crawling BTS & Black Pink members' pictures (with Python, Selenium)


EasyCoding 2021. 1. 4. 21:36

1. pip install selenium

2. Download "chromedriver.exe" and copy it to a location where Selenium can find it, such as a folder on your PATH.

3. Run the code below: python kpop_crawling.py

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time
import urllib.request
import os

# Group name -> member names. Each member name is used as the search term.
kpop_dict = {
    "BTS": ["RM", "Jin", "Suga", "J-Hope", "Jimin", "V", "Jungkook"],
    "Black Pink": ["Jisoo", "Jennie", "Rosé", "Lisa"],
}


def crawling(target_name):
    # Open Google Images and search for the member's name.
    driver.get("https://www.google.co.kr/imghp?hl=ko&tab=wi&ogbl")
    # find_element(By...) replaces the find_element_by_* helpers removed in Selenium 4.
    elem = driver.find_element(By.NAME, "q")
    elem.send_keys(target_name)
    elem.send_keys(Keys.RETURN)
    SCROLL_PAUSE_TIME = 3    # Increase this number if your network is slow
    NUMBER_OF_PICTURES = 50  # Increase this number if you want to get more pictures
    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    count = 0      # pictures saved so far
    processed = 0  # thumbnails already tried, so later passes skip them
    while count < NUMBER_OF_PICTURES:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # The page stopped growing: click the "more results" button,
            # or stop when the button is missing.
            try:
                driver.find_element(By.CSS_SELECTOR, ".mye4qd").click()
            except Exception:
                break
        last_height = new_height

        # Thumbnails in the result grid. The class names and the XPath below
        # are the ones from the time of writing; Google's markup may change.
        images = driver.find_elements(By.CSS_SELECTOR, ".rg_i.Q4LuWd")

        for image in images[processed:]:
            processed += 1
            try:
                # Click the thumbnail to open the preview, then read the image URL.
                image.click()
                time.sleep(2)
                imgUrl = driver.find_element(By.XPATH, '/html/body/div[2]/c-wiz/div[3]/div[2]/div[3]/div/div/div[3]/div[2]/c-wiz/div[1]/div[1]/div/div[2]/a/img').get_attribute("src")
                # Saved into the current directory (the member's folder).
                urllib.request.urlretrieve(imgUrl, target_name + str(count) + ".jpg")
                count = count + 1
                if count >= NUMBER_OF_PICTURES:
                    break
            except Exception:
                # Skip thumbnails that cannot be clicked or downloaded.
                pass


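driver = webdriver.Chrome()
for key in kpop_dict:
    # exist_ok=True lets the script be re-run without failing on existing folders.
    os.makedirs(key, exist_ok=True)
    os.chdir(key)
    for val in kpop_dict[key]:
        os.makedirs(val, exist_ok=True)
        os.chdir(val)
        crawling(val)
        os.chdir('..')
    os.chdir('..')
# quit() closes the browser and also ends the chromedriver process.
driver.quit()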
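
The loop above moves between folders with os.chdir, so an uncaught error inside crawling() leaves the process in a member's folder. As a minimal sketch of an alternative, the helper below creates one folder per member up front and returns the paths, so that files could be saved with an explicit path instead. The name `make_member_dirs` and the `root` parameter are assumptions for illustration and are not part of the original script.

```python
import os


def make_member_dirs(groups, root="."):
    """Create root/<group>/<member> folders and return {(group, member): path}.

    Hypothetical helper: `groups` has the same shape as kpop_dict.
    """
    paths = {}
    for group, members in groups.items():
        for member in members:
            path = os.path.join(root, group, member)
            # exist_ok=True avoids FileExistsError when the folder is already there
            os.makedirs(path, exist_ok=True)
            paths[(group, member)] = path
    return paths
```

With such a mapping, the save call in crawling() could take the folder as an argument and write to os.path.join(folder, target_name + str(count) + ".jpg"), which removes the need for os.chdir.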
