Getting Started with Web Scraping in Python × Selenium | From Environment Setup to a Working Script on macOS

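Selenium is the tool to reach for when you want to automatically retrieve web pages whose content is generated dynamically by JavaScript.
This article walks through building a Python × Selenium scraping environment on macOS, up to the point of actually accessing a website.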

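✅ What is Selenium?
✅ What this article covers
1. Installing the required libraries
2. Downloading and setting up ChromeDriver
   2-1. Check your Chrome version
   2-2. Download ChromeDriver
3. Removing the Gatekeeper block (if the binary will not run)
4. Sanity check: a Google search with Selenium
5. Hands-on: fetching the Tabelog restaurant list for Higashimurayama
✅ Summary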

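✅ What is Selenium?

Selenium is a tool for automating web browsers.
Because it drives a real browser, it can also handle dynamically loaded data that the usual scraping stack (requests + BeautifulSoup) cannot retrieve, which makes it valuable in practice.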
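
To make the difference concrete, here is a minimal sketch using only the standard library. The page in STATIC_HTML is a made-up example (an assumption, not a real site): the server sends an empty list, and a script fills it in after the page loads. A parser that only sees the raw markup finds no list items, which is roughly the situation requests + BeautifulSoup would be in.

```python
from html.parser import HTMLParser

# Hypothetical page: the list is empty in the markup the server sends,
# and a script adds the items only once the page runs in a browser.
STATIC_HTML = """
<html><body>
  <ul id="shops"></ul>
  <script>
    document.getElementById("shops").innerHTML = "<li>Shop A</li><li>Shop B</li>";
  </script>
</body></html>
"""


class ListItemCounter(HTMLParser):
    """Counts <li> start tags that appear as real elements in the markup."""

    def __init__(self):
        super().__init__()
        self.li_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1


def count_list_items(html: str) -> int:
    """Parse the raw HTML without executing any JavaScript."""
    parser = ListItemCounter()
    parser.feed(html)
    parser.close()
    return parser.li_count
```

Here count_list_items(STATIC_HTML) finds nothing, because the text inside the script element is never executed or treated as markup. A browser driven by Selenium would run the script first and then see both items.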

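✅ What this article covers

Setting up an environment for running Selenium from Python
Setting up ChromeDriver (macOS / Apple Silicon)
Running a simple browser automation script
Retrieving information from a Tabelog restaurant listing page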

1. Installing the required libraries

Run the following in Terminal:

pip3 install selenium pandas

pandas is used to handle the scraped data in tabular form and to write it out as CSV.

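Installing Selenium also pulls in related libraries automatically, such as:

Library	Purpose / notes
selenium	The browser automation library itself
websocket-client	WebSocket communication
trio / trio-websocket	Asynchronous communication
attrs, sniffio, outcome, etc.	Dependencies of trio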
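
The exact set of dependencies varies with the Selenium version, so it can be worth checking what your installed copy declares. A small sketch using the standard library's importlib.metadata; declared_requirements is a hypothetical helper name introduced here for illustration.

```python
from importlib import metadata


def declared_requirements(package: str) -> list[str]:
    """Return the requirement strings a package declares, or [] if it is not installed."""
    try:
        # requires() returns None when the package declares no requirements.
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []
```

Calling declared_requirements("selenium") after the install lists the direct dependencies only; packages such as attrs or sniffio come in indirectly through trio.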

2. Downloading and setting up ChromeDriver

To drive Chrome from Selenium, you need a ChromeDriver build that matches your Chrome version.

2-1. Check your Chrome version

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version

Example output:

Google Chrome 136.0.7103.114
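
The part that matters when choosing a driver is the leading number (136 here). If you want to pull it out of the output programmatically, something like the following sketch would do; parse_version and major_version are hypothetical helper names, and the only assumption is that the output contains a dotted version number like the one above.

```python
import re


def parse_version(output: str) -> tuple[int, ...]:
    """Extract the first dotted version number from a `--version` output line."""
    match = re.search(r"\d+(?:\.\d+)+", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in match.group().split("."))


def major_version(output: str) -> int:
    """Return only the major version, the number used to pick a ChromeDriver build."""
    return parse_version(output)[0]
```

For example, major_version("Google Chrome 136.0.7103.114") returns 136.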

2-2. Download ChromeDriver

From the page below, pick the mac-arm64 build for version 136 (the major version reported in the previous step):

🔗 https://googlechromelabs.github.io/chrome-for-testing/
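
Whether you need mac-arm64 or mac-x64 depends on the CPU. The sketch below maps the values Python reports to the platform labels used on that page. It assumes platform.machine() returns "arm64" on Apple Silicon (a Python running under Rosetta may report "x86_64" instead), and chrome_for_testing_platform is a hypothetical helper name.

```python
import platform


def chrome_for_testing_platform(system: str, machine: str) -> str:
    """Map an OS name and CPU architecture to a Chrome for Testing platform label."""
    if system == "Darwin":
        # Apple Silicon Macs report "arm64"; Intel Macs report "x86_64".
        return "mac-arm64" if machine == "arm64" else "mac-x64"
    raise ValueError(f"this sketch only covers macOS, got {system!r}")


def current_platform() -> str:
    """Label for the machine this script is running on (macOS only)."""
    return chrome_for_testing_platform(platform.system(), platform.machine())
```

On an Apple Silicon Mac with a native Python, current_platform() should give "mac-arm64", which is the build this article uses.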

After downloading, move the unzipped chromedriver binary into /usr/local/bin/ (prefix the commands with sudo if you get a permission error):

mv chromedriver /usr/local/bin/
chmod +x /usr/local/bin/chromedriver

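3. Removing the Gatekeeper block (if the binary will not run)

If macOS terminates the binary with a message such as "zsh: killed", run the following to remove the quarantine attribute: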

xattr -d com.apple.quarantine /usr/local/bin/chromedriver

Check again:

chromedriver --version
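
The same check can be done from Python before launching a browser. This is a minimal sketch, assuming the driver is reachable through PATH; driver_version is a hypothetical helper name. It returns None both when the binary is missing and when it exits with an error, which is what a Gatekeeper kill would look like from the caller's side.

```python
import shutil
import subprocess
from typing import Optional


def driver_version(executable: str = "chromedriver") -> Optional[str]:
    """Return the driver's `--version` output, or None if it is missing or fails to run."""
    path = shutil.which(executable)
    if path is None:
        return None  # not on PATH
    completed = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        check=False,
        timeout=10,
    )
    if completed.returncode != 0:
        return None  # the binary exists but could not run normally
    return completed.stdout.strip()
```

If driver_version() comes back as None even though the file is in /usr/local/bin/, the quarantine attribute from the previous step is a likely suspect.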

4. Sanity check: a Google search with Selenium

Save the following code as selenium_test.py:

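from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import time

# Send a browser-like User-Agent string.
options = Options()
options.add_argument("user-agent=Mozilla/5.0")

# Point Selenium at the ChromeDriver binary installed in step 2.
service = Service("/usr/local/bin/chromedriver")
driver = webdriver.Chrome(service=service, options=options)

try:
    driver.get("https://www.google.com")
    time.sleep(2)  # give the page time to load

    # The Google search box is the input named "q".
    search_box = driver.find_element(By.NAME, "q")
    search_box.send_keys("東村山 食べログ")  # query: "Higashimurayama Tabelog"
    search_box.submit()

    time.sleep(3)  # wait for the results page
    # Print the first three result titles (rendered as <h3> elements).
    for r in driver.find_elements(By.CSS_SELECTOR, "h3")[:3]:
        print(r.text)
finally:
    # Always close the browser, even if a step above fails.
    driver.quit()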

Run it:

python3 selenium_test.py

If a reCAPTCHA appears, it is more effective to skip the Google search and access the target URL directly.
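
The script in this section pauses with fixed time.sleep calls, which either waste time or fail when the page is slow. Selenium ships WebDriverWait (in selenium.webdriver.support.ui) for this purpose. The sketch below shows the underlying polling idea in plain Python so that it can be tried without a browser; wait_until is a hypothetical helper, not part of Selenium.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], T], timeout: float = 10.0, interval: float = 0.5) -> T:
    """Call `condition` repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # e.g. a non-empty list of elements
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With a live driver, a call such as wait_until(lambda: driver.find_elements(By.CSS_SELECTOR, "h3")) would return as soon as at least one result title exists, instead of always sleeping for three seconds.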

5. Hands-on: fetching the Tabelog restaurant list for Higashimurayama

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import time
import pandas as pd

options = Options()
options.add_argument("start-maximized")  # ask Chrome to open maximized
options.add_argument("user-agent=Mozilla/5.0")

service = Service("/usr/local/bin/chromedriver")
driver = webdriver.Chrome(service=service, options=options)

try:
    # Tabelog restaurant listing for Higashimurayama, Tokyo.
    url = "https://tabelog.com/tokyo/C13213/rstLst/"
    driver.get(url)
    time.sleep(3)  # give the listing time to render

    # Each restaurant name is a link with this class; collect its text and href.
    shops = driver.find_elements(By.CSS_SELECTOR, "a.list-rst__rst-name-target")
    results = [{"店名": shop.text.strip(), "URL": shop.get_attribute("href")} for shop in shops]
finally:
    driver.quit()  # close the browser even if scraping fails

# "店名" (shop name) and "URL" become the CSV column headers.
# utf-8-sig adds a BOM so that Excel displays the Japanese text correctly.
df = pd.DataFrame(results)
df.to_csv("tabelog_higashimurayama.csv", index=False, encoding="utf-8-sig")

print("Items retrieved:", len(results))

The resulting CSV file contains each restaurant's name and URL.
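
Before saving, it can help to tidy the rows. The sketch below drops entries with an empty name and keeps only the first row per URL, then serializes the result with the standard csv module. Whether the listing actually produces empty or duplicate entries is an assumption; clean_results and to_csv_text are hypothetical helpers that reuse the script's "店名" and "URL" keys.

```python
import csv
import io


def clean_results(results: list[dict]) -> list[dict]:
    """Drop rows with an empty name or URL and keep only the first row for each URL."""
    seen_urls = set()
    cleaned = []
    for row in results:
        name = (row.get("店名") or "").strip()
        url = row.get("URL")
        if not name or not url or url in seen_urls:
            continue
        seen_urls.add(url)
        cleaned.append({"店名": name, "URL": url})
    return cleaned


def to_csv_text(rows: list[dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["店名", "URL"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In the script above, passing clean_results(results) to pd.DataFrame instead of results would apply the same filtering before the CSV is written.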

✅ Summary

Selenium lets you retrieve information even from pages rendered by JavaScript
ChromeDriver must match the version of your installed Google Chrome
On macOS, watch out for restrictions imposed by Gatekeeper