🎬 Highest-Grossing Japanese Films Web Scraper

This project is a small Python web-scraping script that extracts the table of highest-grossing Japanese films from Wikipedia. It collects fields such as title, gross revenue, and release year, and stores them in a pandas DataFrame.

📌 Features

✅ Scrapes movie data from Wikipedia using requests and BeautifulSoup.
✅ Extracts table headers and rows into a clean pandas DataFrame.
✅ Drops unnecessary columns (e.g., notes or references).
✅ Optionally saves the final DataFrame to a .csv file for analysis or visualization.
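
The column cleanup and optional CSV export listed above might look roughly like the sketch below. The helper name `save_table`, its `drop` parameter, and the file name in the usage note are hypothetical and not taken from the script; only `DataFrame.drop` and `DataFrame.to_csv` are real pandas calls.

```python
from typing import Sequence

import pandas as pd


def save_table(df: pd.DataFrame, path: str, drop: Sequence[str] = ()) -> pd.DataFrame:
    """Drop unwanted columns, write the result to CSV, and return it.

    `save_table` and `drop` are illustrative names, not part of the script.
    """
    # errors="ignore" skips names that are not present instead of raising.
    cleaned = df.drop(columns=list(drop), errors="ignore")
    # index=False keeps the pandas row index out of the file.
    cleaned.to_csv(path, index=False)
    return cleaned
```

For example, `save_table(df, "japanese_films.csv", drop=["Ref"])` would write every column except one named `Ref`, if such a column exists.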

🧠 How It Works

1. Sends a GET request to the Wikipedia page using a custom User-Agent.
2. Parses the page with BeautifulSoup and locates the first HTML <table>.
3. Extracts the column headers and the row values.
4. Cleans the data by dropping the last column (notes or references).
5. Saves the result to a .csv file or prints it.
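
The steps above can be sketched as follows. This is a minimal illustration rather than the script itself: the URL, the User-Agent string, and the function names `fetch_html` and `parse_first_table` are assumptions, and skipping rows whose cell count differs from the header count is a simplification chosen for this sketch.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Assumed address; the README does not give the exact URL.
URL = "https://en.wikipedia.org/wiki/List_of_highest-grossing_Japanese_films"
# Hypothetical User-Agent value, shown only to illustrate the custom header.
HEADERS = {"User-Agent": "japanese-films-scraper/0.1 (example)"}


def fetch_html(url: str = URL) -> str:
    """Send a GET request with a custom User-Agent and return the page HTML."""
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_first_table(html: str) -> pd.DataFrame:
    """Turn the first <table> in the HTML into a DataFrame without its last column."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    rows = table.find_all("tr") if table is not None else []
    if not rows:
        raise ValueError("no table rows found in the page")

    # The first row is assumed to hold the column headers.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]

    records = []
    for row in rows[1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        # Skip rows that do not line up with the headers (e.g. merged cells).
        if len(cells) == len(headers):
            records.append(cells)

    df = pd.DataFrame(records, columns=headers)
    return df.iloc[:, :-1]  # drop the last column


# Typical use (requires network access):
# df = parse_first_table(fetch_html())
# print(df.head())
```

Keeping the download and the parsing in separate functions lets the parsing step be tried on a saved HTML file without contacting Wikipedia.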

🛠️ Requirements

Install the required packages:

pip install requests beautifulsoup4 pandas
