Reddit Scraper - Extract data from Reddit with Python. Efficient, customizable, and API-compliant. Parse and store data for analysis. Community-supported. Open-source license.
This Python script scrapes Reddit for posts related to specified stocks and sectors. It uses the PRAW library to interact with the Reddit API and performs sentiment analysis using NLTK's SentimentIntensityAnalyzer. The extracted data is stored in CSV files.
- An internet connection is required to install the necessary libraries.
- Ensure you have Python and pip installed.
- Clone the repository.
- Navigate to the project folder.
- Run the script:
python main.py
- Set `client_id` and `client_secret` with your Reddit API credentials.
- Adjust `sectors`, `locations`, `start_date`, and `end_date` as needed.
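As a rough illustration of these settings, the sketch below shows one way they might look and how a date range could be turned into timestamps for filtering posts. The example values, the `YYYY-MM-DD` date format, the `user_agent` string, and the helper names `to_utc_timestamp`, `in_date_range`, and `make_reddit_client` are all assumptions for illustration, not taken from `main.py`.

```python
from datetime import datetime, timezone

# Placeholders: replace with your own Reddit API credentials.
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"

# Illustrative values only; these are not the script's defaults.
sectors = ["technology", "energy"]
locations = ["US", "UK"]
start_date = "2023-01-01"  # assumed YYYY-MM-DD format
end_date = "2023-03-31"


def to_utc_timestamp(date_string):
    """Convert a YYYY-MM-DD string to a UTC epoch timestamp in seconds."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def in_date_range(created_utc, start=start_date, end=end_date):
    """Return True if created_utc falls on or between the start and end dates."""
    lower = to_utc_timestamp(start)
    upper = to_utc_timestamp(end) + 86400  # include the whole end day
    return lower <= created_utc < upper


def make_reddit_client():
    """Build a PRAW client (praw is imported lazily, only when this is called)."""
    import praw

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent="reddit-scraper (educational use)",  # hypothetical value
    )
```

PRAW submissions expose a `created_utc` attribute, so a check such as `in_date_range(submission.created_utc)` could be used to keep only posts inside the configured window.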
- Data without post text: `data without post text.csv`
- Data with post text: `data with post text.csv`
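The two output files could be produced along the following lines. This is a minimal sketch using the standard `csv` module; the column names (`title`, `score`, `sentiment`, `text`) and the helper functions are hypothetical, and the script's actual columns may differ.

```python
import csv

# Hypothetical column names; the script's real columns may differ.
FIELDS_WITH_TEXT = ["title", "score", "sentiment", "text"]
FIELDS_WITHOUT_TEXT = [name for name in FIELDS_WITH_TEXT if name != "text"]


def drop_text(rows):
    """Return copies of the rows with the 'text' column removed."""
    return [{key: row[key] for key in FIELDS_WITHOUT_TEXT} for row in rows]


def write_csv(path, rows, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def save_outputs(rows):
    """Write both output files, one with and one without the post text."""
    write_csv("data with post text.csv", rows, FIELDS_WITH_TEXT)
    write_csv("data without post text.csv", drop_text(rows), FIELDS_WITHOUT_TEXT)
```

Keeping a copy without the post text gives a much smaller file for quick analysis, while the full version preserves the text the sentiment scores were computed from.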
- Running the script may take some time because the Reddit API rate-limits requests.
- Make sure you have NLTK data downloaded (`nltk.download('vader_lexicon')`) before running the script.
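One way to handle the NLTK data requirement is to download the lexicon only when it is missing, then score text with `SentimentIntensityAnalyzer`. The sketch below assumes this pattern; the helper names and the ±0.05 cutoffs in `label_sentiment` (a common convention for VADER's compound score) are illustrative and may not match what `main.py` does.

```python
def ensure_vader_lexicon():
    """Download the VADER lexicon only if it is not already installed."""
    import nltk

    try:
        nltk.data.find("sentiment/vader_lexicon.zip")
    except LookupError:
        nltk.download("vader_lexicon")


def score_text(text):
    """Return VADER's scores: a dict with 'neg', 'neu', 'pos' and 'compound'."""
    from nltk.sentiment import SentimentIntensityAnalyzer

    ensure_vader_lexicon()
    return SentimentIntensityAnalyzer().polarity_scores(text)


def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse label (assumed cutoffs)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

For example, `label_sentiment(score_text(post_title)["compound"])` would give a coarse label for a post title, where `post_title` is any string you supply.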
This script is for educational and research purposes only. Use responsibly and in compliance with Reddit's API terms of service.