Enhance Meetup sync: past-event crawl, parsing fixes, debug logging, and workflow updates #18

Merged
carloshvp merged 2 commits into main from codex/find-workflow-for-updating-events-xmkpj4
Apr 3, 2026

Conversation

@carloshvp
Member

Motivation

  • Improve completeness and reliability of Meetup event data by including recent past events and more robust parsing of iCal/JSON-LD sources.
  • Make debugging easier when running the sync script in CI by emitting diagnostics.
  • Run the GitHub Actions job on pull requests as well, and avoid committing synced data from pull-request runs.

Description

  • Add MEETUP_PAST_EVENTS_URL and MEETUP_SYNC_DEBUG options, and wire MEETUP_SYNC_DEBUG into the script to emit debug logs.
  • Improve iCal handling by unescaping iCal text (unescape_ical_text) and preserving multi-line values, and add debug counters for parsed iCal events.
  • Extend JSON-LD parsing to handle @graph nodes and extract performer names more reliably, with debug output for parsed counts.
  • Add a past-events crawl: extract event URLs from the past-events page (extract_event_urls_from_html), fetch event detail pages, parse their JSON-LD, and merge/dedupe results with merge_events to combine iCal and past-event data.
  • Add fetch_url helper with standardized headers, timeouts, and debug payload size logging.
  • Update the fetch_events flow to try iCal first, then supplement with the past-event crawl, then fall back to the main events page; raise a consolidated error when all sources fail.
  • Update the GitHub Actions workflow to run on pull_request, set MEETUP_SYNC_DEBUG: '1' in the job, restrict the commit step to push, schedule, and workflow_dispatch events, and extend the stats printed after sync to show the most recent past event and its URL.
  • Document new environment variables and workflow behavior in README.md.
  • Add unit tests in tests/test_sync_meetup_events.py covering URL extraction, JSON-LD @graph support, and JSON-escaped URLs.
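
Two of the parsing changes above can be sketched as follows. This is a minimal illustration, not the code in this PR: the body of unescape_ical_text is an assumption based on the TEXT escaping rules in RFC 5545, and iter_jsonld_events is a hypothetical helper name that shows one way to descend into @graph nodes.

```python
import json
import re


def unescape_ical_text(value: str) -> str:
    """Undo iCal TEXT escaping: \\n or \\N becomes a newline; \\, \\; and \\\\ become literals.

    Assumed behavior based on RFC 5545; the script's real helper may differ.
    """
    def replace(match: re.Match) -> str:
        char = match.group(1)
        return "\n" if char in "nN" else char

    # A single left-to-right pass, so an escaped backslash followed by "n" stays "\" + "n".
    return re.sub(r"\\([nN,;\\])", replace, value)


def iter_jsonld_events(payload):
    """Yield every node whose @type includes "Event", descending into lists and @graph.

    Hypothetical helper: only the exact type "Event" is matched, not schema.org subtypes.
    """
    if isinstance(payload, list):
        for item in payload:
            yield from iter_jsonld_events(item)
    elif isinstance(payload, dict):
        if "@graph" in payload:
            yield from iter_jsonld_events(payload["@graph"])
        node_type = payload.get("@type")
        types = node_type if isinstance(node_type, list) else [node_type]
        if "Event" in types:
            yield payload


if __name__ == "__main__":
    print(unescape_ical_text(r"Doors open 18:00\nTalks\, pizza\; networking"))
    doc = json.loads('{"@graph": [{"@type": "Organization"}, {"@type": "Event", "name": "Demo night"}]}')
    print([node["name"] for node in iter_jsonld_events(doc)])
```

Doing the unescape in one regex pass avoids the double-unescaping that chained str.replace calls can cause when a value contains an escaped backslash.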

Testing

  • Ran the new unit tests with python -m unittest tests/test_sync_meetup_events.py, and all tests passed.
  • Ran the sync script locally with MEETUP_SYNC_DEBUG=1 to validate iCal parsing and the merged output, and observed the expected debug and log output with no failures.

@carloshvp carloshvp merged commit fa278c5 into main Apr 3, 2026
1 check passed
@chatgpt-codex-connector

💡 Codex Review

if merged_events:
    return merged_events

P1: Fall back to events page when iCal fetch yields no events

Returning immediately on any non-empty merged_events skips the main events-page fallback when iCal fails or comes back empty but the past-event crawl still finds historical items. In that scenario the sync publishes only past events and drops upcoming events that are still available on MEETUP_EVENTS_URL. This is a functional regression from the prior flow, which always tried the events page after an iCal miss.
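
One way to address this, sketched under assumptions: consult the events page whenever iCal yields nothing, regardless of what the past-event crawl returns, and only then merge. The names fetch_events_with_fallback and merge_by_url and the callable-based signature are hypothetical and chosen for illustration; they are not the script's actual fetch_events or merge_events.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Source = Callable[[], List[Event]]


def merge_by_url(*groups: List[Event]) -> List[Event]:
    """Dedupe events by their "url" key; the first group that has a URL wins."""
    merged: Dict[str, Event] = {}
    for group in groups:
        for event in group:
            merged.setdefault(event["url"], event)
    return list(merged.values())


def fetch_events_with_fallback(fetch_ical: Source, fetch_past: Source, fetch_page: Source) -> List[Event]:
    """Try iCal, fall back to the events page on an iCal miss, then add past events."""
    errors: List[str] = []

    def attempt(name: str, source: Source) -> List[Event]:
        try:
            return source()
        except Exception as exc:  # collect failures so they can be reported together
            errors.append(f"{name}: {exc}")
            return []

    upcoming = attempt("ical", fetch_ical)
    if not upcoming:
        # The fallback depends only on the iCal result, not on the past-event crawl.
        upcoming = attempt("events page", fetch_page)
    past = attempt("past events", fetch_past)

    merged = merge_by_url(upcoming, past)
    if not merged and errors:
        raise RuntimeError("all sources failed: " + "; ".join(errors))
    return merged
```

With this ordering, an iCal outage combined with a successful past-event crawl still returns upcoming events from the events page instead of publishing past events only.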
