Bipartite Graph for Political Debates

A study project that extracts speaker data from British parliamentary debates on Yemen and represents it in a Neo4j database. All scripts are written for Python 3.12. The required dependencies are listed in requirements.txt.

1 Collect: Corpus & Data

This project uses the ParlaMint-GB corpus, a linguistically annotated corpus of British parliamentary speeches from 2015 to 2022. The corpus is encoded in TEI XML and consists of three file types: debate files, a speaker metadata file, and a party metadata file.

Because of its size, the raw corpus is not included in this repository. To run the scripts, download the ParlaMint-GB corpus and save it in data/raw. Then run the scripts in the order in which they are discussed in the following sections.

The data directory contains the extracted XML files and corresponding Cypher queries for each debate. The scripts directory contains all scripts used for the project.

2 Prepare: XML

The yemen_debates.py script extracts all debate files from the corpus that contain discussions about Yemen. The XSLT_processor.py script then applies the XSL transformation defined in XSLT.xsl to these Yemen debate files, turning them into structured XML files for further processing.
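
The selection step can be illustrated with a minimal sketch. It is not the code from yemen_debates.py: the function names, the plain substring match on "Yemen", and the two made-up demo files are assumptions chosen for illustration, and the real script may select debates differently.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def mentions_keyword(xml_path: Path, keyword: str = "Yemen") -> bool:
    """Return True if any text inside the XML file contains the keyword."""
    root = ET.parse(xml_path).getroot()
    needle = keyword.lower()
    # itertext() yields all text content, independent of tag names or namespaces.
    return any(needle in text.lower() for text in root.itertext())


def find_debates(corpus_dir: Path, keyword: str = "Yemen") -> list[Path]:
    """Collect all XML files below corpus_dir that mention the keyword."""
    return sorted(p for p in corpus_dir.rglob("*.xml") if mentions_keyword(p, keyword))


# Tiny demonstration with two made-up files instead of the real corpus.
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp)
    (corpus / "debate_a.xml").write_text(
        "<TEI><u>The situation in Yemen is urgent.</u></TEI>", encoding="utf-8"
    )
    (corpus / "debate_b.xml").write_text(
        "<TEI><u>We turn to transport policy.</u></TEI>", encoding="utf-8"
    )
    demo_hits = [p.name for p in find_debates(corpus)]

print(demo_hits)
```

A substring match also catches related forms such as "Yemeni". To apply the same idea to the downloaded corpus, pass Path("data/raw") as corpus_dir.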

3 Access: Neo4j

The queries.py script creates Cypher query files based on the resulting XML files from section 2. The database.py script creates the database entries by executing those query files.
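
As a rough sketch of how Cypher statements could be generated from a structured XML file: the element and attribute names in SAMPLE, the node labels Debate and Speaker, and the SPOKE_IN relation are all hypothetical and do not necessarily match the output of XSLT.xsl or the queries written by queries.py.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured XML; the real files may use a different layout.
SAMPLE = """<debate id="d1" date="2016-01-12">
  <speaker id="s1" name="Jane O'Neil"/>
  <speaker id="s2" name="John Smith"/>
</debate>"""


def cypher_string(value: str) -> str:
    """Quote a Python string as a single-quoted Cypher string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def debate_to_cypher(xml_text: str) -> list[str]:
    """Build one statement for the debate node and one per speaker."""
    debate = ET.fromstring(xml_text)
    debate_id = cypher_string(debate.get("id"))
    debate_date = cypher_string(debate.get("date"))
    statements = [f"MERGE (d:Debate {{id: {debate_id}}}) SET d.date = {debate_date};"]
    for speaker in debate.iter("speaker"):
        speaker_id = cypher_string(speaker.get("id"))
        speaker_name = cypher_string(speaker.get("name"))
        # MERGE keeps nodes and relations unique if a speaker occurs in several debates.
        statements.append(
            f"MATCH (d:Debate {{id: {debate_id}}}) "
            f"MERGE (s:Speaker {{id: {speaker_id}}}) SET s.name = {speaker_name} "
            f"MERGE (s)-[:SPOKE_IN]->(d);"
        )
    return statements


statements = debate_to_cypher(SAMPLE)
print("\n".join(statements))
```

Escaping quotes matters because speaker names can contain apostrophes. When queries are sent through a database driver rather than written to files, query parameters are usually the safer choice.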

Finally, the speaker2speaker_projection.cypher file contains a Cypher query that creates and returns a speaker-to-speaker projection of the bipartite graph. The query first creates a new CO_DEBATED_WITH relation between speakers who participated in the same debate, then retrieves only the speaker nodes connected by this relation.
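
The logic of the projection can be sketched in plain Python, independent of Neo4j. This is an illustration of the idea and not a translation of speaker2speaker_projection.cypher. The function name and the sample data are made up.

```python
from itertools import combinations


def project_speakers(participation: dict[str, set[str]]) -> set[tuple[str, str]]:
    """Project a debate-to-speakers mapping onto undirected speaker pairs.

    Two speakers are linked (CO_DEBATED_WITH) if they share at least one debate.
    """
    edges: set[tuple[str, str]] = set()
    for speakers in participation.values():
        # Sorting gives each undirected pair one canonical orientation.
        for first, second in combinations(sorted(speakers), 2):
            edges.add((first, second))
    return edges


# Made-up sample: debate ids mapped to the speakers who took part.
sample = {
    "d1": {"A", "B", "C"},
    "d2": {"B", "D"},
    "d3": {"E"},  # a lone speaker gets no CO_DEBATED_WITH edge
}
edges = project_speakers(sample)
connected = {name for pair in edges for name in pair}
print(sorted(edges))
print(sorted(connected))
```

Speaker E takes part in a debate alone and therefore does not appear in the projection, which mirrors retrieving only the speaker nodes that have a CO_DEBATED_WITH relation.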
