83 changes: 37 additions & 46 deletions ReadMe.md
@@ -1,8 +1,10 @@
# KEML Analysis

**Note:** This branch features an alternative [KEML](https://github.com/keml-group/) component that leverages a logic-based argumentation framework (**LAF**), and is currently *only* tested for use in conjunction with other LAF components of KEML. For the corresponding base version of this component, see the [base analysis](https://github.com/keml-group/keml.analysis) repository.
-----------------------
This project analyses KEML files statistically. For each KEML file it produces:
1) [General Statistics](#general-statistics)
2) [Argumentation Statistics](#argumentation-statistics)
2) [Logical Argumentation](#logical-argumentation)
3) [Trust Scores](#trust-scores)

## Installation
@@ -13,15 +15,15 @@ If you freshly added maven to this project in Eclipse, it might be necessary to
## Running

This project is a basic Maven-based Java application that you can run in all the usual ways (command line, IDE, ...).
It has one optional input: the base folder. If none is given, it creates statistics on the introductory example from keml.sample, assuming that keml.sample is located on the same level as this project.
It has one optional input: the base folder. If none is given, it creates statistics on the LAF examples from the LAF branch of keml.sample, assuming that keml.sample is located on the same level as this project.
All output files are stored in the folder **analysis**.

## Output
In **analysis**, each filename starts with a prefix _pre_ that is equal to the KEML file name.
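
The naming scheme can be illustrated with a minimal sketch. It is not the project's code: the class and method names are hypothetical, the file name `example.keml` is an assumed example, and only the suffixes `-general.csv` and `-arguments.csv` are taken from this document.

```java
public class OutputNamingSketch {

    /** Strips the last extension from a file name, e.g. "example.keml" -> "example". */
    public static String removeExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Builds an output path of the form analysisFolder/pre + suffix. */
    public static String outputPath(String analysisFolder, String kemlFileName, String suffix) {
        return analysisFolder + "/" + removeExtension(kemlFileName) + suffix;
    }

    public static void main(String[] args) {
        // "example.keml" is a hypothetical input file name
        System.out.println(outputPath("analysis", "example.keml", "-general.csv"));
        System.out.println(outputPath("analysis", "example.keml", "-arguments.csv"));
    }
}
```

Under these assumptions, the prefix _pre_ is simply the KEML file name without its extension, and each output type appends its own suffix.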

Currently, three types of statistics are generated:
1) [General Statistics](#general-statistics)
2) [Argumentation Statistics](#argumentation-statistics)
2) [Logical Argumentation](#logical-argumentation)
3) [Trust Scores](#trust-scores)

### General Statistics
@@ -31,66 +33,55 @@ This CSV file holds a Message Part and a Knowledge Part where it gives statistic
The Message Part gives counts for sends and receives, as well as interruptions.
The Knowledge Part counts PreKnowledge and New information, split into Facts and Instructions. It also counts repetitions.

![Example General Statistics](doc/example-general-csv-2.png)
![Example General Statistics](doc/laf_example-general-csv.PNG)


### Argumentation Statistics
Argumentation statistics are stored under _pre_-arguments.csv.
### Logical Argumentation
Argumentation data are stored under _pre_-arguments.csv.

This CSV file consists of a table that counts attacks and supports between facts (F) and instructions (I) of all conversation partners (including the human author).
This CSV file holds several kinds of data describing the state of the analysed conversation:
- A mapping between every modeled piece of information and a unique literal symbol assigned to it (e.g., "L1").
- A list of all derived logic arguments for every literal/information piece.
- A String representation of the constructed undercut trees.
- A list of all rebuttals found for each literal/information piece.

![Example Argumentation Statistics](doc/example-arguments-csv.png)

![Example Argumentation Info and Literal Mapping](doc/laf_example-arguments-output1.PNG)
![Example Derived Logic Arguments](doc/laf_example-arguments-output2.PNG)
![Example Constructed Undercut Trees](doc/laf_example-arguments-undercuts.PNG)
![Example Found Rebuttals](doc/laf_example-arguments-rebuttals.PNG)

### Trust Scores

Trust Scores are given as Excel (xlsx) files _pre_-w _n_--arguments.csv where _n_ is the weight of the trust computation formula.
Each file depicts four scenarios (a-d) described under [Initial Trust](#initial-trust).
Each scenario consists of two columns, one (iT) that lists the initial trust score for each information and one (T) that lists the (final) trust score.
Additionally, there are columns to describe the information i precisely:
Trust Scores are given as Excel (xlsx) files _pre_-scores-trust.xlsx. Whereas the baseline version uses initial trust to model the influence of attacks and supports on information pieces, this LAF version counts the arguments for and against a given claim, applying categorizer and accumulator functions to argumentation structures in logic-based argumentation, as introduced and discussed in the paper [A logic-based theory of deductive arguments](https://doi.org/10.1016/S0004-3702%2801%2900071-6) by P. Besnard and A. Hunter. The goal is to model trust in a given claim based on how many arguments can be made for or against it.

The .xlsx file shows [categorization](#categorization) and [accumulation](#accumulation) values for each piece of information, considering the logical arguments that could be derived for it and against it, the logical arguments that undercut it, and the logical arguments that rebut it.
Additionally, the file depicts the following data:
1) The **time stamp** (-1 for pre knowledge) with the background color stating whether i is fact (green) or instruction (orange)
2) The **message** column with the background color blue for LLM messages and yellow for all other messages
3) The **argument count \#Arg** counting how many other information influence i directly
4) The **repetition count \#Rep** counting the number of repetitions of i

![Example Trust Scores](doc/example-trust-xlsx.png)


#### Trust computation formula
**Trust T** into an **information i** is computed based on **initial trust $T_{init}$** by combining it with a **repetition score $T_{rep}$** and an **argumentative trust $T_{arg}$**:

$T(i)= restrict(T_{init}(i) + T_{rep}(i) + w*T_{arg}(i))$

Here, restrict limits the computed trust to a value in [-1.0,... 1.0].
The weight $w$ is a natural number that controls the emphasis of $T_{arg}$. The analysis currently runs for $w\in[2,... 10]$.

#### Repetition Score

The phenomenon that someone trusts a piece of information more the more often they have heard it is known as the **(illusory) truth effect**.
We compute it as the proportion of repetitions of the information $i$, $rep(i)$, to all receive messages $receives$:

$T_{rep}(i) = rep(i)/receives$

The repetition score can only contribute positively to our trust and we have $T_{rep} \in [0,.. 1.0]$.

#### Argumentative Trust
3) The **argument count \#Arg+** counting how many arguments exist for the information piece
4) The **argument count \#Arg-** counting how many arguments exist against the information piece

The argumentative trust $T_{arg}(i)$ is computed from all trust scores $T(j)$ where _j_ has an argumentative impact (that is an immediate connection $j$->$i$) on _i_:
![Example Trust Scores](doc/laf_example-trust-xlsx.PNG)

$T_{arg}(i) = \sum_{j\in impact(i)} infl(j,i)*T(j)$

Here, $infl(j,i)$ is defined by the type of edge $j$->$i$ as -1, -0.5, 0.5, 1 for strong attacks, attacks, supports and strong supports, respectively.
#### Categorization
The goal is to assign a numerical value to each argument tree based on the number of arguments that attack its root (i.e., children), the attackers of those attackers (i.e., children of children), and so on recursively.
A specific version of this function, the hCategorizer, is used to that end:

#### Initial Trust
$h(N) = \frac{1}{1 + h(N_1) + \dots + h(N_l)}$, where $N$ is the root argument and $N_1, \dots, N_l$ are the children of $N$

The initial trust into an information _i_ could be assigned individually to each information. In this analysis module, it is currently evaluated in **four scenarios** that distinguish between the LLM _LLM_ and all other conversation partners _P_:
In the .xlsx file we use hCat+ to refer to the categorization values of arguments for a given claim, and hCat- for the categorization values of arguments against a given claim.
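
The recursion can be sketched as follows. This is an illustrative sketch only, not the project's implementation: `Node` is a hypothetical stand-in for an argument tree whose children are the arguments attacking the root.

```java
import java.util.ArrayList;
import java.util.List;

public class HCategorizerSketch {

    /** Hypothetical minimal tree node: an argument plus the arguments that attack it. */
    public static final class Node {
        final List<Node> children = new ArrayList<>();

        public Node add(Node child) {
            children.add(child);
            return this; // allows chained construction
        }
    }

    /** h(N) = 1 / (1 + h(N_1) + ... + h(N_l)); a node without children scores 1. */
    public static double h(Node n) {
        double sum = 0.0;
        for (Node child : n.children) {
            sum += h(child);
        }
        return 1.0 / (1.0 + sum);
    }

    public static void main(String[] args) {
        Node unchallenged = new Node();
        Node attackedOnce = new Node().add(new Node());
        Node defended = new Node().add(new Node().add(new Node()));

        System.out.println(h(unchallenged)); // 1.0
        System.out.println(h(attackedOnce)); // 0.5
        System.out.println(h(defended));     // 1 / (1 + 0.5), roughly 0.667
    }
}
```

An unchallenged argument gets the maximum value 1, a single unchallenged attacker halves it, and an attacker that is itself attacked weakens the attack, so the root recovers part of its value.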

- a) trust all completely ($T_{init}(P) = 1$; $T_{init}(LLM)=1$)
- b) trust the LLM less ($T_{init}(P) = 1$; $T_{init}(LLM)=0.5$)
- c) trust the LLM more than others ($T_{init}(P) = 0.5$; $T_{init}(LLM)=1$)
- d) limit trust into all ($T_{init}(P) = 0.5$; $T_{init}(LLM)=0.5$)
#### Accumulation

We write $T_{init}(P)$ for { $T_{init}(i) | i$ from $p \in P$} and $T_{init}(LLM)$ for { $T_{init}(i) | i$ from $LLM$}.
Using the categorization values assigned to the arguments for and against a given claim, an accumulator function aggregates these values to compute a balance. A specific accumulator function is used, namely the logAccumulator:
$logAccu(X,Y) = \log(1 + X_1 + \dots + X_n) - \log(1 + Y_1 + \dots + Y_m)$, where $X_1, \dots, X_n$ are the categorization values of the arguments for a claim and $Y_1, \dots, Y_m$ are those of the arguments against it.

The resulting accumulation value can be interpreted as follows:
- $logAccu(X,Y) > 0$: indicates that the arguments for the claim in question are stronger. The higher the value, the more trustworthy the claim is on the basis of the count of unchallenged/equally challenged arguments for it.
- $logAccu(X,Y) = 0$: indicates a neutral claim.
- $logAccu(X,Y) < 0$: indicates that the arguments against the claim in question are stronger. The lower the value, the less trustworthy the claim is on the basis of the count of unchallenged/equally challenged arguments against it.
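
A minimal sketch of the accumulation step, assuming the natural logarithm (the base is not stated here) and plain lists of categorization values. The class and method names are hypothetical and not taken from the project.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LogAccumulatorSketch {

    /** log(1 + sum of values); an empty list yields log(1) = 0. */
    private static double logOfOnePlusSum(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return Math.log(1.0 + sum); // assumption: natural logarithm
    }

    /** Balance between the categorization values for (pro) and against (con) a claim. */
    public static double logAccu(List<Double> pro, List<Double> con) {
        return logOfOnePlusSum(pro) - logOfOnePlusSum(con);
    }

    public static void main(String[] args) {
        List<Double> none = Collections.emptyList();

        // Two arguments for (values 1.0 and 0.5), one against (1.0): log(2.5) - log(2.0) > 0
        System.out.println(logAccu(Arrays.asList(1.0, 0.5), Arrays.asList(1.0)));
        // No arguments on either side: neutral claim, exactly 0.0
        System.out.println(logAccu(none, none));
        // One weakened argument for (0.5), two unchallenged against: log(1.5) - log(3.0) < 0
        System.out.println(logAccu(Arrays.asList(0.5), Arrays.asList(1.0, 1.0)));
    }
}
```

The sign of the result matches the interpretation above; the choice of base only scales the magnitude and does not change the sign.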

## License
The license of this project is that of the [group](https://github.com/keml-group).
Binary file added doc/laf_example-arguments-output1.PNG
Binary file added doc/laf_example-arguments-output2.PNG
Binary file added doc/laf_example-arguments-rebuttals.PNG
Binary file added doc/laf_example-arguments-undercuts.PNG
Binary file added doc/laf_example-general-csv.PNG
Binary file added doc/laf_example-trust-xlsx.PNG
32 changes: 22 additions & 10 deletions src/keml/analysis/AnalysisProvider.java
@@ -18,11 +18,15 @@ public static void main(String[] args) throws Exception {

String folder;
if (args.length == 0) {
folder = "../keml.sample/introductoryExamples";
folder = "../keml.sample/LAFExamples";
} else {
folder = args[0];
}

/* Logic-based samples/convos should have a way to distinguish them.
* In this case: "LAF" in the name of the folder */
boolean logicBased = folder.contains("LAF");

File sourceFolder = new File(folder + "/keml/");
File targetFolder = new File(folder+ "/analysis/");

@@ -42,17 +46,25 @@ public static void main(String[] args) throws Exception {
Conversation conv = new KemlFileHandler().loadKeml(source);

String basePath = targetFolder +"/" + FilenameUtils.removeExtension(file.getName());

new ConversationAnalyser(conv).createCSVs(basePath);
LocaleUtil.setUserLocale(Locale.US);

for(int i = 2; i<= 10; i++) {
TrustEvaluator trusty = new TrustEvaluator(conv, i);
trusty.writeRowAnalysis(
basePath+"-w"+i+"-",
TrustEvaluator.standardTrustConfigurations(conv.getConversationPartners()),
1.0F);
if (logicBased) { // separate logic-based and base frameworks' analyses
LAFConversationAnalyser as = new LAFConversationAnalyser(conv);
new ConversationAnalyser(conv).writeGeneralCSV(basePath + "-general.csv");
as.writeLogicArgumentationCSV(basePath + "-arguments.csv");
as.scoresMatrix(basePath + "-scores.xlsx");
} else {
new ConversationAnalyser(conv).createCSVs(basePath);
LocaleUtil.setUserLocale(Locale.US);

for(int i = 2; i<= 10; i++) {
TrustEvaluator trusty = new TrustEvaluator(conv, i);
trusty.writeRowAnalysis(
basePath+"-w"+i+"-",
TrustEvaluator.standardTrustConfigurations(conv.getConversationPartners()),
1.0F);
}
}

} catch (Exception e) {
e.printStackTrace();
}
76 changes: 76 additions & 0 deletions src/keml/analysis/ArgumentTree.java
@@ -0,0 +1,76 @@
package keml.analysis;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import keml.Literal;

/**
 * Tree-like data structure to facilitate the construction of Argument Trees.
 * mainly used for undercuts between {@link LogicArgument}s.
 */
public class ArgumentTree {
    /** root {@link LogicArgument} of the current tree*/
    LogicArgument root;
    /** List of children of the root, that themselves are other trees*/
    List<ArgumentTree> children;

    /**
     * Constructor for {@link ArgumentTree}
     * @param root {@link LogicArgument}
     */
    public ArgumentTree(LogicArgument root) {
        this.root = root;
        this.children = new ArrayList<>();
    }


    /**
     * adds a child to this tree
     * @param at {@link ArgumentTree}
     */
    public void addChild (ArgumentTree at) {
        this.children.add(at);
    }


    /**
     * getter for the root of this tree
     * @return {@link ArgumentTree#root}
     */
    public LogicArgument getRoot() {
        return root;
    }


    /**
     * getter for the children of this root
     * @return {@link ArgumentTree#children}
     */
    public List<ArgumentTree> getChildren() {
        return children;
    }



    /**
     * a <b>static</b> method that creates a String representation of a given {@link ArgumentTree}
     * @param node {@link ArgumentTree} to be represented
     * @param prefix String prefix to facilitate recursive call. Should be empty ("") during first call

Review comment (Member): Do we really need this? I think the helper is handling the prefix itself. Do you need to go back and call this method from the helper?

     * @param literals2String Map of literals and their associated String symbol. See {@link LAFConversationAnalyser#literals2String}
     * @param tree {@link StringBuilder} of the tree constructed so far (should be empty initially)
     * @return String representation of a given tree

     */
    public static String printTree(ArgumentTree node, String prefix, Map<Literal, String> literals2String, StringBuilder tree) {
        tree.append(prefix).append("└── ").append(LogicArgument.asString(node.getRoot(), literals2String)).append("\n");

        for (int i = 0; i < node.getChildren().size(); i++) {
            printTree(node.getChildren().get(i), prefix + " ", literals2String, tree);
        }

        return tree.toString();
    }

}
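
One possible way to address the review comment on the `prefix` parameter is to keep the prefix and the `StringBuilder` inside a private recursive helper and expose an entry point that takes only the tree. The sketch below is a self-contained illustration, not a proposed patch: `Node` with a plain String label is a hypothetical stand-in for `ArgumentTree` and `LogicArgument`, and the four-space indent per level is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TreePrinterSketch {

    // Box-drawing branch marker written as Unicode escapes, same marker as in printTree
    private static final String BRANCH = "\u2514\u2500\u2500 ";
    // Assumed indent added per tree level
    private static final String INDENT = "    ";

    /** Hypothetical stand-in for ArgumentTree: a label plus child trees. */
    public static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        public Node(String label) {
            this.label = label;
        }

        public Node add(Node child) {
            children.add(child);
            return this; // allows chained construction
        }
    }

    /** Public entry point: callers pass only the tree. */
    public static String print(Node root) {
        StringBuilder out = new StringBuilder();
        print(root, "", out);
        return out.toString();
    }

    /** Recursive helper that owns the prefix and the shared StringBuilder. */
    private static void print(Node node, String prefix, StringBuilder out) {
        out.append(prefix).append(BRANCH).append(node.label).append("\n");
        for (Node child : node.children) {
            print(child, prefix + INDENT, out);
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("A1")
                .add(new Node("A2").add(new Node("A3")))
                .add(new Node("A4"));
        System.out.print(print(tree));
    }
}
```

With this shape, callers cannot pass a non-empty prefix or a pre-filled builder by mistake, and the "should be empty during first call" notes in the Javadoc become unnecessary.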